Stanford NLP POS Tagger Example (Maven + Eclipse)

By Dhiraj Ray, 12 July 2017

In this tutorial we will discuss the Stanford NLP POS Tagger with an example. We will create a simple project in the Eclipse IDE with Maven as the build tool and look at how Stanford NLP can be used to tag each word of a text with its part of speech. We will use the MaxentTagger class provided by Stanford, together with the trained model english-left3words-distsim.tagger.

What is Part-of-Speech Tagging

According to Wikipedia, POS tagging is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context, i.e., its relationship with adjacent and related words in a phrase, sentence, or paragraph. A simplified form of this is commonly taught to school-age children, in the identification of words as nouns, verbs, adjectives, adverbs, etc.

Different POS Tags Meanings

Following are the POS tags with their corresponding meanings.

[Image: table of POS tags and their meanings]
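
As a rough text companion to the tag list, here is a minimal sketch of a lookup for some of the most common tags. The English model used in this tutorial emits Penn Treebank tags, and the descriptions below follow that tag set. The class name PosTagMeanings and the method describe are hypothetical helpers written for this illustration and are not part of Stanford NLP. The map is deliberately partial.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Stanford NLP: a partial Penn Treebank tag lookup.
final class PosTagMeanings {

    // Insertion-ordered so the tags print in the order they are listed here.
    private static final Map<String, String> MEANINGS = new LinkedHashMap<>();

    static {
        MEANINGS.put("CC", "Coordinating conjunction");
        MEANINGS.put("CD", "Cardinal number");
        MEANINGS.put("DT", "Determiner");
        MEANINGS.put("IN", "Preposition or subordinating conjunction");
        MEANINGS.put("JJ", "Adjective");
        MEANINGS.put("MD", "Modal");
        MEANINGS.put("NN", "Noun, singular or mass");
        MEANINGS.put("NNS", "Noun, plural");
        MEANINGS.put("NNP", "Proper noun, singular");
        MEANINGS.put("PRP", "Personal pronoun");
        MEANINGS.put("RB", "Adverb");
        MEANINGS.put("VB", "Verb, base form");
        MEANINGS.put("VBD", "Verb, past tense");
        MEANINGS.put("VBG", "Verb, gerund or present participle");
        MEANINGS.put("VBN", "Verb, past participle");
        MEANINGS.put("VBP", "Verb, non-3rd person singular present");
        MEANINGS.put("VBZ", "Verb, 3rd person singular present");
    }

    // Returns the description for a tag, or a fallback for tags missing from this partial list.
    static String describe(String tag) {
        return MEANINGS.getOrDefault(tag, "Unknown tag");
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : MEANINGS.entrySet()) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}
```

A call such as PosTagMeanings.describe("JJ") could then be used to print a readable label next to each tag produced by the tagger.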

Project Structure

[Image: project structure of the Stanford NLP tagger project]

Maven Dependencies for Stanford NLP

pom.xml
<dependencies>
    <dependency>
        <groupId>edu.stanford.nlp</groupId>
        <artifactId>stanford-corenlp</artifactId>
        <version>3.8.0</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
        <scope>test</scope>
    </dependency>
</dependencies>

Implementing POS Tagging using Stanford NLP

Following is the class that takes text as an input parameter and tags each word. If you are looking for an OpenNLP tagger instead, see the Apache OpenNLP POS Tagger Example.

MaxentTagger is the main class for users to run, train, and test the part-of-speech tagger. Here we initialize MaxentTagger with the constructor that takes the location of a trained tagger model file, in this case english-left3words-distsim.tagger.

TaggerExample.java
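package com.devglan;

import edu.stanford.nlp.tagger.maxent.MaxentTagger;

import java.io.IOException;

public class TaggerExample {

    public void tag(String text) throws IOException, ClassNotFoundException {
        // Load the trained English model from the given location.
        MaxentTagger maxentTagger = new MaxentTagger("english-left3words-distsim.tagger");
        // tagString returns the text with every token written as word_TAG.
        String tag = maxentTagger.tagString(text);
        String[] eachTag = tag.trim().split("\\s+");
        System.out.println("Word " + "Stanford tag");
        System.out.println("----------------------------------");
        for (String token : eachTag) {
            // Split at the last underscore so a word that itself contains '_' stays intact.
            int sep = token.lastIndexOf('_');
            if (sep < 0) {
                continue; // skip anything that is not in word_TAG form
            }
            System.out.println(token.substring(0, sep) + " " + token.substring(sep + 1));
        }
    }
}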
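
The string handling in the class above can be tried out without loading the model. The following is a minimal sketch, assuming the tagged text arrives as whitespace-separated word_TAG tokens, which is the shape the code above splits. TaggedTokenParser, splitToken and parse are hypothetical names introduced for this illustration, and the sample string is hand-written rather than captured tagger output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Stanford NLP: turns "word_TAG" tokens into {word, tag} pairs.
final class TaggedTokenParser {

    // Splits one token at its LAST underscore, so words that contain '_' stay intact.
    static String[] splitToken(String token) {
        int sep = token.lastIndexOf('_');
        if (sep < 0) {
            // No separator found: keep the token as the word and leave the tag empty.
            return new String[] {token, ""};
        }
        return new String[] {token.substring(0, sep), token.substring(sep + 1)};
    }

    // Parses a whole whitespace-separated tagged string into a list of {word, tag} pairs.
    static List<String[]> parse(String tagged) {
        List<String[]> pairs = new ArrayList<>();
        String trimmed = tagged.trim();
        if (trimmed.isEmpty()) {
            return pairs;
        }
        for (String token : trimmed.split("\\s+")) {
            pairs.add(splitToken(token));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hand-written sample in the word_TAG shape; not captured tagger output.
        String sample = "The_DT quick_JJ fox_NN jumps_VBZ ";
        for (String[] pair : parse(sample)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Splitting at the last underscore rather than the first is a design choice: a token such as snake_case_NN would otherwise lose part of the word and report the wrong tag.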

Testing Stanford NLP POS Tagger

Following is the test class to test the tagger class.

package com.devglan;

import org.junit.Test;

import java.io.IOException;

public class TaggerTest {

    @Test
    public void tag() throws IOException, ClassNotFoundException {
        TaggerExample tagging = new TaggerExample();
        // Prints each word of the sentence alongside its POS tag.
        tagging.tag("If you have several test classes, you can combine them into a test suite.");
    }
}

Output

[Image: console output of the Stanford NLP POS tagger]

Conclusion

I hope this article gave you what you were looking for. If you have anything to add or share, please do so in the comment section below.
