Open NLP POS Tagger Example

By Dhiraj 11 July, 2017

In this article, we will discuss the Apache OpenNLP POS tagger with an example. The example is a Maven-based project that uses the en-pos-maxent.bin model file to tag each word with its part of speech. We will use the WhitespaceTokenizer provided by OpenNLP to tokenize the text.

What is Part-of-Speech Tagging

As per wiki, POS tagging is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context, i.e., its relationship with adjacent and related words in a phrase, sentence, or paragraph. A simplified form of this is commonly taught to school-age children, in the identification of words as nouns, verbs, adjectives, adverbs, etc.

Other NLP Articles
Stanford NLP Named Entity Recognition
Apache OpenNLP Maven Eclipse Example
Stanford NLP Maven Example
Stanford NLP POS Tagger Example
Apache OpenNLP Named Entity Recognition Example

Different POS Tags Meanings

The following are the POS tags with their corresponding meanings.
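
As a quick reference, a few of the most common tags can be kept in a small lookup map. The sketch below assumes the English model emits Penn Treebank tags; PennTagGlossary and describe are names invented for this illustration, and the list covers only a small subset of the tag set.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of OpenNLP: a small glossary of Penn Treebank tags.
final class PennTagGlossary {

    // LinkedHashMap keeps the tags in the order they were added.
    private static final Map<String, String> TAGS = new LinkedHashMap<>();

    static {
        TAGS.put("CC", "Coordinating conjunction");
        TAGS.put("CD", "Cardinal number");
        TAGS.put("DT", "Determiner");
        TAGS.put("IN", "Preposition or subordinating conjunction");
        TAGS.put("JJ", "Adjective");
        TAGS.put("MD", "Modal");
        TAGS.put("NN", "Noun, singular or mass");
        TAGS.put("NNS", "Noun, plural");
        TAGS.put("NNP", "Proper noun, singular");
        TAGS.put("PRP", "Personal pronoun");
        TAGS.put("RB", "Adverb");
        TAGS.put("VB", "Verb, base form");
        TAGS.put("VBD", "Verb, past tense");
        TAGS.put("VBG", "Verb, gerund or present participle");
        TAGS.put("VBZ", "Verb, 3rd person singular present");
    }

    // Returns the description for a tag, or "Unknown tag" if it is not in this subset.
    static String describe(String tag) {
        return TAGS.getOrDefault(tag, "Unknown tag");
    }

    public static void main(String[] args) {
        TAGS.forEach((tag, meaning) -> System.out.println(tag + "\t" + meaning));
    }
}
```

A call such as PennTagGlossary.describe("NN") can then be used to print a readable label next to each tag the tagger returns.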

Maven Dependencies for OpenNLP

pom.xml

    <dependencies>
        <dependency>
            <groupId>org.apache.opennlp</groupId>
            <artifactId>opennlp-tools</artifactId>
            <version>1.8.1</version>
        </dependency>

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
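
The classes later in this article load en-pos-maxent.bin and en-sent.bin with getResourceAsStream and a leading slash, which resolves from the classpath root. In a standard Maven layout that usually means placing both files under src/main/resources. The following sketch, with a made-up class name ModelResourceCheck, is one way to confirm that the files are visible before running the tagger.

```java
// Hypothetical helper: checks that the model files are reachable on the classpath.
final class ModelResourceCheck {

    // Returns true if the given absolute resource path (e.g. "/en-sent.bin") can be resolved.
    static boolean isOnClasspath(String resourcePath) {
        return ModelResourceCheck.class.getResource(resourcePath) != null;
    }

    public static void main(String[] args) {
        String[] models = {"/en-pos-maxent.bin", "/en-sent.bin"};
        for (String model : models) {
            System.out.println(model + (isOnClasspath(model) ? " found" : " MISSING"));
        }
    }
}
```

If a file is reported as missing, the tagger code would receive a null stream for it, so this check is worth running once after setting up the project.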

Implementing POS Tagging using Apache OpenNLP

The following class takes a chunk of text as input and tags each word. It first uses the sentence detector to split a paragraph into multiple sentences, and each sentence is then tagged using the OpenNLP POS tagger. Here is the complete article on Sentence Detector.

WhitespaceTokenizer splits the input text on white space. en-pos-maxent.bin is the maximum entropy (maxent) model with a tag dictionary.
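
Because tokens are split only on white space, punctuation stays attached to the neighbouring word, so "suite." reaches the tagger as a single token. The sketch below mimics this behaviour with plain String.split rather than calling OpenNLP; it is an approximation for illustration only, and WhitespaceSplitDemo is a made-up name.

```java
import java.util.Arrays;

// Hypothetical stand-in that approximates whitespace tokenization without OpenNLP.
final class WhitespaceSplitDemo {

    // Splits text on runs of white space; blank input yields an empty array.
    static String[] splitOnWhitespace(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String[] tokens = splitOnWhitespace("you can combine them into a test suite.");
        // prints [you, can, combine, them, into, a, test, suite.]
        System.out.println(Arrays.toString(tokens));
    }
}
```

If punctuation should be tagged as separate tokens, OpenNLP also provides SimpleTokenizer and the model-based TokenizerME as alternatives.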

POSTaggingExample.java

package com.devglan;

import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.sentdetect.SentenceDetector;
import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;
import opennlp.tools.tokenize.WhitespaceTokenizer;

import java.io.IOException;
import java.io.InputStream;

/**
 * Created by only2dhir on 11-07-2017.
 */
public class POSTaggingExample {

    private POSTaggerME tagger = null;

    // Loads the POS model from the classpath and builds the tagger.
    public void initialize(String modelFileName) {
        // try-with-resources closes the stream once the model has been read
        try (InputStream modelStream = getClass().getResourceAsStream(modelFileName)) {
            if (modelStream == null) {
                System.out.println("Model file not found on classpath: " + modelFileName);
                return;
            }
            tagger = new POSTaggerME(new POSModel(modelStream));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }

    public void tag(String text) {
        // Load the model only on the first call instead of on every call.
        if (tagger == null) {
            initialize("/en-pos-maxent.bin");
        }
        if (tagger == null) {
            return; // the model could not be loaded
        }
        try {
            String[] sentences = detectSentences(text);
            for (String sentence : sentences) {
                String[] tokens = WhitespaceTokenizer.INSTANCE.tokenize(sentence);
                // tags[i] is the part-of-speech tag for tokens[i]
                String[] tags = tagger.tag(tokens);
                for (int i = 0; i < tokens.length; i++) {
                    System.out.print(tags[i] + ":" + tokens[i] + "  ");
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public String[] detectSentences(String paragraph) throws IOException {
        SentenceModel sentenceModel;
        // try-with-resources closes the stream even if loading the model fails
        try (InputStream modelIn = getClass().getResourceAsStream("/en-sent.bin")) {
            if (modelIn == null) {
                throw new IOException("Model file not found on classpath: /en-sent.bin");
            }
            sentenceModel = new SentenceModel(modelIn);
        }

        SentenceDetector sentenceDetector = new SentenceDetectorME(sentenceModel);
        String[] sentences = sentenceDetector.sentDetect(paragraph);
        for (String sent : sentences) {
            System.out.println(sent);
        }
        return sentences;
    }
}
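
The tag method prints tag:word pairs directly to standard output, which makes the result hard to reuse or to assert on in a test. One possible refinement is to build the string in a small helper and let the caller decide what to do with it. The helper below is a sketch with invented names (TaggedOutput, format); the tags in main are hand-written for illustration and are not output from the model.

```java
// Hypothetical helper: joins parallel token and tag arrays into "TAG:word" pairs.
final class TaggedOutput {

    // Pairs tags[i] with tokens[i], separated by two spaces as in the class above.
    static String format(String[] tokens, String[] tags) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                sb.append("  ");
            }
            sb.append(tags[i]).append(':').append(tokens[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] tokens = {"A", "test", "suite"};
        String[] tags = {"DT", "NN", "NN"}; // hand-assigned, not model output
        // prints DT:A  NN:test  NN:suite
        System.out.println(format(tokens, tags));
    }
}
```

With a helper like this, the tagging class could return the formatted string per sentence and leave printing to the caller.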

Testing OpenNLP POS Tagger

The following test class exercises the tagger class.

package com.devglan;

import org.junit.Test;

/**
 * Created by only2dhir on 11-07-2017.
 */
public class POSTaggerTest {

    @Test
    public void tag(){
        POSTaggingExample tagging = new POSTaggingExample();
        tagging.tag("If you have several test classes, you can combine them into a test suite. Running a test suite executes all test classes in that suite in the specified order. A test suite can also contain other test suites");
    }
}

Output

Conclusion

I hope this article served its purpose. If you have anything to add or share, please do so in the comment section below.

A technology savvy professional with an exceptional capacity to analyze, solve problems and multi-task. Technical expertise in highly scalable distributed systems, self-healing systems, and service-oriented architecture. Technical Skills: Java/J2EE, Spring, Hibernate, Reactive Programming, Microservices, Hystrix, Rest APIs, Java 8, Kafka, Kibana, Elasticsearch, etc.
