The idea behind chunking is to group related words into phrases, such as noun phrases and verb phrases, based on their part-of-speech (POS) tags. In this recipe, we will use the OpenNLP ChunkerME class to perform chunking. This class uses a maximum entropy model to perform this task.
Using a chunker to find POS
Getting ready
To prepare, we need to do the following:
- Create a new Maven project.
- Add the following dependency to the project's POM file:
<dependency>
    <groupId>org.apache.opennlp</groupId>
    <artifactId>opennlp-tools</artifactId>
    <version>1.9.0</version>
</dependency>
- Download the en-pos-maxent.bin and en-chunker.bin files from http://opennlp.sourceforge.net/models-1.5/ and add them to the root level of your project.
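Before moving on, it can help to confirm that both model files are where the recipe expects them. The following is a minimal sketch, not part of the recipe itself: the class name ModelFileCheck and the method missingModels are hypothetical names chosen for illustration, and the main method assumes that the program is run with the project root as its working directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of OpenNLP.
public class ModelFileCheck {
    // Model file names taken from the download step above.
    static final String[] MODEL_FILES = {"en-pos-maxent.bin", "en-chunker.bin"};

    // Returns the names of the model files that are not present
    // as regular files directly inside the given directory.
    static List<String> missingModels(Path projectRoot) {
        List<String> missing = new ArrayList<>();
        for (String name : MODEL_FILES) {
            if (!Files.isRegularFile(projectRoot.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the working directory is the project root.
        Path projectRoot = Paths.get("").toAbsolutePath();
        List<String> missing = missingModels(projectRoot);
        if (missing.isEmpty()) {
            System.out.println("Both model files found in " + projectRoot);
        } else {
            System.out.println("Missing model files in " + projectRoot + ": " + missing);
        }
    }
}
```

If the check reports a missing file, the most likely causes are that the file was saved somewhere other than the project root or that the program was launched from a different working directory.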