The LingPipe NLP API provides techniques for training a model and for classifying documents with that model. In this recipe, we will demonstrate how to train a model and then serialize it for later use. In the next recipe, Using LingPipe to classify text, we will use the serialized model to classify sample text.
LingPipe comes with a set of training data, which we will use in this recipe. Other training datasets can be found at sites such as http://qwone.com/~jason/20Newsgroups/.
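Before turning to LingPipe itself, it may help to see the train-then-serialize workflow in miniature. The following is a minimal sketch in plain Java and does not use the LingPipe API: the CharNGramModel class, its train, classify, save, and load methods, and the sample sentences are all hypothetical names and data invented for illustration. It assumes a very simple character n-gram model with add-one smoothing, which is only a rough stand-in for the language-model classifiers that LingPipe provides.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a trainable classifier; not part of LingPipe.
class CharNGramModel implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int n;
    // category -> (character n-gram -> count)
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();
    // category -> total number of n-grams seen for that category
    private final Map<String, Integer> totals = new HashMap<>();

    CharNGramModel(int n) {
        this.n = n;
    }

    // Training: count every character n-gram of the text under its category.
    void train(String category, String text) {
        Map<String, Integer> grams =
            counts.computeIfAbsent(category, k -> new HashMap<>());
        for (int i = 0; i + n <= text.length(); i++) {
            grams.merge(text.substring(i, i + n), 1, Integer::sum);
            totals.merge(category, 1, Integer::sum);
        }
    }

    // Classification: return the category with the highest smoothed log score.
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String category : counts.keySet()) {
            Map<String, Integer> grams = counts.get(category);
            double denominator =
                totals.getOrDefault(category, 0) + grams.size() + 1.0;
            double score = 0.0;
            for (int i = 0; i + n <= text.length(); i++) {
                int count = grams.getOrDefault(text.substring(i, i + n), 0);
                score += Math.log((count + 1.0) / denominator); // add-one smoothing
            }
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }

    // Serialize the trained model to a file for later use.
    static void save(CharNGramModel model, File file) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(model);
        }
    }

    // Read a previously serialized model back from a file.
    static CharNGramModel load(File file)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(file))) {
            return (CharNGramModel) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        CharNGramModel model = new CharNGramModel(3);
        // Made-up training sentences, for illustration only.
        model.train("sports", "the team won the football match");
        model.train("sports", "the striker scored a goal");
        model.train("tech", "the compiler reported a syntax error");
        model.train("tech", "the server crashed after the update");

        File file = File.createTempFile("classifier", ".model");
        file.deleteOnExit();
        save(model, file);

        CharNGramModel restored = load(file);
        System.out.println(restored.classify("football goal"));
    }
}
```

The point of the sketch is the separation of the two phases: training happens once and its result is written to disk, while classification can later run against the restored model without access to the training data.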