Chapter 1, Applied Machine Learning Quick Start, introduces the field of natural language processing (NLP). The tools and basic techniques that support NLP are discussed. Models, their validation, and their use are presented from a conceptual perspective.
Chapter 2, Java Libraries and Platforms for Machine Learning, covers the purpose and uses of tokenizers. Different tokenization processes are explored, followed by how they can be used to solve specific problems.
Chapter 3, Basic Algorithms – Classification, Regression, and Clustering, covers the problems associated with sentence detection. Correctly detecting where sentences end is important for many reasons. We will examine different approaches to this problem using a variety of examples.
Chapter 4, Customer Relationship Prediction with Ensembles, covers the process and problems associated with name recognition. Finding names, locations, and other entities in a document is an important step in NLP. The available techniques are identified and demonstrated.
Chapter 5, Affinity Analysis, covers the process of determining parts of speech, which is useful in determining the importance of words and their relationships in a document. It is a process that can enhance the effectiveness of other NLP tasks.
Chapter 6, Recommendation Engine with Apache Mahout, covers why traditional features do not apply to text documents. In this chapter, we'll learn how text documents can be represented.
Chapter 7, Fraud and Anomaly Detection, covers information retrieval, which entails finding documents in an unstructured format, such as text, that satisfy a query.
Chapter 8, Image Recognition with Deeplearning4J, covers the issues surrounding how documents and text can be classified. Once we have isolated the parts of a text, we can begin analyzing them for information. One of these processes involves classifying and clustering information.
Chapter 9, Activity Recognition with Mobile Phone Sensors, demonstrates how to discover topics in a set of documents.
Chapter 10, Text Mining with Mallet – Topic Modeling and Spam Detection, examines the use of parsers and chunkers to solve text problems. This important process, which normally results in a parse tree, provides insights into the structure and meaning of documents.
Chapter 11, What is Next?, brings together many of the topics from previous chapters to address other, more sophisticated problems. The use and construction of a pipeline are discussed. The use of open source tools to support these operations is presented.