Mastering Java Machine Learning

A Java developer's guide to implementing machine learning and big data architectures

Product type Paperback
Published in Jul 2017
Publisher Packt
ISBN-13 9781785880513
Length 556 pages
Edition 1st Edition
Authors (2): Uday Kamath, Krishna Choppella
Table of Contents (13 chapters)

Preface
1. Machine Learning Review
2. Practical Approach to Real-World Supervised Learning
3. Unsupervised Machine Learning Techniques
4. Semi-Supervised and Active Learning
5. Real-Time Stream Machine Learning
6. Probabilistic Graph Modeling
7. Deep Learning
8. Text Mining and Natural Language Processing
9. Big Data Machine Learning – The Final Frontier
A. Linear Algebra
B. Probability
Index

Topics in text mining


As we saw in the first section, text mining, and machine learning on text more generally, spans a wide range of topics. Each topic either customizes mainstream algorithms or relies on algorithms developed specifically for the task at hand. We have chosen four broad topics (text categorization, topic modeling, text clustering, and named entity recognition) and will discuss each in some detail.

Text categorization/classification

The text classification problem appears in many applications, such as document filtering and organization, information retrieval, opinion and sentiment mining, and e-mail spam filtering. As with the classification problem discussed in Chapter 2, Practical Approach to Real-World Supervised Learning, the general idea is to train a model on labeled documents and then use it to predict the labels of unseen documents.
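
To make the train-then-predict idea concrete, here is a minimal sketch in plain Java. It is not taken from the book: the class name TinyTextClassifier, its methods, and the choice of a multinomial Naive Bayes model with add-one smoothing are all assumptions made for illustration, and the tokenizer is a deliberately naive stand-in for real preprocessing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical, illustrative multinomial Naive Bayes text classifier. */
class TinyTextClassifier {
    private final Map<String, Integer> docsPerLabel = new HashMap<>();
    private final Map<String, Map<String, Integer>> termCounts = new HashMap<>();
    private final Map<String, Integer> tokensPerLabel = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Naive preprocessing (an assumption): lowercase, split on non-letters. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Training step: update counts from one labeled document. */
    void train(String document, String label) {
        totalDocs++;
        docsPerLabel.merge(label, 1, Integer::sum);
        Map<String, Integer> counts =
                termCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String token : tokenize(document)) {
            counts.merge(token, 1, Integer::sum);
            tokensPerLabel.merge(label, 1, Integer::sum);
            vocabulary.add(token);
        }
    }

    /** Prediction step: return the label with the highest log-posterior,
     *  or null if no documents have been seen yet. */
    String predict(String document) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        List<String> tokens = tokenize(document);
        for (String label : docsPerLabel.keySet()) {
            // log prior: fraction of training documents with this label
            double score = Math.log((double) docsPerLabel.get(label) / totalDocs);
            Map<String, Integer> counts = termCounts.get(label);
            int labelTokens = tokensPerLabel.getOrDefault(label, 0);
            for (String token : tokens) {
                // add-one (Laplace) smoothing so unseen words do not zero out a label
                int c = counts.getOrDefault(token, 0);
                score += Math.log((c + 1.0) / (labelTokens + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        TinyTextClassifier clf = new TinyTextClassifier();
        clf.train("Win a free prize now", "spam");
        clf.train("Free money, claim your prize", "spam");
        clf.train("Meeting agenda for Monday", "ham");
        clf.train("Lunch on Monday with the project team", "ham");
        System.out.println(clf.predict("claim your free prize"));
        System.out.println(clf.predict("project meeting on Monday"));
    }
}
```

The four training sentences and the "spam"/"ham" labels are made-up toy data. A real system would replace the tokenizer with proper preprocessing and would typically use an established library rather than hand-rolled counts.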

As discussed in the previous section, the preprocessing...
