Machine Learning in Java

Helpful techniques to design, build, and deploy powerful machine learning applications in Java

Product type: Paperback
Published: Nov 2018
Publisher: Packt
ISBN-13: 9781788474399
Length: 300 pages
Edition: 2nd Edition
Authors (2): Ashish Bhatia, Bostjan Kaluza
Table of Contents (13 chapters)

Preface
1. Applied Machine Learning Quick Start
2. Java Libraries and Platforms for Machine Learning
3. Basic Algorithms - Classification, Regression, and Clustering
4. Customer Relationship Prediction with Ensembles
5. Affinity Analysis
6. Recommendation Engines with Apache Mahout
7. Fraud and Anomaly Detection
8. Image Recognition with Deeplearning4j
9. Activity Recognition with Mobile Phone Sensors
10. Text Mining with Mallet - Topic Modeling and Spam Detection
11. What Is Next?
12. Other Books You May Enjoy

Topic modeling for BBC News

As discussed earlier, the goal of topic modeling is to identify patterns in a text corpus that correspond to document topics. In this example, we will use a dataset originating from BBC News. This dataset is one of the standard benchmarks in machine-learning research, and is available for non-commercial and research purposes.

The goal is to build a classifier that is able to assign a topic to an uncategorized document.
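
To make this goal concrete, here is a minimal sketch in plain Java. It is not Mallet's API and not the chapter's implementation. It assumes that each topic is already described by a few weighted keywords, and it assigns an uncategorized document to the topic whose keywords score highest. The class name, the method names, and the keyword weights are all hypothetical and chosen only for illustration; a real topic model would learn the weights from the corpus.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class TopicAssignmentSketch {

    // Hypothetical toy keyword weights per topic (invented for illustration,
    // not taken from the BBC dataset or from a trained model).
    public static Map<String, Map<String, Double>> toyTopics() {
        Map<String, Map<String, Double>> topics = new LinkedHashMap<>();
        topics.put("business", Map.of("market", 0.4, "shares", 0.3, "bank", 0.3));
        topics.put("sport", Map.of("match", 0.4, "goal", 0.3, "team", 0.3));
        topics.put("tech", Map.of("software", 0.4, "mobile", 0.3, "internet", 0.3));
        return topics;
    }

    // Sum the weights of every token in the text that matches a topic keyword.
    public static double score(String text, Map<String, Double> keywords) {
        double total = 0.0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            total += keywords.getOrDefault(token, 0.0);
        }
        return total;
    }

    // Return the best-scoring topic, or "unknown" when no keyword matches.
    public static String assignTopic(String text, Map<String, Map<String, Double>> topics) {
        String best = "unknown";
        double bestScore = 0.0;
        for (Map.Entry<String, Map<String, Double>> entry : topics.entrySet()) {
            double s = score(text, entry.getValue());
            if (s > bestScore) {
                bestScore = s;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String doc = "The team scored a late goal to win the match";
        System.out.println(assignTopic(doc, toyTopics()));
    }
}
```

The sketch only illustrates the input and output of the task: raw text goes in, a single topic label comes out. In practice the hand-written keyword table would be replaced by a model trained on the corpus.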

BBC dataset

In 2006, Greene and Cunningham collected the BBC dataset to study a particular document-clustering challenge using support vector machines. The dataset consists of 2,225 documents from the BBC News website from 2004 to 2005, corresponding to the stories collected...
