You're reading from Machine Learning Techniques for Text: Apply modern techniques with Python for text processing, dimensionality reduction, classification, and evaluation

Product type Paperback

Published in Oct 2022

Publisher Packt

ISBN-13 9781803242385

Length 448 pages

Edition 1st Edition

Languages

Python

Concepts

Machine Learning

Author (1):

Nikos Tsourakis

Table of Contents (13) Chapters

Preface

1. Chapter 1: Introducing Machine Learning for Text

2. Chapter 2: Detecting Spam Emails

3. Chapter 3: Classifying Topics of Newsgroup Posts

4. Chapter 4: Extracting Sentiments from Product Reviews

5. Chapter 5: Recommending Music Titles

6. Chapter 6: Teaching Machines to Translate

7. Chapter 7: Summarizing Wikipedia Articles

8. Chapter 8: Detecting Hateful and Offensive Language

9. Chapter 9: Generating Text in Chatbots

10. Chapter 10: Clustering Speech-to-Text Transcriptions

11. Index

12. Other Books You May Enjoy

Introducing the LDA algorithm

In Chapter 3, Classifying Topics of Newsgroup Posts, we examined how to classify the instances of a newsgroup dataset into predefined topics. A related situation arises when we want to assign a topic label to a piece of text without prior knowledge of the available topics. Topic modeling refers to the task of identifying groups of items, in our case words, that best describe a collection of documents or sentences. The topics are not given in advance but emerge from the modeling process itself; hence, they are called latent.

A popular topic modeling technique for extracting the hidden topics from a given corpus is latent Dirichlet allocation (LDA). Strictly speaking, LDA is not a clustering algorithm because it produces a distribution over groupings for each sentence being processed rather than a single assignment. However, as a document can be part of multiple topics, LDA resembles a soft clustering algorithm in which each data point can belong to more than one cluster. For this reason, we made it part of this...
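To make the soft-clustering view concrete, here is a minimal sketch in plain Python. The excerpt does not say how the book fits LDA, so this sketch assumes collapsed Gibbs sampling, one common inference method for the model. The function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the toy corpus are all hypothetical and chosen only for illustration.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Illustrative sketch only (hypothetical helper, not the book's code).
    Returns per-document topic mixtures, per-topic word distributions,
    and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables that the sampler keeps up to date.
    doc_topic_counts = [[0] * n_topics for _ in docs]
    topic_word_counts = [[0] * n_words for _ in range(n_topics)]
    topic_totals = [0] * n_topics

    # Start by assigning a random topic to every token.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            k = rng.randrange(n_topics)
            doc_assignments.append(k)
            doc_topic_counts[d][k] += 1
            topic_word_counts[k][word_id[w]] += 1
            topic_totals[k] += 1
        assignments.append(doc_assignments)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic_counts[d][k] -= 1
                topic_word_counts[k][wi] -= 1
                topic_totals[k] -= 1
                # Unnormalized probability of each topic for this token.
                weights = [
                    (doc_topic_counts[d][t] + alpha)
                    * (topic_word_counts[t][wi] + beta)
                    / (topic_totals[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic_counts[d][k] += 1
                topic_word_counts[k][wi] += 1
                topic_totals[k] += 1

    # Smooth the counts into probability distributions.
    doc_topic = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic_counts[d]]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(c + beta) / (topic_totals[k] + n_words * beta) for c in topic_word_counts[k]]
        for k in range(n_topics)
    ]
    return doc_topic, topic_word, vocab


# Toy corpus, invented for this example.
docs = [s.split() for s in [
    "football match goal team",
    "team goal player match",
    "election government law vote",
    "government vote law election",
    "football team government election",
]]

doc_topic, topic_word, vocab = fit_lda(docs, n_topics=2)

# Each document gets a mixture over topics, not a single label.
for doc, mix in zip(docs, doc_topic):
    print([round(p, 2) for p in mix], " ".join(doc))
```

Each row of `doc_topic` is a probability distribution over the topics, which is what distinguishes this output from a hard clustering that would return one label per document. In practice, a library implementation would normally be preferred over a hand-written sampler like this one.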

