In Chapter 6, Clustering - Finding Related Posts, we grouped text documents using clustering. This is a very useful tool, but it is not always the best fit, because clustering assigns each text to exactly one cluster. This book is about machine learning and Python. Should it be grouped with other Python-related works or with machine learning-related works? In a physical bookstore, we need to choose a single place to stock the book. In an internet store, however, the answer is that this book is about both machine learning and Python, and it should be listed in both sections. This does not mean that the book will be listed in all sections, of course. We will not list this book with the baking books.
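The bookstore contrast can be sketched in a few lines of plain Python. This is only a toy illustration: the titles, the section names, and the two helper functions are made up for this example and are not taken from any library.

```python
# Hypothetical toy data: titles and section names are invented for illustration.

# Clustering-style view: each book sits on exactly one shelf.
single_shelf = {
    "ML with Python": "python",
    "Bread Basics": "baking",
}

# Internet-store view: each book may be listed in several sections.
several_sections = {
    "ML with Python": {"python", "machine learning"},
    "Bread Basics": {"baking"},
}


def books_on_shelf(shelves, section):
    """Return titles whose single shelf is `section`."""
    return sorted(t for t, shelf in shelves.items() if shelf == section)


def books_in_section(listings, section):
    """Return titles that list `section` among their sections."""
    return sorted(t for t, secs in listings.items() if section in secs)


# With one shelf per book, the machine learning section misses the book.
print(books_on_shelf(single_shelf, "machine learning"))
# With several sections per book, it shows up there and under Python,
# but still not under baking.
print(books_in_section(several_sections, "machine learning"))
print(books_in_section(several_sections, "baking"))
```

The set-valued mapping is the simplest way to express membership in more than one group, while still keeping the book out of unrelated sections.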
In this chapter, we will learn methods that do not cluster documents into completely separate groups but that allow each document to refer to several topics. These topics...