In the last chapter we studied topic models and saw how they can help us organize and better understand our documents and their substructure. We will now move on to our next set of machine learning algorithms, which address two particular tasks: clustering and classification. We will build an intuition for each of these tasks and learn how to perform them using the popular Python machine learning library, scikit-learn. We will cover the following topics:
- Clustering text
- Classifying text