Introduction
In the previous chapters, you learned about various feature extraction methods, such as tokenization, stemming, lemmatization, and stop-word removal, which are used to extract features from unstructured text. You also learned about the Bag of Words model and Term Frequency-Inverse Document Frequency (TF-IDF).
In this chapter, you will learn how to use these extracted features to develop machine learning models. These models can solve real-world problems, such as detecting whether the sentiment of a text is positive or negative, or predicting whether an email is spam. We will also cover supervised and unsupervised learning, classification and regression, and sampling and splitting data, and we will look in depth at how to evaluate the performance of a model. Finally, this chapter discusses how to save these models and load them for future use.