In this chapter, we discussed the possibility of using ensemble learning to classify tweets. Although a simple logistic regression can outperform ensemble learning techniques, the exercise is an interesting introduction to natural language processing and to the techniques used to preprocess the data and extract useful features. In summary, we introduced the concepts of n-grams, IDF feature extraction, stemming, and stop word removal. We discussed the process of cleaning the data, as well as training a voting classifier and using it to classify tweets in real time through Twitter's API.
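As a compact reminder of how these pieces fit together, the following is a minimal sketch in plain Python rather than the code used in the chapter. The stop word list, the example tweets, and the function names are all hypothetical, the IDF formula is one common smoothed variant, and stemming is left out for brevity. The voting step only illustrates hard (majority) voting over labels that are assumed to come from already trained classifiers.

```python
import math
from collections import Counter

# Hypothetical, deliberately tiny stop word list, for illustration only.
STOP_WORDS = {"the", "a", "is", "and", "this", "i"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def idf(documents):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n_docs = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(doc))  # count each term at most once per document
    return {
        term: math.log((1 + n_docs) / (1 + count)) + 1
        for term, count in df.items()
    }


def majority_vote(predictions):
    """Hard voting: return the label predicted by the most classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Made-up example tweets (assumption, not data from the chapter).
tweets = ["I love this movie", "this movie is terrible", "love the soundtrack"]
docs = [tokenize(t) for t in tweets]

bigrams = ngrams(docs[0], 2)   # 2-grams of the first cleaned tweet
weights = idf(docs)            # rarer terms receive larger weights
label = majority_vote(["positive", "negative", "positive"])
```

Terms that appear in fewer tweets, such as "terrible" here, receive a larger IDF weight than terms shared across tweets, such as "movie", which is what makes the weighting useful as a feature.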
In the next chapter, we will see how ensemble learning can be utilized in the design of recommender systems, with the aim of recommending movies to a specific user.