Summary
In this chapter, we explored machine learning at a very high level and familiarized ourselves with the big picture and the major concepts that we are going to explore in more detail in the following chapters.
We learned that supervised learning is composed of two important subfields: classification and regression. While classification models allow us to categorize objects into known classes, we can use regression analysis to predict the continuous outcomes of target variables. Unsupervised learning not only offers useful techniques for discovering structures in unlabeled data, but it can also be useful for data compression in feature preprocessing steps.
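To make the distinction between the two supervised subfields concrete, here is a minimal sketch using only the Python standard library. It is not how the later chapters implement these tasks; the nearest-centroid classifier, the one-feature least-squares line, the function names, and the toy study-hours data are all assumptions chosen purely for illustration.

```python
from statistics import mean


def fit_nearest_centroid(samples, labels):
    """Classification: learn one centroid (mean feature value) per known class."""
    by_class = {}
    for x, label in zip(samples, labels):
        by_class.setdefault(label, []).append(x)
    return {label: mean(values) for label, values in by_class.items()}


def predict_class(centroids, x):
    """Return the label of the class whose centroid lies closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def fit_line(xs, ys):
    """Regression: ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict_value(model, x):
    """Return the continuous prediction of a fitted (slope, intercept) model."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical toy data: hours studied, a class label, and an exam score.
hours = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
passed = ["fail", "fail", "fail", "pass", "pass", "pass"]
scores = [52.0, 54.0, 56.0, 62.0, 64.0, 66.0]

classifier = fit_nearest_centroid(hours, passed)  # target: discrete class
regressor = fit_line(hours, scores)               # target: continuous value

print(predict_class(classifier, 6.5))  # one of the known labels
print(predict_value(regressor, 5.0))   # a real-valued score
```

The input feature is the same in both cases; what changes is the kind of target. The classifier can only ever return one of the labels it saw during training, whereas the regressor returns a number on a continuous scale.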
We briefly went over the typical roadmap for applying machine learning to problem tasks, which we will use as a foundation for deeper discussions and hands-on examples in the following chapters. Finally, we set up our Python environment and installed and updated the required packages to get ready to see machine learning in action.
In the following chapter, we will implement one of the earliest machine learning algorithms for classification, which will prepare us for Chapter 3, A Tour of Machine Learning Classifiers Using Scikit-learn, where we will cover more advanced machine learning algorithms using the scikit-learn open source machine learning library. Since machine learning algorithms learn from data, it is critical that we feed them useful information, and in Chapter 4, Building Good Training Sets – Data Preprocessing, we will take a look at important data preprocessing techniques. In Chapter 5, Compressing Data via Dimensionality Reduction, we will learn about dimensionality reduction techniques that can help us to compress our dataset onto a lower-dimensional feature subspace, which can be beneficial for computational efficiency. An important aspect of building machine learning models is to evaluate their performance and to estimate how well they can make predictions on new, unseen data. In Chapter 6, Learning Best Practices for Model Evaluation and Hyperparameter Tuning, we will learn about the best practices for model tuning and evaluation. In certain scenarios, we may still not be satisfied with the performance of our predictive model, even though we may have spent hours or days extensively tuning and testing it. In Chapter 7, Combining Different Models for Ensemble Learning, we will learn how to combine different machine learning models to build even more powerful predictive systems.
After we have covered all of the important concepts of a typical machine learning pipeline, we will implement a model for predicting emotions in text in Chapter 8, Applying Machine Learning to Sentiment Analysis, and in Chapter 9, Embedding a Machine Learning Model into a Web Application, we will embed it into a web application to share it with the world. In Chapter 10, Predicting Continuous Target Variables with Regression Analysis, we will then use machine learning algorithms for regression analysis that allow us to predict continuous output variables, and in Chapter 11, Working with Unlabelled Data – Clustering Analysis, we will apply clustering algorithms that will allow us to find hidden structures in data. The last chapter in this book will cover artificial neural networks, currently one of the hottest topics in machine learning research, which will allow us to tackle complex problems such as image and speech recognition.