In this chapter, we covered the most important tools that machine learning practitioners use to make sense of their data and to help the learning algorithm get the most out of it.
Feature engineering was the first of these tools and one of the most commonly used in data science; it's a must-have component of any data science pipeline. Its purpose is to build better representations of your data and increase the predictive power of your model.
We saw how a large number of features can be problematic and lead to worse classifier performance. We also saw that there is an optimal number of features that yields the maximum model performance, and that this optimal number is a function of the number of data samples/observations you have.
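The effect of feature count on a fixed, small training set can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter's own examples: the data is synthetic Gaussian noise, only the first `n_informative` features carry class information, the classifier is a simple nearest-centroid rule, and the function names (`make_samples`, `accuracy_vs_features`) are hypothetical. In a setup like this, test accuracy typically improves while informative features are added and then tends to degrade as more pure-noise features are included.

```python
import random


def make_samples(rng, n, label, n_features, n_informative, shift=1.0):
    """Draw n Gaussian samples (unit variance).

    Assumption for illustration: only the first n_informative features
    have a class-dependent mean (shift * label); the rest are pure noise.
    """
    return [
        [rng.gauss(shift * label if j < n_informative else 0.0, 1.0)
         for j in range(n_features)]
        for _ in range(n)
    ]


def centroid(rows, d):
    """Mean of the first d features over all rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(d)]


def sq_dist(x, c, d):
    """Squared Euclidean distance using only the first d features."""
    return sum((x[j] - c[j]) ** 2 for j in range(d))


def accuracy_vs_features(feature_counts, n_train=10, n_test=200,
                         n_informative=5, seed=0):
    """Test accuracy of a nearest-centroid classifier for each feature count.

    The training set size is fixed (n_train per class) while the number
    of features used grows, so later entries include more noise features.
    """
    rng = random.Random(seed)
    d_max = max(feature_counts)
    train = {y: make_samples(rng, n_train, y, d_max, n_informative) for y in (0, 1)}
    test = {y: make_samples(rng, n_test, y, d_max, n_informative) for y in (0, 1)}

    results = {}
    for d in feature_counts:
        c0 = centroid(train[0], d)
        c1 = centroid(train[1], d)
        correct = 0
        for y in (0, 1):
            for x in test[y]:
                pred = 1 if sq_dist(x, c1, d) < sq_dist(x, c0, d) else 0
                correct += (pred == y)
        results[d] = correct / (2 * n_test)
    return results


if __name__ == "__main__":
    for d, acc in accuracy_vs_features([1, 5, 20, 100, 400]).items():
        print(f"{d:4d} features -> test accuracy {acc:.3f}")
```

Raising `n_train` in this sketch should push the point at which accuracy starts to fall toward larger feature counts, which is one way to see that the best number of features depends on how many samples are available.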
Subsequently, we introduced one of the most powerful tools: bias-variance decomposition. This tool is widely...