Introduction
In the previous chapters, we discussed the two types of supervised learning problems: regression and classification. We looked at a number of algorithms for each type and delved into how those algorithms worked.
But there are times when these algorithms, no matter how sophisticated, just don't seem to perform well on the data that we have. There could be a variety of reasons: perhaps the data is not good enough, perhaps there is no real trend where we are looking for one, or perhaps the model itself is too complex.
Wait. What? How can a model being too complex be a problem? Oh, but it can! If a model is too complex and there isn't enough data, the model could fit the data so well that it learns even the noise and outliers, which is never what we want. This is known as overfitting.
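To make this concrete, here is a minimal sketch (not from the text, and using made-up data) of overfitting in action: fitting a degree-9 polynomial to ten noisy points drawn from a straight line chases the noise and does poorly on unseen points, while a simple degree-1 fit generalizes better.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ten training points drawn from a straight line, plus noise.
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + 1 + rng.normal(scale=0.2, size=x_train.shape)

# Unseen test points from the same underlying line (no noise).
x_test = np.linspace(0.05, 0.95, 50)
y_test = 2 * x_test + 1

for degree in (1, 9):
    # Fit a polynomial of the given degree to the training data.
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_err:.4f}, test MSE {test_err:.4f}")
```

The complex model drives its training error toward zero, yet its error on the unseen points is noticeably worse than that of the simple model, which is exactly the symptom described above.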
Oftentimes, when a single complex algorithm gives us a result that is way off, aggregating the results from a group of models can give us a result that's closer to the actual truth. This is because there...