Summary
In this chapter, we explored some of the most popular non-parametric regression algorithms: K-nearest neighbors (KNN), decision trees, and random forests. Models built with these algorithms can perform well, though each has limitations. We discussed the advantages and drawbacks of each technique, including the practical limits KNN places on the number of features and observations, as well as the time KNN models require to generate predictions. We discussed the key challenge with decision trees, high variance, and how a random forest addresses it by averaging the predictions of many trees grown on bootstrapped samples. We also explored gradient boosted regression trees and discussed some of their advantages. Along the way, we continued to improve our hyperparameter tuning skills, since each algorithm required a somewhat different strategy.
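As a minimal illustration of the variance point, the following sketch compares a single, fully grown decision tree with a random forest on held-out data. It assumes scikit-learn and uses a synthetic dataset from make_regression rather than any dataset from this chapter; the specific parameter values are illustrative, not a recommendation.

```python
# A minimal sketch (not the chapter's code): a single deep decision tree
# tends to overfit, while a random forest averages many trees grown on
# bootstrapped samples, reducing variance on unseen data.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=20.0,
                       random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# A single, fully grown tree: fits the training data closely (high variance).
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)

# A random forest: averages 100 bootstrapped trees to stabilize predictions.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

print(f"decision tree test R^2: {tree.score(X_test, y_test):.3f}")
print(f"random forest test R^2: {forest.score(X_test, y_test):.3f}")
```

On data like this, the forest's test-set R-squared will typically exceed the single tree's, which is the variance reduction the chapter described.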
Over the next few chapters, we discuss supervised learning algorithms where the target is categorical, starting with perhaps the most familiar classification algorithm: logistic regression.