Section 4 – Modeling Dichotomous and Multiclass Targets with Supervised Learning
There are a good number of high-performing algorithms for predicting categorical targets. We will examine the most popular classification algorithms in this part. We will also consider why we might choose one algorithm over the others, given the attributes of our data and our domain knowledge.
We are as concerned with underfitting and overfitting in classification models as we were with regression models in the previous part. When the relationship between the features and the target is complicated, we need an algorithm that can capture that complexity, but doing so often carries a non-trivial risk of overfitting. The chapters in this part discuss strategies for modeling complexity without overfitting. These strategies usually involve some form of regularization for logistic regression models, limits on tree depth for decision trees, and adjusting the tolerance for margin violations with support...