In Chapter 8, Identifying Credit Default with Machine Learning, we introduced the workflow for solving a real-life problem using machine learning. We went over the entire pipeline, from cleaning the data to training a model (a classifier, in that case) and evaluating its performance. However, this is rarely the end of a project. We used a simple decision tree classifier, which often serves as a benchmark or minimum viable product (MVP). In this chapter, we turn to a few more advanced topics.
We start the chapter by showing how to use more advanced classifiers, which are also based on decision trees. Some of them, such as XGBoost and LightGBM, frequently appear in winning solutions to machine learning competitions, such as those hosted on Kaggle. Additionally, we introduce the concept of stacking multiple machine learning models to further...