Predicting Online Ad Click-Through with Tree-Based Algorithms
In the previous chapter, we built a movie recommender. In this chapter and the next, we will tackle one of the most data-driven problems in digital advertising: ad click-through prediction. Given a user and the page they are visiting, the task is to predict how likely they are to click on a given ad. We will focus on tree-based algorithms (including decision trees, random forests, and boosted trees) and use them to address this billion-dollar problem.
We will explore decision trees from the root to the leaves, as well as their aggregated version, a forest of trees. This won’t be a theory-only chapter: it includes plenty of hand calculations and from-scratch implementations of tree models. We will also use scikit-learn, along with XGBoost, a popular Python package for tree-based algorithms.
We will cover the following topics in this chapter:
- A brief overview of ad...