Predicting Online Ad Click-Through with Logistic Regression
In the previous chapter, we predicted ad click-through using tree algorithms. In this chapter, we continue tackling this billion-dollar problem, focusing on a highly scalable (probably the most scalable) classification model: logistic regression. We will explore what the logistic function is, how to train a logistic regression model, how to add regularization to the model, and which variants of logistic regression are applicable to very large datasets. Besides its application in classification, we will also discuss how logistic regression and random forest models are used to pick significant features. You won’t get bored, as there will be lots of implementations, both from scratch and with scikit-learn and TensorFlow.
In this chapter, we will cover the following topics:
- Converting categorical features to numerical – one-hot encoding and ordinal encoding
- Classifying data with...