Classifying data with logistic regression
In the last chapter, we trained tree-based models on only the first 300,000 of the 40 million samples, simply because training a tree on a large dataset is extremely computationally expensive and time-consuming. Since one-hot encoding frees us from algorithms that take in categorical features directly, we should turn to an algorithm that scales well to large datasets. As mentioned, logistic regression is one of the most scalable classification algorithms, if not the most.
Getting started with the logistic function
Let’s start with an introduction to the logistic function (more commonly referred to as the sigmoid function), the algorithm’s core, before we dive into the algorithm itself. It maps an input to an output between 0 and 1, and is defined as follows:

y(z) = 1 / (1 + exp(-z))

Here, z is the input. We define the logistic function in code as follows:
>>> import numpy as np
>>> def sigmoid(input):
...     return 1.0 / (1 + np.exp(-input))
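To sanity-check the mapping, we can evaluate the function at a few inputs. Below is a quick sketch (the standalone `sigmoid` definition mirrors the one above) showing that 0 maps to the midpoint 0.5 and that large positive and negative inputs saturate toward 1 and 0, respectively:

```python
import numpy as np

def sigmoid(z):
    # Logistic (sigmoid) function: maps any real z to a value in (0, 1)
    return 1.0 / (1 + np.exp(-z))

print(sigmoid(0))    # 0.5 -- the midpoint
print(sigmoid(10))   # very close to 1
print(sigmoid(-10))  # very close to 0
```

This saturating S-shape is exactly what makes the function useful for classification: its output can be read as the probability of the positive class.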