Logistic Regression
Logistic regression is very similar to the linear regression technique we introduced in the previous section, with the only difference that the target variable, Y, assumes only values in a discrete set; say, for simplicity, {0, 1}. If we were to approach such a problem as a linear regression problem, the output of the right-hand side of the equation in Figure 3.17 could easily go well beyond the values 0 and 1. Furthermore, even if we constrained the output to the interval [0, 1], it could still assume any value in that interval, not just 0 or 1. For this reason, the idea behind logistic regression is to model the probability of the target variable Y assuming one of the two values (say, 1). In this case, all the values between 0 and 1 become reasonable outputs.
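To see the difference concretely, here is a minimal sketch in Python using scikit-learn and a small made-up dataset (neither of which is part of this chapter's example): a plain linear fit on a {0, 1} target can predict values outside [0, 1], while logistic regression returns probabilities that always lie inside that interval.

```python
# Minimal sketch (synthetic data): linear vs. logistic regression on a binary target.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Synthetic one-dimensional data: the class flips from 0 to 1 as x grows
X = np.array([[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# New inputs, including points outside the training range
x_new = np.array([[-1.0], [2.25], [6.0]])

# Linear regression: predictions are not confined to [0, 1];
# the extreme inputs here give values below 0 and above 1
lin = LinearRegression().fit(X, y)
print(lin.predict(x_new))

# Logistic regression: predicted probabilities P(Y = 1 | x) stay within [0, 1]
log = LogisticRegression().fit(X, y)
print(log.predict_proba(x_new)[:, 1])
```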
With p, let's denote the probability of the target variable, Y, being equal to 1, given a specific feature x:
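$$
p = P(Y = 1 \mid X = x)
$$

that is, the conditional probability of observing Y = 1 given the feature value x, written in standard notation.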
Let's also define the logit function:
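$$
\operatorname{logit}(p) = \ln\!\left(\frac{p}{1 - p}\right)
$$

This is the standard form of the logit, the natural logarithm of the odds of p; it maps the interval (0, 1) onto the whole real line.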