Regularization in logistic regression
One of the dangers of machine learning is over-fitting: the algorithm captures not only the signal in the training set, but also the statistical noise that results from the finite size of the training set.
A way to mitigate over-fitting in logistic regression is to use regularization: we impose a penalty for large values of the parameters when optimizing. We can do this by adding a penalty to the cost function that is proportional to the magnitude of the parameters. Formally, we re-write the logistic regression cost function (described in Chapter 2, Manipulating Data with Breeze) as:

$$\mathrm{Cost}_{\mathrm{reg}}(params) = \mathrm{Cost}(params) + \lambda \, \lVert params \rVert$$
where $\mathrm{Cost}(params)$ is the normal logistic regression cost function:

$$\mathrm{Cost}(params) = -\sum_i \left[ y_i \log \sigma(x_i \cdot params) + (1 - y_i) \log\bigl(1 - \sigma(x_i \cdot params)\bigr) \right]$$

with $\sigma(t) = 1/(1+e^{-t})$ the sigmoid function.
Here, $params$ is the vector of parameters, $x_i$ is the vector of features for the $i$th training example, and $y_i$ is 1 if the $i$th training example is spam, and 0 otherwise. This is identical to the logistic regression cost function introduced in Chapter 2, Manipulating Data with Breeze, apart from the addition of the regularization term $\lambda \, \lVert params \rVert$: the magnitude of the parameter vector (commonly the $L_1$ or $L_2$ norm), multiplied by the regularization parameter $\lambda$, which controls the strength of the penalty.
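To make this concrete, here is a minimal sketch of a regularized cost function written with Breeze, in the spirit of the cost function from Chapter 2. It uses the squared $L_2$ penalty $\lambda \lVert params \rVert^2$ (a common choice, since its gradient is simply $2\lambda \cdot params$); the `features` matrix, `target` vector, and the `regularizedCost` helper are illustrative names, not code from this chapter:

```scala
import breeze.linalg._
import breeze.numerics._
import breeze.optimize._

/** Regularized logistic regression cost, exposed as a DiffFunction
  * so that Breeze's optimizers can minimize it.
  *
  * @param features one row per training example, one column per feature
  * @param target   1.0 if the example is spam, 0.0 otherwise
  * @param lambda   regularization parameter: larger values shrink the
  *                 parameters more aggressively
  */
def regularizedCost(
    features: DenseMatrix[Double],
    target: DenseVector[Double],
    lambda: Double
): DiffFunction[DenseVector[Double]] =
  new DiffFunction[DenseVector[Double]] {
    def calculate(params: DenseVector[Double]) = {
      val probs = sigmoid(features * params) // predicted P(spam) per example
      // Cost(params): negative log-likelihood of the training set
      val nll = -sum(target *:* log(probs) + (-target + 1.0) *:* log1p(-probs))
      // Penalty term: lambda * ||params||^2
      val cost = nll + lambda * (params dot params)
      // Gradient: X^T (probs - target) from the likelihood,
      // plus 2 * lambda * params from the penalty
      val grad = features.t * (probs - target) + params * (2.0 * lambda)
      (cost, grad)
    }
  }

// Example usage: fit parameters by minimizing the regularized cost.
// val optimalParams = minimize(
//   regularizedCost(trainingFeatures, trainingTarget, lambda = 0.5),
//   DenseVector.zeros[Double](trainingFeatures.cols))
```

Setting $\lambda = 0$ recovers the unregularized cost function; increasing $\lambda$ trades a worse fit on the training set for smaller, more conservative parameters.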