Training a logistic regression model
Now, the question is how we can obtain the optimal w such that J(w) is minimized. We can do so using gradient descent.
Training a logistic regression model using gradient descent
Gradient descent (also called steepest descent) is a first-order iterative optimization procedure for minimizing an objective function. In each iteration, it takes a step proportional to the negative derivative of the objective function at the current point. As a result, the current estimate moves downhill, step by step, toward a minimum of the objective function. The constant of proportionality is called the learning rate, or step size. The update can be summarized in a mathematical equation as follows:
w := w - η∆w
Here, the left w is the weight vector after a learning step, and the right w is the one before moving, η is the learning rate, and ∆w is the first...
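To make the update rule concrete, here is a minimal sketch in plain Python. It is not the logistic regression training code itself: the function names gradient_descent and toy_gradient, the toy objective J(w) = (w0 - 3)^2 + (w1 + 1)^2, and the chosen learning rate and iteration count are all assumptions made purely for illustration.

```python
def gradient_descent(gradient, w_init, learning_rate=0.1, n_iter=100):
    """Repeatedly apply the update w := w - learning_rate * gradient(w)."""
    w = list(w_init)
    for _ in range(n_iter):
        delta_w = gradient(w)  # first-order derivative at the current point
        # Move each weight a small step against its partial derivative
        w = [w_j - learning_rate * d_j for w_j, d_j in zip(w, delta_w)]
    return w


def toy_gradient(w):
    """Gradient of the assumed toy objective (w0 - 3)^2 + (w1 + 1)^2."""
    return [2 * (w[0] - 3), 2 * (w[1] + 1)]


w_opt = gradient_descent(toy_gradient, w_init=[0.0, 0.0],
                         learning_rate=0.1, n_iter=200)
print(w_opt)  # should be very close to the minimizer [3, -1]
```

With a learning rate of 0.1, the distance to the minimum shrinks by a constant factor on every step for this toy objective. A rate that is too large can make the iterates overshoot and diverge, while one that is too small makes convergence slow.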