Training a logistic regression model
Now the question is: how can we obtain the optimal w such that J(w) is minimized? We can do so using gradient descent.
Training a logistic regression model using gradient descent
Gradient descent (also called steepest descent) is a procedure for minimizing a loss function by first-order iterative optimization. In each iteration, the model parameters move a small step that is proportional to the negative derivative (the gradient) of the objective function at the current point. This means the current estimate iteratively moves downhill toward the minimum of the objective function. The constant of proportionality we just mentioned is called the learning rate, or step size. The update can be summarized in a mathematical equation as follows:

$$w := w - \eta \, \nabla J(w)$$

Here, the $w$ on the left is the weight vector after a learning step, the $w$ on the right is the one before the step, $\eta$ is the learning rate, and $\nabla J(w)$ is the first-order derivative, the gradient.
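As a concrete illustration, here is a minimal NumPy sketch of one such update step, assuming J(w) is the average log loss of logistic regression, whose gradient is (1/m) Xᵀ(σ(Xw) − y). The function names `sigmoid` and `update_weights_gd` are illustrative, and the intercept term is omitted for brevity:

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps raw scores to probabilities in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def update_weights_gd(X, y, w, learning_rate):
    # One gradient descent step: w := w - eta * gradient of J(w)
    m = X.shape[0]
    predictions = sigmoid(X @ w)
    # Gradient of the average log loss: (1/m) * X^T (sigmoid(Xw) - y)
    gradient = X.T @ (predictions - y) / m
    return w - learning_rate * gradient

# Toy usage: repeat the step on a tiny dataset
X = np.array([[0.5, 1.0], [1.5, 2.0], [3.0, 0.5], [2.0, 2.5]])
y = np.array([0, 0, 1, 1])
w = np.zeros(X.shape[1])
for _ in range(100):
    w = update_weights_gd(X, y, w, learning_rate=0.1)
```

Repeating the update for a fixed number of iterations, or until the gradient is close to zero, drives w toward the minimizer of J(w).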
To train a logistic regression model...