Now, the question is how we can obtain the optimal w such that the cost function, J(w), is minimized. We can do so using gradient descent.
Training a logistic regression model
Training a logistic regression model using gradient descent
Gradient descent (also called steepest descent) is a procedure for minimizing an objective function by first-order iterative optimization. In each iteration, it takes a step proportional to the negative derivative of the objective function at the current point, so the point being optimized iteratively moves downhill towards the minimum of the objective function. The proportionality constant we just mentioned is called the learning rate, or step size. Gradient descent can be summarized in a mathematical equation as follows:
$$w := w - \eta \, \nabla_w J(w)$$
Here, w is the weight vector to learn, η is the learning rate, and ∇J(w) is the gradient, that is, the first-order derivative of the cost function J(w) with respect to w.
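To make the update rule concrete, here is a minimal NumPy sketch of one possible gradient descent training loop for logistic regression. The names used (sigmoid, train_logistic_regression, learning_rate, n_iterations) are illustrative choices for this sketch, not names defined in the text, and the loop omits a bias term for brevity:

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping raw scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, learning_rate=0.1, n_iterations=1000):
    """Fit weights w by gradient descent on the logistic (cross-entropy) cost.

    X: feature matrix of shape (n_samples, n_features)
    y: binary labels of shape (n_samples,)
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)          # start from all-zero weights
    for _ in range(n_iterations):
        predictions = sigmoid(X @ w)  # current probability estimates
        # Gradient of the average cross-entropy cost with respect to w
        gradient = X.T @ (predictions - y) / n_samples
        # The update rule from the equation above: w := w - eta * gradient
        w -= learning_rate * gradient
    return w

# Toy usage: learn to separate two clusters of 1D points
X = np.array([[0.0], [0.5], [1.0], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])
w = train_logistic_regression(X, y)
print(w)
```

Each pass computes the gradient over the whole training set before taking a step, which is the plain (batch) form of gradient descent described above.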