Now, the question is how we can obtain the optimal w such that the cost function is minimized. We can do so using gradient descent.
Training a logistic regression model
Training a logistic regression model using gradient descent
Gradient descent (also called steepest descent) is a first-order iterative optimization procedure for minimizing an objective function. In each iteration, it takes a step proportional to the negative derivative of the objective function at the current point, so the candidate solution iteratively moves downhill towards the minimum of the objective function. The proportion we just mentioned is called the learning rate, or step size. It can be summarized in a mathematical equation as follows:
$$w := w - \eta \, \Delta w$$
Here, the left w is the weight vector after a learning step, the right w is the one before the step, η (eta) is the learning rate, and Δw is the first-order derivative, the gradient of the objective function with respect to w.
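To make the update rule concrete, here is a minimal sketch of batch gradient descent for logistic regression using NumPy. The function name `train_logistic_regression`, the default hyperparameters, and the sample data are illustrative assumptions, not taken from the source:

```python
import numpy as np

def sigmoid(z):
    # Logistic function mapping raw scores to probabilities in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, learning_rate=0.1, n_iterations=1000):
    """Fit weights with batch gradient descent on the log loss.

    X: feature matrix of shape (n_samples, n_features)
    y: binary labels of shape (n_samples,)
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)  # start from all-zero weights
    for _ in range(n_iterations):
        predictions = sigmoid(X @ w)
        # Gradient of the average log loss with respect to w
        gradient = X.T @ (predictions - y) / n_samples
        # The update rule from above: w := w - eta * delta_w
        w -= learning_rate * gradient
    return w

# Illustrative usage on a tiny, made-up dataset
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([0, 0, 1, 1])
w = train_logistic_regression(X, y)
print(sigmoid(X @ w))  # predicted probabilities after training
```

Each pass recomputes the gradient over the full training set and moves the weights a small step against it; a learning rate that is too large can overshoot the minimum, while one that is too small makes convergence slow.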