General optimization algorithms
The most common optimization problem encountered in ML is continuous function optimization, in which the function's input arguments are real-valued. In training ML models, optimization entails minimizing the loss function until it converges to a (local) minimum.
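To make this concrete, here is a minimal sketch of loss minimization by gradient descent; the toy quadratic loss, learning rate, and stopping tolerance are illustrative assumptions, not values from the text.

```python
# Minimal gradient descent sketch: the loss, its gradient, the learning
# rate, and the tolerance below are illustrative assumptions.

def loss(w):
    return (w - 3.0) ** 2          # toy quadratic loss, minimum at w = 3

def grad(w):
    return 2.0 * (w - 3.0)         # analytic derivative of the loss

w = 0.0                            # initial approximation
lr = 0.1                           # learning rate (step size)
for _ in range(1000):
    step = lr * grad(w)
    w -= step                      # move against the gradient
    if abs(step) < 1e-8:           # stop once updates become negligible
        break

print(f"converged to w = {w:.6f}, loss = {loss(w):.2e}")
```

Because the quadratic loss has a single minimum, the iteration converges to it from any starting point; the next paragraph explains why this is no longer guaranteed for general objectives.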
Global optimization searches the entire domain, whereas local optimization explores only a neighborhood and therefore requires knowledge of an initial approximation, as evident from Figure 10.2a. If the objective function has local minima, local search algorithms (gradient methods, for example) can get stuck in one of them. Once the algorithm attains a local minimum, it is nearly impossible for it to then reach the global minimum of the search space. In a discrete search space the neighborhood is a finite set that can be explored exhaustively, whereas in a continuous search space the differentiability of the objective function is exploited (gradient methods, Newton-like methods). A sketch of this behavior follows below.
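The following sketch illustrates the trapping behavior just described; the two-minima function f, the learning rate, and the multi-start strategy are illustrative assumptions, not the text's method.

```python
# Contrasting local and global search on a function with two minima.
# The function f and all numeric settings are illustrative assumptions.

def f(w):
    return w**4 - 4.0 * w**2 + w   # global minimum near -1.47, local near 1.35

def grad(w):
    return 4.0 * w**3 - 8.0 * w + 1.0

def local_search(w, lr=0.01, steps=10_000):
    """Plain gradient descent: converges to the minimum of whichever
    basin contains the initial approximation w."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# Local search is sensitive to the initial approximation:
print(local_search(-2.0))  # ~ -1.47  (the global minimum)
print(local_search(+2.0))  # ~ +1.35  (stuck in the local minimum)

# A crude global strategy: restart local search from many points
# spread over the domain and keep the best result found.
candidates = [local_search(w0 / 10.0) for w0 in range(-30, 31)]
best = min(candidates, key=f)
print(best)                # ~ -1.47, independent of any single start
```

Restarting from many initial points is only one simple way to trade extra function evaluations for a better chance of finding the global minimum; it works here because the one-dimensional domain is cheap to cover.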