So far, we have dealt with optimization problems in which a single point, pictured as a ball or particle, is edged gradually across the curved surface of the function toward a minimum using gradient descent or Newton's method. Now we will look at another class of optimization methods, which use a population of individuals.
We spread these individuals across the optimization space, which makes the algorithm less likely to get stuck at a local minimum or a saddle point. The individuals can share information with each other about the local area they are in and use it to find an optimal solution that minimizes our function.
With these algorithms, we start from an initial population, and we would like to distribute its members so that they cover as much ground as possible, giving us the best chance of finding a globally optimal region.
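To make the idea of spreading an initial population concrete, here is a minimal sketch. It assumes the search space is a box with known lower and upper bounds in each dimension and that individuals are drawn uniformly at random inside it. The uniform choice, the function name `init_population`, and the bounds used below are all illustrative assumptions rather than a prescribed method.

```python
import random


def init_population(n_individuals, lower, upper, seed=None):
    """Sample points uniformly at random inside the box [lower, upper].

    Hypothetical helper for illustration: `lower` and `upper` hold one
    bound per dimension, and each individual is a list of coordinates.
    """
    rng = random.Random(seed)  # private generator so runs are reproducible
    return [
        [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        for _ in range(n_individuals)
    ]


# Assumed example: 20 individuals in a 2-D box spanning [-5, 5] per axis.
population = init_population(20, lower=[-5.0, -5.0], upper=[5.0, 5.0], seed=0)
```

Each individual is then a candidate solution at which the objective function can be evaluated. A larger population covers the box more densely, at the cost of more function evaluations per iteration.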
We can sample our population from...