Summary
In this chapter, we saw two examples of black-box optimization methods: evolution strategies and genetic algorithms, which make fewer assumptions about the reward function, but can nevertheless compete with analytical gradient methods. Their strength lies in how well they parallelize across a large number of machines and in the small set of assumptions they place on the reward function.
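As a reminder of why these methods are "black box", here is a minimal sketch of a single evolution strategies update: it only needs a fitness score for each perturbed parameter vector, not gradients of the reward. The function names (`es_step`, `evaluate`) and the hyperparameter values are illustrative assumptions, not code from this chapter.

```python
import numpy as np

def es_step(params, evaluate, pop_size=50, sigma=0.05, lr=0.01):
    """One ES update: perturb the parameters, score each perturbation with the
    black-box fitness function, and move along the reward-weighted noise."""
    noise = np.random.randn(pop_size, params.size)            # Gaussian perturbations
    rewards = np.array([evaluate(params + sigma * eps) for eps in noise])
    rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)  # normalize fitness
    # Each perturbation is evaluated independently, which is why this
    # step parallelizes so naturally across many workers.
    return params + lr / (pop_size * sigma) * noise.T @ rewards
```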
In the next chapter, we'll take a look at a different area of modern RL development: model-based methods.