Summary
In this chapter, we took a quick look at the very interesting domain of continuous control using RL methods and tested three different algorithms on one problem: a four-legged robot. In our training, we used an emulator, but real models of this robot are made by the Ghost Robotics company (you can check out a cool video on YouTube: https://youtu.be/bnKOeMoibLg).
We applied three training methods to this environment: A2C, DDPG, and D4PG (which showed the best results). In the next chapter, we'll continue exploring the continuous action domain and look at a different set of improvements: trust region methods.