Summary
In this chapter, we took a quick look at the very interesting domain of continuous control using RL methods, and we checked three different algorithms on a single problem: a four-legged robot. In our training, we used an emulator, but there are real models of this robot made by Ghost Robotics. (You can check out a cool video on YouTube: https://youtu.be/bnKOeMoibLg.) We applied three training methods to this environment: A2C, DDPG, and D4PG, with the last one showing the best results.
In the next chapter, we will continue exploring the continuous action domain and look at a different set of improvements: trust region extensions.