Things to try
Here is a list of things you can do to improve your understanding of the topic:
- In the D4PG code, I used a simple replay buffer, which was enough to get good improvement over DDPG. You can try to switch the example to the prioritized replay buffer in the same way as we did in Chapter 8.
- There are lots of interesting and challenging environments to experiment with. For example, you can start with the other PyBullet environments, but there are also the DeepMind Control Suite (Tassa et al., DeepMind Control Suite, arXiv:1801.00690 (2018)), the MuJoCo-based environments in Gym, and many others.
- You can try the very challenging Learning to Run competition from NIPS 2017 (it was also held in 2018 and 2019 with harder problems), where you are given a simulator of the human body and your agent needs to figure out how to move it around.
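As a starting point for the first item, here is a minimal sketch of a proportional prioritized replay buffer. It is not the implementation from Chapter 8 or from the D4PG example: the class name `PrioritizedReplayBuffer`, its methods, and the default values of `alpha` and `beta` are all assumptions made for illustration, and the linear-time sampling is only suitable for small buffers.

```python
import random


class PrioritizedReplayBuffer:
    """Hypothetical minimal proportional prioritized replay buffer."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha        # how strongly priorities skew sampling
        self.buffer = []          # stored transitions
        self.priorities = []      # one priority per stored transition
        self.pos = 0              # next slot to overwrite once full

    def __len__(self):
        return len(self.buffer)

    def add(self, transition):
        # New samples get the current maximum priority, so each one
        # has a good chance of being replayed at least once.
        max_prio = max(self.priorities, default=1.0)
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
            self.priorities.append(max_prio)
        else:
            self.buffer[self.pos] = transition
            self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.buffer)
        indices = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights compensate for the skewed sampling;
        # they are normalized so that the largest weight equals 1.
        weights = [(n * probs[i]) ** (-beta) for i in indices]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.buffer[i] for i in indices]
        return batch, indices, weights

    def update_priorities(self, indices, new_priorities):
        # A small constant keeps every priority strictly positive.
        for i, p in zip(indices, new_priorities):
            self.priorities[i] = float(p) + 1e-5
```

In a training loop, you would multiply each sample's critic loss by its weight before averaging, and then pass the per-sample loss values back to `update_priorities`. Using the per-sample distributional loss as the priority for D4PG is one plausible choice, not the only one.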