In this chapter, we have learned about several recent advancements in RL. We saw how the I2A architecture uses the imagination core for forward planning, and then how agents can be trained according to human preference. We also learned about DQfD, which boosts performance and reduces the training time of DQN by learning from demonstrations. Then we looked at hindsight experience replay, where agents learn from failures by relabelling the goals they actually achieved, as the short sketch below illustrates.
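To make the hindsight idea concrete, here is a minimal sketch of the goal-relabelling step, assuming a generic goal-conditioned environment; the `compute_reward` helper and the transition layout are illustrative assumptions, not the exact implementation used by HER in practice:

```python
from collections import deque

# Illustrative replay buffer of (state, action, reward, next_state, goal) tuples
replay_buffer = deque(maxlen=100000)

def store_episode(episode, compute_reward):
    """episode: list of (state, action, next_state, goal) tuples (assumed layout)."""
    # Store each transition with the original (possibly never achieved) goal
    for state, action, next_state, goal in episode:
        reward = compute_reward(next_state, goal)
        replay_buffer.append((state, action, reward, next_state, goal))

    # Relabel the same transitions with the goal the agent actually reached,
    # so even a "failed" episode produces transitions with useful reward signal
    achieved_goal = episode[-1][2]  # final next_state treated as the achieved goal
    for state, action, next_state, _ in episode:
        reward = compute_reward(next_state, achieved_goal)
        replay_buffer.append((state, action, reward, next_state, achieved_goal))
```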
Next, we learned about hierarchical RL, where a goal is decomposed into a hierarchy of sub-goals. We then learned about inverse RL, where, given a policy, the agent tries to learn the underlying reward function. RL is evolving every day with interesting advancements; now that you have understood various reinforcement learning algorithms, you can build agents to perform various tasks and contribute to RL research.