The question list is as follows:
- What are policy gradients?
- Why are policy gradients effective?
- What is the role of the actor-critic network in DDPG?
- What is the constrained optimization problem?
- What is the trust region?
- How does PPO overcome the drawbacks of TRPO?