Questions
Let's test our understanding of the algorithms covered in this chapter. Try answering the following questions:
- What is a trust region?
- Why is TRPO useful?
- How does the conjugate gradient method differ from gradient descent?
- What is the update rule of TRPO?
- How does PPO differ from TRPO?
- Explain the PPO-clipped method.
- What is Kronecker factorization?