Chapter 7: Policy-Based Methods
The value-based methods we covered in the previous chapter achieve great results in many environments with discrete control spaces. However, many applications, such as robotics, require continuous control. In this chapter, we turn to another important class of algorithms, called policy-based methods, which enable us to solve continuous-control problems. In addition, these methods directly optimize a policy network, and hence stand on a stronger theoretical foundation. Finally, policy-based methods can learn truly stochastic policies, which are needed in partially observable environments and games, and which value-based methods cannot learn. All in all, policy-based approaches complement value-based methods in many ways. This chapter goes into the details of policy-based methods to give you a strong understanding of how they work.
In particular, we discuss the following topics in this chapter:
- Need for policy-based methods
- Vanilla...