Chapter 2: Implementing Value-Based, Policy-Based, and Actor-Critic Deep RL Algorithms
This chapter takes a practical approach to building value-based, policy-based, and actor-critic reinforcement learning (RL) agents. It includes recipes for implementing value iteration-based learning agents and breaks the implementation of several foundational RL algorithms down into simple steps. The policy gradient and actor-critic agents use TensorFlow 2.x to define their neural network policies.
The following recipes will be covered in this chapter:
- Building stochastic environments for training RL agents
- Building value-based RL agent algorithms
- Implementing temporal difference learning
- Building Monte Carlo prediction and control algorithms for RL
- Implementing the SARSA algorithm and an RL agent
- Building a Q-learning agent
- Implementing policy gradients
- Implementing actor-critic...