Overview of the actor-critic method
The actor-critic method is one of the most popular algorithms in deep reinforcement learning. Several modern deep reinforcement learning algorithms are designed based on actor-critic methods. The actor-critic method lies at the intersection of value-based and policy-based methods. That is, it takes advantage of both value-based and policy-based methods.
In this section, without going into further detail, first, let's acquire a basic understanding of how the actor-critic method works and then, in the next section, we will get into more detail and understand the math behind the actor-critic method.
Actor-critic, as the name suggests, consists of two types of network—the actor network and the critic network. The role of the actor network is to find an optimal policy, while the role of the critic network is to evaluate the policy produced by the actor network. So, we can think of the critic network as a feedback network that...