Exploration versus Exploitation Trade-Off
Learning happens both by exploring new things and by exploiting, or applying, what has already been learned. Striking the right balance between the two is the essence of learning. Reinforcement learning has the same two modes: exploration means trying out different actions, while exploitation means choosing an action that is already known to yield a good reward.
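To make the two modes concrete, here is a minimal sketch of one common way to mix them, usually called epsilon-greedy action selection. It is not taken from the text above; it is an illustration under stated assumptions. The names `epsilon_greedy_action`, `run_bandit`, `value_estimates`, and `true_means` are hypothetical, and the environment is assumed to be a simple multi-armed bandit whose rewards are Gaussian with unit variance.

```python
import random


def epsilon_greedy_action(value_estimates, epsilon, rng):
    """Return an action index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        # Exploration: try any action, chosen uniformly at random.
        return rng.randrange(len(value_estimates))
    # Exploitation: choose the action with the highest estimated reward
    # (ties go to the lowest index).
    return max(range(len(value_estimates)), key=lambda a: value_estimates[a])


def run_bandit(true_means, epsilon, steps, seed=0):
    """Play a hypothetical Gaussian bandit and track per-action estimates."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)  # running mean reward per action
    counts = [0] * len(true_means)       # how often each action was tried
    total_reward = 0.0
    for _ in range(steps):
        action = epsilon_greedy_action(estimates, epsilon, rng)
        # Assumed reward model: noisy sample around the action's true mean.
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental update of the running mean for the chosen action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_bandit([0.2, 0.5, 0.8], epsilon=0.1, steps=1000)
    print("estimates:", estimates)
    print("counts:", counts)
    print("total reward:", total)
```

In this sketch, `epsilon` is the single knob that sets the balance: a value of 0 makes the agent purely exploit its current estimates, while a value of 1 makes it explore at random on every step.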
A reinforcement learning agent has to balance exploration and exploitation. An agent can learn only from the experience of trying an action. Exploration tries new actions that might enable the agent to make better decisions in the future. Exploitation chooses actions that have yielded good rewards in past experience. The agent must therefore trade off the rewards gained by exploiting against the knowledge gained by experimenting. If an agent exploits more, it might miss learning about other policies with even greater rewards. If the agent explores more, the agent might miss the...