Q-learning is a reinforcement learning method that uses the action-value function, or Q function, to solve tasks. In this section, we'll cover both traditional Q-learning and Deep Q-learning.
Standard Q-learning is built around the core concept of the Q-table. You can think of the Q-table as a reference table; every row represents a state and every column represents an action. Each value in the table is the expected future reward for taking a specific action in a specific state. Procedurally, we do the following:
- Initialize the Q-table
- Choose an action
- Perform that action
- Measure the reward that was received
- Update the Q-value
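The five steps above can be sketched as a short training loop. This is a minimal illustration rather than a definitive implementation: the five-state corridor environment, the `step` and `train` helpers, the epsilon-greedy action choice, and the values of `ALPHA`, `GAMMA`, and `EPSILON` are all assumptions made for the example. The update line uses the standard Q-learning rule, which moves the current Q-value toward the reward plus the discounted best Q-value of the next state.

```python
import random

# Hypothetical toy problem: a corridor of 5 states; action 0 = left, 1 = right.
N_STATES, N_ACTIONS = 5, 2
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # assumed learning rate, discount, exploration rate


def step(state, action):
    """Toy environment: reward 1.0 only when the rightmost state is reached."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # Step 1: initialize the Q-table (rows = states, columns = actions).
    q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Step 2: choose an action (epsilon-greedy, ties broken at random).
            if rng.random() < EPSILON:
                action = rng.randrange(N_ACTIONS)
            else:
                best = max(q_table[state])
                action = rng.choice(
                    [a for a in range(N_ACTIONS) if q_table[state][a] == best]
                )
            # Steps 3 and 4: perform the action and measure the reward.
            next_state, reward, done = step(state, action)
            # Step 5: update the Q-value toward reward + discounted best next value.
            future = 0.0 if done else GAMMA * max(q_table[next_state])
            q_table[state][action] += ALPHA * (reward + future - q_table[state][action])
            state = next_state
    return q_table


q_table = train()
for state, row in enumerate(q_table):
    print(state, [round(value, 3) for value in row])
```

Printing the table after training shows, for each state, one learned value per action; the larger value in a row marks the action the agent currently prefers in that state.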
Let's walk through each of these steps to better understand the algorithm. First, we initialize the Q-table to zeros; its entries are then updated throughout the Q-learning training process....
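As a small illustration of the initialization step, a zero-filled Q-table can be represented as a nested list with one row per state and one column per action. The sizes and the `greedy_action` helper below are hypothetical and chosen only for this sketch.

```python
# Hypothetical sizes, chosen only for illustration.
n_states, n_actions = 4, 3

# One row per state, one column per action, every entry starting at zero.
q_table = [[0.0 for _ in range(n_actions)] for _ in range(n_states)]


def greedy_action(table, state):
    """Return the index of the highest-valued action in the given state's row."""
    row = table[state]
    return max(range(len(row)), key=row.__getitem__)


# Before any training, every action in a state looks equally good (all zeros).
untrained_row = list(q_table[2])

# After an update raises one entry, that action becomes the greedy choice.
q_table[2][1] = 0.5
preferred = greedy_action(q_table, 2)
```

Because every entry starts at zero, the table carries no preference between actions until training begins to update it, which is one reason some exploration is typically needed early on.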