In the previous sections, we took random actions in a given state, and we defined the environment ourselves, computing the next state, the valid actions, and the reward for each move in code. In this section, we will leverage OpenAI's Gym package to navigate the Frozen Lake environment.
Q-learning to maximize rewards when playing Frozen Lake
Getting ready
The Frozen Lake environment looks as follows:
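As a text stand-in for the figure, the default 4x4 layout shipped with Gym's FrozenLake environment can be printed directly (S = start, F = frozen and safe, H = hole, G = goal):

```python
# The standard 4x4 Frozen Lake layout used by Gym's FrozenLake environment.
# S = start, F = frozen (safe), H = hole, G = goal.
MAP_4X4 = [
    "SFFF",
    "FHFH",
    "FFFH",
    "HFFG",
]

for row in MAP_4X4:
    print(" ".join(row))
```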
The agent starts from the S (start) state, and the goal is to reach the G (goal) state while avoiding the H (hole) states.
In the preceding environment, there are 16 possible states the agent can be in. Additionally, the agent can take four possible actions (move up, down, right, or left).