Implementing Q-learning
Now that we have calculated the various state-action values we need, we can, in principle, identify the action to take in every state. However, in more complex scenarios – for example, when playing video games – it gets tricky to fetch state information. OpenAI's Gym comes in handy here: it contains a pre-defined environment for the game we're playing and fetches the next state information, given an action taken in the current state. So far, we have considered the scenario of always choosing the best-looking path. However, there are scenarios where such a greedy strategy gets stuck at a local optimum.
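The state-action values mentioned above are updated with the Q-learning rule, which nudges the current estimate toward the observed reward plus the discounted value of the best action in the next state. A minimal sketch follows; the function name, the learning rate `alpha`, the discount `gamma`, and the toy 3-state, 2-action table are illustrative assumptions, not values from the text.

```python
# Minimal sketch of the tabular Q-learning update
# (hypothetical names and toy numbers, for illustration only).

def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Move Q[state][action] toward the bootstrapped target:
    reward + gamma * max over a' of Q[next_state][a']."""
    target = reward + gamma * max(Q[next_state])
    Q[state][action] += alpha * (target - Q[state][action])
    return Q

# Toy example: 3 states, 2 actions, all values initialized to zero.
Q = [[0.0, 0.0] for _ in range(3)]
q_update(Q, state=0, action=1, reward=1.0, next_state=2)
print(Q[0][1])  # 0.1 * (1.0 + 0.9 * 0.0 - 0.0) = 0.1
```

Because the target uses the maximum over next-state actions rather than the action actually taken, Q-learning learns the greedy policy's values even while exploring.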
In this section, we will learn about Q-learning, which calculates the value associated with taking an action in a given state, as well as about leveraging the Gym environment so that we can play various games. For now, we'll take a look at a simple game called Frozen Lake that is...
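Gym environments expose a small interface: `reset()` returns an initial observation, and `step(action)` returns the next state, a reward, and whether the episode is done. Since Gym itself may not be installed, the sketch below hand-rolls a tiny FrozenLake-style stand-in with a simplified version of that interface; the 2x2 map, the class name, and the 1-D movement rules are illustrative assumptions, not Gym's actual FrozenLake layout or full `step()` signature.

```python
# A hand-rolled stand-in mimicking Gym's reset()/step() interface
# (hypothetical 2x2 map; real FrozenLake is larger and handles
# grid edges properly).

class TinyFrozenLake:
    # S = start, F = frozen (safe), H = hole, G = goal,
    # states numbered 0..3 in row-major order.
    MAP = "SFHG"
    # Simplified 1-D index shifts: left, right, up, down.
    ACTIONS = {0: -1, 1: +1, 2: -2, 3: +2}

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        """Return (next_state, reward, done), a simplified step()."""
        nxt = self.state + self.ACTIONS[action]
        if 0 <= nxt < len(self.MAP):  # ignore moves off the grid
            self.state = nxt
        tile = self.MAP[self.state]
        done = tile in "HG"           # episode ends in a hole or at the goal
        reward = 1.0 if tile == "G" else 0.0
        return self.state, reward, done

env = TinyFrozenLake()
state = env.reset()
state, reward, done = env.step(1)  # right, onto a frozen tile
state, reward, done = env.step(3)  # down, onto the goal
print(state, reward, done)         # 3 1.0 True
```

An agent can loop over `reset()` and `step()` exactly as it would with a real Gym environment, feeding each `(state, action, reward, next_state)` transition into the Q-learning update.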