The optimal policies for the MDPs we have dealt with so far are pretty intuitive. However, it won't be that straightforward in most cases, such as the FrozenLake environment. In this recipe, let's play around with the FrozenLake environment and get ready for upcoming recipes where we will find its optimal policy.
FrozenLake is a typical Gym environment with a discrete state space. It is about moving an agent from the starting location to the goal location in a grid world, and at the same time avoiding traps. The grid is either four by four (https://gym.openai.com/envs/FrozenLake-v0/) or eight by eigh.
t (https://gym.openai.com/envs/FrozenLake8x8-v0/). The grid is made up of the following four types of tiles:
- S: The starting location
- G: The goal location, which terminates an episode
- F: The frozen tile, which is a walkable location...