This example uses FrozenLake-v0, a gridworld environment in OpenAI Gym that was discussed in Chapter 2, Training Reinforcement Learning Agents Using OpenAI Gym. There, we implemented Q-learning and a Q-network (which we will discuss in later chapters) to gain an understanding of an OpenAI Gym environment.
Now, let's try to implement value iteration to obtain the utility value of each state in the FrozenLake-v0 environment, using the following code:
# importing dependency libraries
from __future__ import print_function
import gym
import numpy as np
import time
#Load the environment
env = gym.make('FrozenLake-v0')
s = env.reset()
print(s)
print()
env.render()
print()
print(env.action_space) #action space of the environment
print(env.observation_space) #observation (state) space of the environment
print()
print("Number of actions : ",env.action_space.n)
print("Number of states : ",env.observation_space.n)
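The listing so far only loads and inspects the environment. A minimal value iteration sketch along the lines described above might look like the following; it assumes the tabular transition model is exposed through env.unwrapped.P (as it is for Gym's discrete environments, mapping state -> action -> list of (probability, next_state, reward, done) tuples), and the function name value_iteration and the parameters gamma and epsilon are illustrative choices, not fixed by the text:
#Value iteration sketch: repeatedly apply the Bellman optimality backup
#until the utility values of all states converge
def value_iteration(env, gamma=0.99, epsilon=1e-6):
    num_states = env.observation_space.n
    num_actions = env.action_space.n
    P = env.unwrapped.P  #transition model of the discrete environment (assumed attribute)
    V = np.zeros(num_states)  #utility value of each state, initialised to zero
    while True:
        delta = 0.0
        for s in range(num_states):
            #V(s) = max_a sum_{s'} P(s'|s,a) * [r + gamma * V(s')]
            q_values = [sum(p * (r + gamma * V[s_next])
                            for p, s_next, r, _ in P[s][a])
                        for a in range(num_actions)]
            best = max(q_values)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < epsilon:  #stop once no state value changes by more than epsilon
            break
    return V

utility_values = value_iteration(env)
print("Utility values of all states : ")
print(utility_values.reshape(4, 4))  #FrozenLake-v0 is a 4 x 4 grid
Once the utility values have converged, a greedy policy can be read off them by choosing, in each state, the action that maximizes the same one-step lookahead used inside the loop.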