Here we report a basic Q-learning implementation for the FrozenLake-v0 problem.
Import the following two basic libraries:
import gym
import numpy as np
Then, we load the FrozenLake-v0 environment:
environment = gym.make('FrozenLake-v0')
Then, we build the Q-learning table; it has the dimensions S x A, where S is the size of the observation space and A is the size of the action space:
S = environment.observation_space.n
A = environment.action_space.n
The FrozenLake environment provides a state for each block, and four actions (that is, the four directions of movement), giving us a 16x4 table of Q-values to initialize:
Q = np.zeros([S,A])
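As a quick sanity check, the table can be built without the environment by plugging in FrozenLake's known dimensions (16 states on the 4x4 grid, 4 actions); this is just a sketch of the initialization step above:

```python
import numpy as np

# FrozenLake-v0: 16 states (4x4 grid), 4 actions (left, down, right, up)
S, A = 16, 4
Q = np.zeros([S, A])
print(Q.shape)  # (16, 4)
```

Every Q-value starts at zero; the table is filled in as the agent experiences transitions during training.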
Then, we define the alpha (learning rate) parameter for the training rule and the discount factor gamma:
alpha = .85
gamma = .99
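To see how these two parameters enter the Q-learning update rule, here is a minimal sketch of a single update, using a hypothetical transition (the state, action, reward, and next state are made up for illustration):

```python
import numpy as np

alpha = .85   # learning rate
gamma = .99   # discount factor

Q = np.zeros([16, 4])

# Hypothetical transition: from state 14, taking action 2 yields
# reward 1.0 and lands in state 15 (the goal).
s, a, r, s1 = 14, 2, 1.0, 15

# Q-learning update rule: move Q[s, a] toward the observed return
Q[s, a] = Q[s, a] + alpha * (r + gamma * np.max(Q[s1, :]) - Q[s, a])
print(Q[s, a])  # 0.85
```

Since the table starts at zero, the first update simply stores alpha times the immediate reward; repeated updates propagate value backward through the table.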
We fix the total number of episodes (trials):
num_episodes = 2000
Then, we initialize the rList, where we'll collect the total reward obtained in each episode: