The environment used in this section is FrozenLake-v0. Its documentation can be found at https://gym.openai.com/envs/FrozenLake-v0/.
This environment consists of a 4 x 4 grid representing a lake. This gives 16 grid blocks, where each block is a start block (S), a frozen block (F), a goal block (G), or a hole block (H). The objective of the agent is to learn to navigate from the start to the goal without falling into a hole:
import gym
env = gym.make('FrozenLake-v0')  # loads the FrozenLake-v0 environment
env.reset()  # places the agent on the start block (S)
env.render()  # prints the grid and the agent's current position
-------------------
SFFF
FHFH
FFFH
HFFG
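The rendered map can also be treated as plain data. The following is a minimal sketch, not part of the Gym API, that looks up the block type for each of the 16 states. It assumes states are numbered row by row starting from the top-left block, and the names `LAKE_MAP` and `block_at` are hypothetical helpers introduced only for illustration.

```python
# The 4 x 4 layout, copied from the rendered output above.
LAKE_MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]


def block_at(state, lake_map=LAKE_MAP):
    """Return the block letter (S, F, H, or G) for a state index.

    Hypothetical helper: assumes row-major numbering, so state 0 is the
    top-left block and state 15 is the bottom-right block.
    """
    n_cols = len(lake_map[0])
    row, col = divmod(state, n_cols)
    return lake_map[row][col]


n_states = len(LAKE_MAP) * len(LAKE_MAP[0])

# Collect the states the agent must avoid and the state it must reach.
holes = [s for s in range(n_states) if block_at(s) == "H"]
goals = [s for s in range(n_states) if block_at(s) == "G"]

print("holes:", holes)
print("goal:", goals)
```

Under this numbering, the start block is state 0 and the goal block is state 15, so the agent has to find a path between them that avoids the hole states.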
In any given state, the agent can perform one of four actions: up, down, left, or right. The reward at each step is 0 except...