In this recipe, we will solve the Taxi environment with the SARSA algorithm and tune its hyperparameters using grid search.
We will start with a default set of hyperparameter values for the SARSA model, chosen based on intuition and a number of trial runs. We will then use grid search to find the best-performing set of values among the candidates we try.
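Before we turn to Taxi itself, the overall idea can be sketched in a few lines. The following is a minimal illustration, not the recipe's actual implementation. To keep it self-contained, it uses a hypothetical six-cell corridor as a stand-in for the Taxi environment, and the function names (`step`, `run_sarsa`, `grid_search`), the hyperparameter names (`alpha`, `gamma`, `epsilon`), and the candidate values are all assumptions made for this example rather than the defaults used in this recipe.

```python
import itertools
import random

N_STATES = 6      # hypothetical corridor with cells 0..5; cell 5 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy stand-in for an environment step: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 10.0 if done else -1.0
    return next_state, reward, done


def epsilon_greedy(Q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(Q[state])
    return rng.choice([a for a in ACTIONS if Q[state][a] == best])


def run_sarsa(alpha, gamma, epsilon, n_episodes=200, max_steps=100, seed=0):
    """Train tabular SARSA and return the mean reward of the last 50 episodes."""
    rng = random.Random(seed)
    Q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    total_rewards = []
    for _ in range(n_episodes):
        state = 0
        action = epsilon_greedy(Q, state, epsilon, rng)
        total = 0.0
        for _ in range(max_steps):
            next_state, reward, done = step(state, action)
            next_action = epsilon_greedy(Q, next_state, epsilon, rng)
            # On-policy target: bootstrap from the action actually chosen next
            target = reward + (0.0 if done else gamma * Q[next_state][next_action])
            Q[state][action] += alpha * (target - Q[state][action])
            total += reward
            state, action = next_state, next_action
            if done:
                break
        total_rewards.append(total)
    return sum(total_rewards[-50:]) / 50


def grid_search(alphas, gammas, epsilons):
    """Evaluate every combination and return the best one with all scores."""
    results = {}
    for alpha, gamma, epsilon in itertools.product(alphas, gammas, epsilons):
        results[(alpha, gamma, epsilon)] = run_sarsa(alpha, gamma, epsilon)
    best = max(results, key=results.get)
    return best, results


# Illustrative candidate values only
best_params, all_results = grid_search([0.1, 0.5], [0.9, 0.99], [0.01, 0.1])
print(best_params, all_results[best_params])
```

Grid search here simply trains one SARSA agent per combination and keeps the combination with the highest average reward over the final episodes. The cost grows multiplicatively with the number of candidate values per hyperparameter, so in practice the grid is usually kept small.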