The environment for the self-driving car is CarRacing-v0 from OpenAI Gym. The states the environment presents to the agent are images captured from the front of the simulated car, and for each action the agent takes in a given state the environment returns a reward. We penalize the reward if the car drives onto the grass and also normalize the reward to the range (-1, 1) for stable training, as sketched below. The detailed code for the environment follows the sketch.
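Here is a minimal sketch of the reward-shaping idea just described, assuming a hypothetical helper shape_reward with illustrative arguments on_grass, grass_penalty, and max_reward; it only illustrates the penalize-and-normalize step, not the exact interface used in the environment class below.

import numpy as np

def shape_reward(reward, on_grass, grass_penalty=0.8, max_reward=10.0):
    """Illustrative reward shaping: penalize grass driving, then scale into (-1, 1)."""
    if on_grass:
        reward -= grass_penalty                        # subtract a fixed penalty when the car is on grass
    reward = np.clip(reward, -max_reward, max_reward)  # bound the raw reward
    return reward / max_reward                         # normalize to the (-1, 1) range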
import gym
from gym import envs
import numpy as np
from helper_functions import rgb2gray, action_list, sel_action, sel_action_index
from keras import backend as K
seed_gym = 3                                 # random seed for the Gym environment
action_repeat_num = 8                        # number of consecutive frames over which an action is repeated
patience_count = 200                         # patience threshold for terminating unpromising episodes
epsilon_greedy = True                        # whether to use epsilon-greedy exploration
max_reward = 10                              # maximum reward magnitude
grass_penalty = 0.8                          # penalty for driving on the grass
max_num_steps = 200                          # maximum number of steps per episode
max_num_episodes = action_repeat_num*100     # total number of training episodes
'''
Environment to interact with...