Discretizing the action space is essential for deep Q-learning: a three-dimensional continuous action space contains infinitely many possible actions, so the output layer of the deep Q-network cannot have a separate unit for each one. The three dimensions of the action space are as follows:
Steering: ∈ [-1, 1]
Gas: ∈ [0, 1]
Brake: ∈ [0, 1]
We convert this three-dimensional action space into four discrete actions of interest, each written as [steering, gas, brake]:
Brake: [0.0, 0.0, 0.0]
Sharp Left: [-0.6, 0.05, 0.0]
Sharp Right: [0.6, 0.05, 0.0]
Straight: [0.0, 0.3, 0.0]
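A minimal sketch of this discretization: the Q-network outputs one Q-value per discrete action, and the argmax index is mapped back to the continuous [steering, gas, brake] vector the environment expects. The names `DISCRETE_ACTIONS` and `to_continuous` are illustrative, not from the original implementation.

```python
import numpy as np

# Hypothetical lookup table: discrete action index -> continuous
# [steering, gas, brake] vector, matching the four actions above.
DISCRETE_ACTIONS = {
    0: np.array([0.0, 0.0, 0.0]),    # Brake
    1: np.array([-0.6, 0.05, 0.0]),  # Sharp Left
    2: np.array([0.6, 0.05, 0.0]),   # Sharp Right
    3: np.array([0.0, 0.3, 0.0]),    # Straight
}

def to_continuous(action_index):
    """Map a Q-network argmax index to the 3-D continuous action."""
    return DISCRETE_ACTIONS[action_index]
```

With this mapping, the network needs only four output units, and `to_continuous(np.argmax(q_values))` recovers an action the environment can execute.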