Value Iteration
At a high level, value iteration consists of two steps:
- Compute the optimal value function by taking the maximum over the Q function, that is, $V^*(s) = \max_a Q^*(s, a)$
- Extract the optimal policy from the computed optimal value function
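The two steps can be sketched in Python roughly as follows. This is a minimal illustration rather than a definitive implementation: it assumes a small tabular MDP whose transition probabilities `P` and expected rewards `R` are known in advance, and the names `value_iteration`, `P`, `R`, `gamma`, `tol`, and `max_iters`, as well as the two-state toy MDP, are hypothetical choices made for this example. The Q function is computed with the standard Bellman optimality backup.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Tabular value iteration (illustrative sketch).

    P: array of shape (S, A, S), P[s, a, s2] = probability of s -> s2 under a.
    R: array of shape (S, A), expected immediate reward for taking a in s.
    All argument names are assumptions made for this example.
    """
    num_states = P.shape[0]
    V = np.zeros(num_states)

    # Step 1: compute the optimal value function by repeatedly taking
    # the maximum over the Q function.
    for _ in range(max_iters):
        Q = R + gamma * (P @ V)      # Bellman backup, shape (S, A)
        V_new = Q.max(axis=1)        # V(s) = max_a Q(s, a)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    # Step 2: extract the optimal policy from the computed value function.
    Q = R + gamma * (P @ V)
    policy = Q.argmax(axis=1)
    return V, policy


# Hypothetical two-state, two-action MDP with deterministic transitions.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0   # state 0, action 0: stay in state 0
P[0, 1, 1] = 1.0   # state 0, action 1: move to state 1
P[1, 0, 1] = 1.0   # state 1, action 0: stay in state 1
P[1, 1, 0] = 1.0   # state 1, action 1: move to state 0
R = np.array([[0.0, 1.0],
              [2.0, 0.0]])

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

For this toy MDP the values converge to approximately 19 for state 0 and 20 for state 1, and the extracted policy is to move from state 0 to state 1 and then stay there.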
The algorithm of value iteration is given as follows: