- How does an agent learn following the RL approach?
A) Through the experience it gains from the reward it receives each time it executes an action.
B) By randomly exploring the environment and discovering the best strategy by trial and error.
C) Via a neural network that outputs a Q-value as a function of the state of the system.
- Does an agent trained with RL have to make predictions of the expected outcome of an action?
A) Yes; this is a characteristic called model-free RL.
B) Only if it does not take the model-free RL approach.
C) No; by definition, RL methods only need to be aware of rewards and penalties to drive the learning process.
- If you run the Q-learning algorithm with a learning rate (alpha) of 0.7, what does this mean from the point of view of the learning process?
A) That you keep the top 30% of the state-action pairs that provide the higher...
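
For reference, the role of the learning rate in the Q-learning update can be sketched in a few lines of Python. This is a minimal illustration, not part of the quiz; the states, actions, and values below are invented for the example:

```python
# Minimal sketch of a single Q-learning update, showing what alpha controls.
# All state/action names and values here are illustrative assumptions.

def q_update(q, state, action, reward, next_state, alpha=0.7, gamma=0.9):
    """Blend the stored Q-value with the new TD target using learning rate alpha."""
    best_next = max(q[next_state].values())   # max over a' of Q(s', a')
    target = reward + gamma * best_next       # temporal-difference target
    # alpha = 0.7 means 70% of the updated value comes from the fresh target
    # and 30% from the previously stored Q-value.
    q[state][action] = (1 - alpha) * q[state][action] + alpha * target
    return q[state][action]

# Tiny worked example: two states, two actions each.
q = {"s0": {"left": 0.0, "right": 1.0},
     "s1": {"left": 2.0, "right": 0.0}}
new_val = q_update(q, "s0", "right", reward=1.0, next_state="s1")
# target = 1.0 + 0.9 * 2.0 = 2.8; update = 0.3 * 1.0 + 0.7 * 2.8 = 2.26
```

A higher alpha weights recent experience more heavily, while a lower alpha makes the stored estimates change more slowly.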