- Keras abstracts many of the functionalities provided by TensorFlow and offers a high-level frontend for building complex deep learning architectures.
- CartPole is effectively a binary prediction problem because the agent chooses between exactly two actions at every step: push the cart left or push it right.
- When the state space is very large, some states can be grouped together and treated similarly when the optimal actions to take from those states are the same.
- Experience replay updates the Q-function using samples of past experiences rather than updating it after every action. This helps prevent overfitting by smoothing away outlier actions and, when the replay memory has a fixed size, by having the agent forget its oldest experiences in favor of new ones.
- An RL model approximating Q-values does not know what those actual Q-values are and progressively develops estimates for those values. A...
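
The experience replay answer above can be illustrated with a minimal sketch. The `ReplayBuffer` class, its method names, and the capacity and batch-size values are hypothetical choices made for illustration, not code from the source. The sketch assumes a fixed-size memory and uniform random sampling, which is one common way to set this up.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of past transitions (hypothetical helper)."""

    def __init__(self, capacity):
        # A deque with maxlen drops its oldest entry once it is full,
        # which is how old experiences get "forgotten" in this sketch.
        self.memory = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        # Store one transition as a tuple.
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw a uniform random batch without replacement, capped at
        # the number of transitions currently stored.
        k = min(batch_size, len(self.memory))
        return random.sample(list(self.memory), k)

    def __len__(self):
        return len(self.memory)


# Toy usage: integers stand in for real environment states.
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add(state=step, action=step % 2, reward=1.0,
               next_state=step + 1, done=False)

# Only the three most recent transitions remain in memory.
batch = buffer.sample(batch_size=2)
```

In a full agent, each sampled batch would feed one Q-function update, so a single unusual transition has less influence than it would if the update ran after every action.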