Starting from this recipe, we will develop FA algorithms to solve environments with continuous state variables. We will begin by approximating Q-functions using linear functions and gradient descent.
The main idea of FA is to use a set of features to estimate Q values. This is extremely useful for processes with a large state space, where the Q table becomes too large to store or fill in. There are several ways to map the features to the Q values; for example, linear approximation, which uses a linear combination of the features, and neural networks. With linear approximation, the value of a state for a given action is expressed as a weighted sum of the features:
V(s) = θ1F1(s) + θ2F2(s) + ... + θnFn(s)
Here, F1(s), F2(s), ..., Fn(s) are the features computed from the input state, s, and θ1, θ2, ..., θn are the weights applied to the corresponding features. Or we can put...
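The weighted sum and its gradient descent update can be sketched in a few lines of plain Python. This is a minimal illustration and not the recipe's own implementation. The feature map, the two-variable state (position, velocity), the learning rate, the target value, and the function names features, estimate, and sgd_update are all assumptions made for the example.

```python
def features(state):
    """Hypothetical feature map F(s): bias, position, velocity, and their product."""
    position, velocity = state
    return [1.0, position, velocity, position * velocity]


def estimate(weights, state):
    """Weighted sum of the features: theta_1*F_1(s) + ... + theta_n*F_n(s)."""
    return sum(w * f for w, f in zip(weights, features(state)))


def sgd_update(weights, state, target, lr=0.1):
    """One gradient descent step on the squared error 0.5 * (target - estimate)**2."""
    error = target - estimate(weights, state)
    # For a linear model, the gradient of the estimate w.r.t. theta_i is F_i(s).
    return [w + lr * error * f for w, f in zip(weights, features(state))]


# Assumed setup: one weight vector per action, all weights starting at zero.
weights = [0.0, 0.0, 0.0, 0.0]
state = (0.5, -1.0)

# Repeatedly move the estimate for this state toward an assumed target of 2.0.
for _ in range(200):
    weights = sgd_update(weights, state, target=2.0)

value = estimate(weights, state)
```

In a Q-learning setting, the target would typically be the reward plus the discounted maximum estimate for the next state, and a separate weight vector would be kept for each action. The fixed target here only demonstrates that the update drives the estimate toward it.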