next up previous
Next: Discretizing the State Space: Tile Coding Up: Function Approximation Previous: Function Approximation


Linear Function Approximation

We assume some feature-extraction function $ \phi : S \to \Phi$ is given that maps states into features in the feature space $ \Phi$ . We assume $ \Phi \subseteq \mathbb{R}^{D_\Phi}$ where $ D_\Phi$ is the dimension of the feature space. Often $ D_\Phi < D_S$ , which means that the feature vector of a state is smaller than the full representation of the state. This need not hold in general. A discussion about the choice of good features falls outside the scope of this chapter, but see for instance [Busoniu et al., 2010] for some considerations.

A linear function is a simple parametric function that depends on the feature vector. For instance, consider a value-approximating algorithm where the value function is approximated by

$\displaystyle V_t(s) = \vec{\theta}^T_t \vec{\phi}(s) \enspace.$ (3)

In equation (3) and in the rest of this chapter, $ \vec{\theta}_t \in \Theta$ denotes the adaptable parameter vector at time $ t$ and $ \phi(s) \in \Phi$ is the feature vector of state $ s$ . Since the function in equation (3) is linear in the parameters, we refer to it as a linear function approximator. Note that it may be non-linear in the state variables, depending on the feature extraction. In this section, the dimension $ D_\Theta$ of the parameter space is equal to the dimension of the feature space $ D_\Phi$ . This does not necessarily hold for other types of function approximation.

Linear function approximators are useful since they are better understood than non-linear function approximators. Applied to reinforcement learning, this has led to a number of convergence guarantees, under various additional assumptions [Sutton, 1984,Sutton, 1988,Dayan, 1992,Peng, 1993,Dayan and Sejnowski, 1994,Bertsekas and Tsitsiklis, 1996,Tsitsiklis and Van Roy, 1997]. From a practical point of view, linear approximators are useful because they are simple to implement and fast to compute.

Many problems have large state spaces in which each state can be represented efficiently with a feature vector of limited size. For instance, the double pole cart pole problem that we consider later in this chapter has continuous state variables, and therefore an infinitely large state space. Yet, every state can be represented with a vector with six elements. This means that we would need a table of infinite size, but can suffice with a parameter vector with just six elements if we use (3) with the state variables as features.

This reduction of tunable parameters of the value function comes at a cost. It is obvious that not every possible value function can be represented as a linear combination of the features of the problem. Therefore, our solution is limited to the set of value functions that can be represented with the chosen functional form. If one does not know beforehand what useful features are for a given problem, it can be beneficial to use non-linear function approximation, which we discuss in Section 2.2.



Subsections
next up previous
Next: Discretizing the State Space: Tile Coding Up: Function Approximation Previous: Function Approximation
2011-07-01