A Short Introduction To State-Action Values - Hado van Hasselt

A Short Introduction To Some Reinforcement Learning Algorithms

By Hado van Hasselt

State-Action Values


In this section we only consider algorithms that store an approximation of the expected cumulative future reward for each state-action pair. This approximation is called a Q value and is usually denoted Q(s,a), where s is the state and a is the action. Control theory uses a similar quantity, the value J(x,u), where x is the state and u is the action.
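To make the idea of storing one estimate per state-action pair concrete, here is a minimal sketch of a tabular Q-value store. The class name QTable, its method names, the step_size parameter, and the example states and actions are all hypothetical and chosen only for illustration. The update shown simply moves an estimate toward a given target; how that target is computed depends on the specific algorithm and is not covered here.

```python
from collections import defaultdict


class QTable:
    """Hypothetical tabular store of Q(s, a) estimates."""

    def __init__(self, actions, initial_value=0.0):
        self.actions = list(actions)
        # Unseen (state, action) pairs start at initial_value.
        self.values = defaultdict(lambda: initial_value)

    def get(self, state, action):
        """Return the current estimate Q(state, action)."""
        return self.values[(state, action)]

    def update(self, state, action, target, step_size=0.1):
        """Move Q(state, action) a fraction step_size toward target."""
        key = (state, action)
        self.values[key] += step_size * (target - self.values[key])

    def greedy_action(self, state):
        """Return the action with the highest estimate in this state.

        Ties are broken in favour of the action listed first.
        """
        return max(self.actions, key=lambda a: self.values[(state, a)])


# Illustrative usage with made-up states, actions, and target.
q = QTable(actions=["left", "right"])
q.update("s0", "right", target=1.0, step_size=0.5)
print(q.get("s0", "right"))   # 0.5
print(q.greedy_action("s0"))  # right
```

A dictionary keyed by (state, action) keeps the sketch independent of how states are encoded. For small, integer-indexed problems a two-dimensional array would serve equally well.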


