A Short Introduction To Sarsa - Hado van Hasselt

A Short Introduction To Some Reinforcement Learning Algorithms

By Hado van Hasselt

Sarsa


Sarsa is similar to Q-learning, but its update uses the value of the action actually performed in the next state, rather than the maximum value over the available actions. Its update rule is:

Q_{t+1}(s_t,a_t) \overset{\alpha_t}{\longleftarrow} r_t + \gamma Q_t(s_{t+1},a_{t+1})
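
The arrow notation is not defined in this section. Assuming it is shorthand for moving the old estimate a fraction \alpha_t of the way toward the target on the right-hand side (the usual reading), the update can be written out in full as:

Q_{t+1}(s_t,a_t) = Q_t(s_t,a_t) + \alpha_t \left[ r_t + \gamma Q_t(s_{t+1},a_{t+1}) - Q_t(s_t,a_t) \right]

Under the same reading, Q-learning would use \max_a Q_t(s_{t+1},a) in place of Q_t(s_{t+1},a_{t+1}).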

Neutral characteristics

Advantages

Disadvantages

Algorithm

The Sarsa algorithm in schematic form:

[Figure: Sarsa algorithm]

Note that because the update requires the next action a_{t+1}, that action must be selected before the update can be performed. This makes the algorithm a little more complex than that of Q-learning or Expected-Sarsa.
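
As an illustration, the following is a minimal sketch of tabular Sarsa in Python, not necessarily identical to the schematic form on the original page. Everything beyond the update rule is an assumption made for the example: the epsilon-greedy action selection, the constant step size, the parameter values, and the toy chain environment (chain_step), which is hypothetical. It also assumes the arrow in the update rule means a step of size alpha toward the target.

```python
import random


def epsilon_greedy(Q, state, n_actions, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = Q[state]
    best = max(values)
    # Break ties randomly so early episodes are not biased toward action 0.
    return rng.choice([a for a, v in enumerate(values) if v == best])


def chain_step(state, action, n_states):
    """Hypothetical toy environment: action 1 moves right, action 0 moves left.

    Reaching the right-most state gives reward 1 and ends the episode.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def sarsa(n_states=5, n_actions=2, episodes=500, alpha=0.1, gamma=0.9,
          epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(Q, s, n_actions, epsilon, rng)
        done = False
        while not done:
            s_next, r, done = chain_step(s, a, n_states)
            # The next action is chosen before the update, because the Sarsa
            # target uses Q(s_next, a_next) for the action that will be taken.
            a_next = epsilon_greedy(Q, s_next, n_actions, epsilon, rng)
            # No bootstrapping from a terminal state.
            target = r if done else r + gamma * Q[s_next][a_next]
            Q[s][a] += alpha * (target - Q[s][a])
            s, a = s_next, a_next
    return Q
```

Calling sarsa() returns the learned table as a list of rows, one per state. The line that selects a_next before the update is the extra step that Q-learning does not need, since its target takes the maximum over actions instead.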


