On this website, we use the following notation convention. Whenever a value (or function) is updated, the update is written as:

$$A_{t+1}(x) \xleftarrow{\alpha} B$$
This equation means that the value of A(x), which depends on some input x, is updated towards some value B. The subscripts indicate time steps, making this a discrete-time formulation. The α is a learning rate parameter, typically with 0 ≤ α ≤ 1, that indicates how large a step to take towards B.
If the values of A(x) for all possible inputs x are stored in a table, the notation above is equivalent to the following update:

$$A_{t+1}(x) = A_t(x) + \alpha \left( B - A_t(x) \right)$$
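The tabular case can be sketched in a few lines of Python. This is a minimal illustration, not code from this site: the function name `tabular_update`, the dictionary `A`, and the key `"x0"` are hypothetical, and the update is assumed to move the stored value a fraction α of the way towards B.

```python
def tabular_update(table, x, target, alpha):
    """Move table[x] a fraction alpha of the way towards target (B)."""
    table[x] = table[x] + alpha * (target - table[x])
    return table[x]


# Hypothetical table holding a single entry, initialised to 0.
A = {"x0": 0.0}

# Three updates towards B = 1 with alpha = 0.5: 0 -> 0.5 -> 0.75 -> 0.875.
for _ in range(3):
    tabular_update(A, "x0", target=1.0, alpha=0.5)

print(A["x0"])  # 0.875
```

With α = 1 the stored value is replaced by B in a single step, and with α = 0 it does not change at all.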
As another option, A(x) could be a parametrised function that depends on the input x and on some parameters w. Then, the notation can be understood as shorthand for the following update on each of the parameters of this function:

$$w_{i,t+1} = w_{i,t} + \alpha \left( B - A_t(x, w_t) \right) \frac{\partial A_t(x, w_t)}{\partial w_{i,t}}$$
This update can be interpreted as a gradient descent step on the squared difference between the output A(x,w) and the target output B, where any constant factor is absorbed into the learning rate. The learning rate parameter again regulates the size of the update. For instance, a neural network is often used to store values, and in this case the parameters w would correspond to the weights of this network. However, other function approximators can of course also be used. On this page, we will use the same general notation from the first equation to stand in for any of these options.
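The parametrised case can be sketched in the same way. The example below assumes a linear function approximator, A(x, w) = Σ w_i x_i, chosen only because its partial derivative with respect to w_i is simply x_i. The names `predict` and `gradient_update` are hypothetical, and a neural network would need its own gradient computation.

```python
def predict(x, w):
    """Hypothetical linear approximator: A(x, w) = sum_i w[i] * x[i]."""
    return sum(wi * xi for wi, xi in zip(w, x))


def gradient_update(x, w, target, alpha):
    """One step on every parameter: w_i += alpha * (B - A(x, w)) * dA/dw_i.

    For the linear model above, dA/dw_i is simply x[i].
    """
    error = target - predict(x, w)
    return [wi + alpha * error * xi for wi, xi in zip(w, x)]


x = [1.0, 2.0]  # input
w = [0.0, 0.0]  # initial parameters

# Repeated updates towards the target B = 3 shrink the error each step.
for _ in range(50):
    w = gradient_update(x, w, target=3.0, alpha=0.1)

print(predict(x, w))  # close to 3.0
```

Unlike the tabular update, a change to w also changes the output for other inputs that share those parameters, which is what allows a function approximator to generalise.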
My contact data can be found here.