This page contains errata for the MDL Tutorial, which appeared as

P.D. Grünwald, "A tutorial introduction to the minimum description length principle" (80 pages), Chapters 1 and 2 in the collection *Advances in Minimum Description Length: Theory and Applications* (edited by P. Grünwald, I.J. Myung, M. Pitt), MIT Press, April 2005.

A preliminary version appeared in the CoRR archive (the computer science branch of the physics arXiv) under code math.ST/0406077. This is also the version available on my homepage.

- The definition of Fisher information on page 48 is incorrect: log (logarithm to base 2) should be replaced by ln (logarithm to base e). Note however that, perhaps confusingly, (2.21) on page 47 *is* correct: there the log should indeed be taken to base 2.

- Section 2.2.2, Example 2 (Page 27, 6th line of the on-line version): inside the summation from 1 to m, 2^m should be 2^i.
- Equation 2.4 (Page 29 of both the on-line version and the book): the sum should be taken over (calligraphic) Z, not X.
- Example 2.5 (Page 30 of both on-line version and book): after the first equal ("=") sign in the equation inside the example, a minus ("-") should be added.
- Example 2.8, Equation 2.8 (Page 36 both in the book and the online version): the k in front of k log (n+1) should be k' rather than k (it's the number of parameters, not the Markov chain order).
- Similarly, in Equation (2.9), the k refers to the number of parameters, so *in the context of Example 2.8* it should be called k' (in later sections, k is used for the number of parameters).
- In Example 2.8, Equation (2.8), one might argue that there should also be an extra term of size k: the number of bits needed to encode the starting state of the chain.
- The calculations in Example 2.17 (page 48 in both the on-line version and the book) are incorrect. To add to the confusion, the calculation in the book is different from the calculation in the on-line version; yet *both* are incorrect! The correct calculation is as follows: the Fisher information is given by 1/(t(1-t)) (with t replaced by "theta"; in the on-line version it was wrongly stated that the Fisher information is t(1-t)). The integral of the square root of the Fisher information from 0 to 1 should therefore really be π (and not π/8, as claimed in the on-line version, or 2, as claimed in the book). Thus, the correct calculation of COMP(M) gives COMP(M) = 0.5 log n + 0.5 log (0.5 π) + o(1).
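As a quick numerical sanity check (not part of the tutorial itself), the corrected numbers for Example 2.17 can be verified directly; the sketch below, for the Bernoulli model, uses a simple midpoint rule to approximate the integral of the square root of the Fisher information:

```python
import math

# For the Bernoulli model the (corrected) Fisher information is
# I(theta) = 1/(theta*(1-theta)); the integral of sqrt(I(theta))
# over (0, 1) should equal pi, as stated in the erratum.

def integral_sqrt_fisher(n_steps=1_000_000):
    """Midpoint rule for the integrable singular integrand sqrt(1/(t*(1-t)))."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * h  # midpoints avoid the singularities at t = 0 and t = 1
        total += math.sqrt(1.0 / (t * (1.0 - t))) * h
    return total

print(integral_sqrt_fisher())  # close to pi = 3.14159...

# Corrected constant term in COMP(M) = 0.5 log n + 0.5 log (0.5 pi) + o(1),
# with logarithms to base 2 as in the tutorial:
print(0.5 * math.log2(0.5 * math.pi))
```

The midpoint rule is used only because it sidesteps the endpoint singularities; any quadrature scheme that avoids evaluating the integrand at 0 and 1 would do.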

Last updated: April 2006. Thanks to Wouter Koolen, Kinh Tieu, Michal Przykucki and Chris Sims for pointing out some of these mistakes.