Machine Learning Theory
Fall 2017, Wednesday 14:00 to 16:45 in UvA SP C0.110/D1.111.
Aim
Machine learning is one of the fastest-growing areas of science, with far-reaching applications. In this course we focus on the fundamental ideas, theoretical frameworks, and rich array of mathematical tools and techniques that power machine learning. The course covers the core paradigms and results in machine learning theory with a mix of probability and statistics, combinatorics, information theory, optimization and game theory.
During the course you will learn to
- Formalize learning problems in statistical and game-theoretic settings.
- Examine the statistical complexity of learning problems using the core notions of complexity.
- Analyze the statistical efficiency of learning algorithms.
- Master the design of learning strategies using proper regularization.
MasterMath Website
This course is offered as part of the MasterMath program. To participate for credit, sign up here. We also use their ELO for submitting homework and receiving grades.
Prerequisites
The course will be self-contained.
- Basic probability theory. (Conditional) probability and expectations. Discrete and continuous distributions.
- Basic linear algebra. Finite-dimensional vector spaces. Eigendecomposition.
- Basic calculus.
No programming will be required.
Lecturers
The lectures will be recorded. See the MasterMath ELO for the link and password.
Mode
The grade will be composed as follows.
- 40%: weekly homework, to be handed in before the next lecture.
- 30%: midterm exam.
- 30%: final exam.
You are strongly encouraged to solve and submit your weekly homework in teams; there is no limit on the team size. Exams are personal.
Schedule
When: What (Who)
Wed 13 Sep: Intro to statistical and game-theoretic learning. Proper losses. Notes. (Wouter)
Wed 20 Sep: PAC learnability for finite hypothesis classes, realizable case. Chapters 1 and 2 in [1]. SVM + kernels. (Rianne)
Wed 27 Sep: Agnostic PAC learning and learning via uniform convergence. Chapters 3 and 4 in [1]. (Rianne)
Wed 4 Oct: No-Free-Lunch. VC dimension. Fundamental theorem of PAC learning. Chapters 5-6.4 in [1]. (Rianne)
Wed 11 Oct: Proof of the fundamental theorem of PAC learning. Sauer's lemma. Chapter 6.5 in [1]. Non-uniform learnability, SRM, other notions of learning. (Rianne)
Wed 18 Oct: Full-information online learning (experts). Specialists. Notes. (Wouter)
Wed 25 Oct: Midterm exam (list of content). (TBD)
Wed 1 Nov: Rademacher complexity & covering numbers. Chapters 26 and 27 in [1]. (Peter)
Wed 8 Nov: Online convex optimization. Sections 3.1 and 3.3 in [2] and Notes. (Wouter)
Wed 15 Nov: Bandits. UCB and EXP3. Chapters 2.2 and 3.1 in [3]. (Wouter)
Wed 22 Nov: Non-stationary environments. Tracking and long-term memory using specialists. Notes. (Wouter)
Wed 29 Nov: Exp-concavity. Online Newton Step. Vovk mixability. Chapter 4 in [2], Notes. (Wouter)
Wed 6 Dec: Rademacher, continued; log-loss prediction; Minimum Description Length (MDL). (Peter)
Wed 13 Dec: Boosting. AdaBoost. Chapter 10 in [1]. (Wouter)
Wed 20 Dec: Tensor decompositions + wrap-up. (Wouter)
Wed 24 Jan: Final exam (list of content). Location: IWO-geel.

Exams
Midterm
- Intro to statistical and game-theoretic learning. Proper losses. (Handout)
- PAC learnability for finite hypothesis classes, realizable case (Chapters 1 and 2). Halfspaces, SVMs and kernels (only what is discussed in the lecture).
- Agnostic PAC learning and learning via uniform convergence (Chapters 3 and 4).
- No-Free-Lunch. VC dimension. Fundamental theorem of PAC learning; proof of the fundamental theorem, symmetrisation. Sauer's lemma + proof (Chapters 5.1 and 6, excluding 6.5.2; proofs as in the lecture; for similar proofs see Chapters 28.1 and 28.3). Non-uniform learnability, SRM (7.1 and 7.2).
Final
The final exam will test you on the online learning component of the class plus Rademacher complexity. Below is the list of topics. The weekly homework prepares you for the exam; the solutions can be found on the MasterMath ELO.
If below we say "you should read and understand a section", this means that:
- if the exam refers to a Lemma/Theorem/Formula in the section, we will always state it explicitly (e.g. on the cheat sheet), so you don't have to learn formulas by heart;
- however, you should understand all these Lemmas/Theorems/Formulas, since we may ask questions about them and will not give further explanations of formulas.
Online Learning
If below we say "algorithm and regret bound", we may ask about the setting and its main algorithm, and have you do computations involving the corresponding regret bound. If we add "analysis", you should be able to re-prove the regret bound at the exam.

Experts and specialists
- Mix loss game, lower bound, Aggregating Algorithm regret bound and analysis.
- Dot loss game, Hedge algorithm regret bound and analysis.
- Mixable loss functions. Aggregating Algorithm regret bound and analysis.
- Mix loss game with specialists. Specialists Aggregating Algorithm (SAA) regret bound and analysis.
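For intuition on the dot loss game, the Hedge algorithm fits in a few lines. This is an illustrative sketch, not course code: the function name, loop structure, and toy data are invented here; the learning rate eta = sqrt(8 ln K / T) and the regret bound sqrt(T ln K / 2) for losses in [0, 1] are the standard textbook choices.

```python
import math

def hedge(losses, eta):
    """Hedge (exponential weights) over K experts in the dot loss game.

    losses: one list of K expert losses in [0, 1] per round.
    Returns the algorithm's cumulative dot loss and each expert's total loss.
    """
    K = len(losses[0])
    log_w = [0.0] * K                        # log-weights; start uniform
    alg_loss, exp_loss = 0.0, [0.0] * K
    for round_losses in losses:
        m = max(log_w)                       # normalize in log-space for stability
        w = [math.exp(lw - m) for lw in log_w]
        total = sum(w)
        p = [wi / total for wi in w]         # play the normalized weights
        alg_loss += sum(pi * li for pi, li in zip(p, round_losses))
        for k, lk in enumerate(round_losses):
            log_w[k] -= eta * lk             # exponential weight update
            exp_loss[k] += lk
    return alg_loss, exp_loss

# Toy run: two experts, the first always right, the second always wrong.
T, K = 100, 2
eta = math.sqrt(8 * math.log(K) / T)
alg, experts = hedge([[0.0, 1.0]] * T, eta)
assert alg - min(experts) <= math.sqrt(T * math.log(K) / 2)  # regret bound holds
```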

Online Convex Optimisation
- Convex losses, Online Gradient Descent regret bound and analysis.
- Strongly convex losses, Online Gradient Descent regret bound and analysis.
- Exp-concave losses, Online Newton Step regret bound.
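A bare-bones version of projected Online Gradient Descent for convex losses, in the spirit of Section 3.1 of [2], might look as follows. The function names and the toy quadratic loss are invented for illustration; the step size eta_t = D/(G sqrt(t)) is the standard choice giving O(GD sqrt(T)) regret for a feasible set of diameter D and gradient norms bounded by G.

```python
import math

def ogd(grad, project, x0, T, D, G):
    """Projected Online Gradient Descent with step size eta_t = D / (G * sqrt(t)).

    grad(t, x): gradient of the round-t loss at x; project(x): Euclidean
    projection onto the feasible set; D: its diameter; G: gradient norm bound.
    Returns the points x_1, ..., x_T that were played.
    """
    x, plays = list(x0), []
    for t in range(1, T + 1):
        plays.append(list(x))
        g = grad(t, x)
        eta = D / (G * math.sqrt(t))
        x = project([xi - eta * gi for xi, gi in zip(x, g)])
    return plays

# Toy run: the same loss f(x) = (x - 0.8)^2 every round, on the interval [0, 1].
plays = ogd(grad=lambda t, x: [2 * (x[0] - 0.8)],
            project=lambda x: [min(1.0, max(0.0, x[0]))],
            x0=[0.0], T=1000, D=1.0, G=2.0)
avg = sum(p[0] for p in plays) / len(plays)
assert abs(avg - 0.8) < 0.1                  # iterates concentrate at the minimizer
```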

Non-stationarity
- Switching between experts. Fixed Share algorithm as a reduction to specialists. Regret bound and analysis.
- Long-term memory. Mixing Past Posteriors algorithm as a reduction to specialists. High-level construction.
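As a hint of how the Fixed Share idea works (implemented here directly on the weights, rather than via the specialists reduction used in the lectures): after Hedge's exponential update, a small uniform "share" alpha is mixed in, so no expert's weight drops below alpha/K and weight can flow back quickly after the best expert switches. The names and parameter values below are an illustrative sketch.

```python
import math

def fixed_share(losses, eta, alpha):
    """Fixed Share on K experts: exponential-weights update, then share step."""
    K = len(losses[0])
    w = [1.0 / K] * K
    history = []
    for round_losses in losses:
        history.append(list(w))                  # weights played this round
        v = [wi * math.exp(-eta * li) for wi, li in zip(w, round_losses)]
        Z = sum(v)
        w = [(1 - alpha) * vi / Z + alpha / K for vi in v]   # share step
    return history

# Toy run: the best expert switches halfway through.
losses = [[0.0, 1.0]] * 50 + [[1.0, 0.0]] * 50
hist = fixed_share(losses, eta=0.5, alpha=0.05)
assert hist[25][0] > hist[25][1]                 # tracks expert 0 before the switch
assert hist[-1][1] > hist[-1][0]                 # tracks expert 1 after the switch
```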

Bandits
- Stochastic bandit setting. UCB algorithm, regret bound.
- Adversarial bandit setting. EXP3 algorithm, regret bound.
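A minimal sketch of UCB1 for stochastic Bernoulli bandits: play the arm maximizing empirical mean plus a confidence radius. The radius sqrt(2 ln t / n_k) is the standard choice from the analysis; the function name and toy arm means are made up for illustration.

```python
import math, random

def ucb1(arm_means, T, rng):
    """UCB1 on Bernoulli arms: play the arm maximizing mean + confidence radius."""
    K = len(arm_means)
    counts, sums, total = [0] * K, [0.0] * K, 0.0
    for t in range(1, T + 1):
        if t <= K:
            a = t - 1                            # play each arm once to initialize
        else:
            a = max(range(K), key=lambda k: sums[k] / counts[k]
                    + math.sqrt(2 * math.log(t) / counts[k]))
        reward = 1.0 if rng.random() < arm_means[a] else 0.0   # Bernoulli reward
        counts[a] += 1
        sums[a] += reward
        total += reward
    return total, counts

# Toy run: two arms with success probabilities 0.9 and 0.1.
total, counts = ucb1([0.9, 0.1], T=2000, rng=random.Random(0))
assert counts[0] > counts[1]                     # the better arm is pulled far more often
```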

Boosting
- AdaBoost algorithm, iteration bound, analysis.
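A compact sketch of AdaBoost over a finite pool of weak hypotheses, following the usual description (as in Chapter 10 of [1]); the toy 1-D dataset and stump pool are invented here. No single hypothesis in the pool classifies the [+, -, +] labeling correctly, but three boosting rounds do.

```python
import math

def adaboost(X, y, pool, rounds):
    """AdaBoost over a finite pool of weak hypotheses h: x -> {-1, +1}."""
    n = len(X)
    D = [1.0 / n] * n                            # distribution over examples
    chosen, alphas = [], []
    for _ in range(rounds):
        # Weighted error of each hypothesis under the current distribution.
        errs = [sum(Di for Di, xi, yi in zip(D, X, y) if h(xi) != yi)
                for h in pool]
        i = min(range(len(pool)), key=lambda j: errs[j])
        eps, h = max(min(errs[i], 1 - 1e-12), 1e-12), pool[i]
        alpha = 0.5 * math.log((1 - eps) / eps)
        chosen.append(h)
        alphas.append(alpha)
        # Reweight: misclassified examples gain mass, then renormalize.
        D = [Di * math.exp(-alpha * yi * h(xi)) for Di, xi, yi in zip(D, X, y)]
        Z = sum(D)
        D = [Di / Z for Di in D]
    return chosen, alphas

def predict(chosen, alphas, x):
    return 1 if sum(a * h(x) for a, h in zip(alphas, chosen)) >= 0 else -1

# Toy 1-D dataset labeled [+, -, +].
X, y = [0, 1, 2], [1, -1, 1]
pool = [lambda x: 1 if x < 0.5 else -1,
        lambda x: -1 if x < 1.5 else 1,
        lambda x: 1]                             # constant hypothesis
chosen, alphas = adaboost(X, y, pool, rounds=3)
assert all(predict(chosen, alphas, x) == yi for x, yi in zip(X, y))
```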

Log-Loss Prediction (covered in the Rademacher lectures)
- Sections 9.1-9.4: read and understand completely.
Rademacher Complexity & Covering Numbers (Chapters 26 and 27)
Proofs in this part of the course are sometimes difficult, and we will not ask you to reproduce proofs from the book (except very simple ones). However, we do ask you to read and understand the proofs, since that helps in getting the right insights for the questions we do ask, which are similar to (but not the same as!) what you were asked in the homework exercises.

- Sections 26.1 and 26.2: read & understand both sections completely, except the proof of Thm 26.5 and Lemma 26.3 (McDiarmid).

- Sections 26.3, 26.4 and 26.5 can be skipped.

- Section 27.1: read & understand completely. Section 27.2: read and understand the statement of Lemma 27.4 (bound on Rademacher complexity in terms of covering numbers); no need to read or understand the proof.
We will not ask you about VC dimension or PAC learning. That was covered by the midterm. We will also not ask you about the tensor decompositions bonus lecture.
Material
We will make use of the following sources: