Machine Learning Theory
Fall 2017, Wednesday 14:00 to 16:45 in UvA SP C0.110/D1.111.
Aim
Machine learning is one of the fastest-growing areas of science, with far-reaching applications. In this course we focus on the fundamental ideas, theoretical frameworks, and rich array of mathematical tools and techniques that power machine learning. The course covers the core paradigms and results in machine learning theory with a mix of probability and statistics, combinatorics, information theory, optimization and game theory.
During the course you will learn to
- Formalize learning problems in statistical and game-theoretic settings.
- Examine the statistical complexity of learning problems using the core notions of complexity.
- Analyze the statistical efficiency of learning algorithms.
- Master the design of learning strategies using proper regularization.
MasterMath Website
This course is offered as part of the MasterMath program. To participate for credit, sign up here. We also use the MasterMath ELO (electronic learning environment) for submitting homework and receiving grades.
Prerequisites
The course will be self-contained. We assume only the following basic background:
- Basic probability theory: (conditional) probability and expectation; discrete and continuous distributions.
- Basic linear algebra: finite-dimensional vector spaces; eigendecomposition.
- Basic calculus.
- No programming will be required.
Lecturers
The lectures will be recorded. See the MasterMath ELO for the link and password.
Mode
The grade will be composed as follows:
- 40%: weekly homework, to be handed in before the next lecture.
- 30%: mid-term exam.
- 30%: final exam.
You are strongly encouraged to solve and submit your weekly homework in teams; there is no limit on the team size. Exams are to be taken individually.
Schedule
When | What | Who
Wed 13 Sep | Intro to statistical and game-theoretic learning. Proper losses. Notes | Wouter
Wed 20 Sep | PAC learnability for finite hypothesis classes, realizable case. Chapters 1 and 2 in [1]. SVM + kernels. | Rianne
Wed 27 Sep | Agnostic PAC learning and learning via uniform convergence. Chapters 3 and 4 in [1]. | Rianne
Wed 4 Oct | No-Free-Lunch theorem. VC dimension. Fundamental theorem of PAC learning. Chapters 5-6.4 in [1]. | Rianne
Wed 11 Oct | Proof of the fundamental theorem of PAC learning. Sauer's lemma. Chapter 6.5 in [1]. Nonuniform learnability, SRM, other notions of learning. | Rianne
Wed 18 Oct | Full-information online learning (experts). Specialists. Notes | Wouter
Wed 25 Oct | Midterm exam (list of content) | TBD
Wed 1 Nov | Rademacher complexity & covering numbers. Chapters 26 and 27 in [1]. | Peter
Wed 8 Nov | Online convex optimization. Sections 3.1 and 3.3 in [2] and Notes. | Wouter
Wed 15 Nov | Bandits. UCB and EXP3. Chapters 2.2 and 3.1 in [3]. | Wouter
Wed 22 Nov | Non-stationary environments. Tracking and long-term memory using specialists. Notes | Wouter
Wed 29 Nov | Exp-concavity. Online Newton Step. Vovk mixability. Chapter 4 in [2], Notes. | Wouter
Wed 6 Dec | Rademacher complexity, continued; log-loss prediction; Minimum Description Length (MDL). | Peter
Wed 13 Dec | Boosting. AdaBoost. Chapter 10 in [1]. | Wouter
Wed 20 Dec | Tensor decompositions + wrap-up. | Wouter
Wed 24 Jan | Final exam (list of content). Location: IWO-geel. |
Exams
Midterm
- Intro to statistical and game-theoretic learning. Proper losses. (Handout)
- PAC learnability for finite hypothesis classes, realizable case (Chapters 1 and 2). Half-spaces, SVMs and kernels (only what is discussed in the lecture).
- Agnostic PAC learning and learning via uniform convergence (Chapters 3 and 4).
- No-Free-Lunch theorem. VC dimension. Fundamental theorem of PAC learning, its proof, and symmetrization. Sauer's lemma + proof. (Chapter 5.1 and Chapter 6, excluding 6.5.2; proofs as in the lecture - for similar proofs see Chapters 28.1 and 28.3.) Non-uniform learnability and SRM (Chapters 7.1 and 7.2).
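For quick reference while revising (this only restates a standard result covered in [1], possibly in slightly different notation): if a hypothesis class H has VC dimension d, Sauer's lemma bounds its growth function τ_H(m), the maximal number of distinct behaviours of H on m points, by
\[
  \tau_{\mathcal{H}}(m) \;\le\; \sum_{i=0}^{d} \binom{m}{i} \;\le\; \Bigl(\tfrac{e\,m}{d}\Bigr)^{d}
  \qquad \text{for all } m \ge d .
\]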
Final
The final exam will test you on the online learning component of the class plus Rademacher complexity. Below is the list of topics. The weekly homework prepares you for the exam; the solutions can be found on the MasterMath ELO.
If below we say 'you should read and understand a section', this means that:
- if in the exam we refer to a Lemma/Theorem/Formula in the section, we will always state it explicitly (e.g. on the cheat sheet), so you don't have to learn formulas by heart;
- however, you should understand all these Lemmas/Theorems/Formulas, since we may ask questions about them and we will not give further explanations of formulas.
Online Learning
If below we say "algorithm and regret bound", we might question you on the setting, its main algorithm and (do computations involving) the corresponding regret bound. If we add "analysis", you should be able to reprove the regret bound at the exam. (For orientation, minimal illustrative sketches of the Hedge, Online Gradient Descent and EXP3 updates appear right after this list.)
- Experts and specialists
  - Mix loss game, lower bound, Aggregating Algorithm regret bound and analysis.
  - Dot loss game, Hedge algorithm regret bound and analysis.
  - Mixable loss functions. Aggregating Algorithm regret bound and analysis.
  - Mix loss game with specialists. Specialists Aggregating Algorithm (SAA) regret bound and analysis.
- Online Convex Optimisation
  - Convex losses, Online Gradient Descent regret bound and analysis.
  - Strongly convex losses, Online Gradient Descent regret bound and analysis.
  - Exp-concave losses, Online Newton Step regret bound.
- Non-stationarity
  - Switching between experts. Fixed Share algorithm as a reduction to specialists. Regret bound and analysis.
  - Long-term memory. Mixing Past Posteriors algorithm as a reduction to specialists. High-level construction.
- Bandits
  - Stochastic bandit setting. UCB algorithm, regret bound.
  - Adversarial bandit setting. EXP3 algorithm, regret bound.
- Boosting
  - AdaBoost algorithm, iteration bound, analysis.
- Log-Loss Prediction (covered in Rademacher lectures)
  - Sections 9.1-9.4: read and understand completely.
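Although no programming is required for the course, a concrete sketch can help when revising the update rules. Below is a minimal Hedge (exponential weights) loop for the dot loss game with K experts and learning rate eta. This is an illustrative reconstruction, not the formulation from the lecture notes; the names (hedge, loss_matrix) are invented for this example.

import numpy as np

def hedge(loss_matrix, eta):
    """Hedge (exponential weights) for the dot loss game.

    loss_matrix: T x K array of expert losses in [0, 1].
    Returns the learner's cumulative dot loss and the cumulative loss of
    the best expert, so the regret can be read off. Illustrative only.
    """
    T, K = loss_matrix.shape
    weights = np.ones(K)                          # uniform prior over experts
    learner_loss = 0.0
    for t in range(T):
        p = weights / weights.sum()               # play the normalized weights
        learner_loss += p @ loss_matrix[t]        # dot loss of the mixture
        weights *= np.exp(-eta * loss_matrix[t])  # exponential weight update
    best_expert_loss = loss_matrix.sum(axis=0).min()
    return learner_loss, best_expert_loss

# With eta = sqrt(2 ln(K) / T), the regret is of order sqrt(T ln K).
rng = np.random.default_rng(0)
losses = rng.random((1000, 5))
print(hedge(losses, eta=np.sqrt(2 * np.log(5) / 1000)))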
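In the same spirit, here is a bare-bones projected Online Gradient Descent sketch for convex losses on a Euclidean ball. Again purely illustrative with invented names (online_gradient_descent, grad_fns); the step size eta_t = radius / (G * sqrt(t)) is the usual choice behind the O(radius * G * sqrt(T)) regret bound, up to constants.

import numpy as np

def online_gradient_descent(grad_fns, x0, radius, G):
    """Projected Online Gradient Descent on the Euclidean ball of given radius.

    grad_fns : list of callables; grad_fns[t](x) is the gradient of the
               convex loss revealed at round t.
    x0       : starting point inside the ball.
    G        : bound on the gradient norms (enters the step size).
    Illustrative sketch only.
    """
    x = np.array(x0, dtype=float)
    iterates = [x.copy()]
    for t, grad in enumerate(grad_fns, start=1):
        eta = radius / (G * np.sqrt(t))
        x = x - eta * grad(x)                  # gradient step
        norm = np.linalg.norm(x)
        if norm > radius:                      # project back onto the ball
            x *= radius / norm
        iterates.append(x.copy())
    return iterates

# Example: a sequence of quadratic losses f_t(x) = ||x - z_t||^2 / 2.
rng = np.random.default_rng(1)
targets = rng.uniform(-1, 1, size=(100, 3))
grads = [lambda x, z=z: x - z for z in targets]
print(online_gradient_descent(grads, x0=np.zeros(3), radius=2.0, G=4.0)[-1])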
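Finally, a minimal EXP3 sketch for the adversarial bandit setting: only the loss of the pulled arm is observed, and an importance-weighted loss estimate is fed into exponential weights. This is again an illustrative reconstruction with invented names (exp3, loss_fn), not the exact algorithm statement used in the lectures.

import numpy as np

def exp3(loss_fn, K, T, eta, seed=0):
    """EXP3 for the adversarial K-armed bandit with losses in [0, 1].

    loss_fn(t, arm) returns the loss of the pulled arm at round t; only
    that single loss is observed. Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    weights = np.ones(K)
    pulls = []
    for t in range(T):
        p = weights / weights.sum()
        arm = rng.choice(K, p=p)               # sample an arm from p
        loss = loss_fn(t, arm)
        estimate = np.zeros(K)
        estimate[arm] = loss / p[arm]          # unbiased importance-weighted loss estimate
        weights *= np.exp(-eta * estimate)
        weights /= weights.max()               # rescale for numerical stability (does not change p)
        pulls.append(arm)
    return pulls

# Toy run: arm 0 has smaller expected loss, so it should dominate the pulls.
env = np.random.default_rng(1)
pulls = exp3(lambda t, a: env.random() * (0.5 if a == 0 else 1.0),
             K=3, T=2000, eta=np.sqrt(2 * np.log(3) / (2000 * 3)))
print(np.bincount(pulls, minlength=3))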
Rademacher Complexity & Covering Numbers - Chapters 26 and 27
Proofs about this part of the course are sometimes difficult and we will not ask you to reproduce proofs from the book (except very simple ones). However, we do ask you to read and understand the proofs since that helps in getting the right insights for the questions we do ask - which are similar to (but not the same as!) what you were asked in the homework exercises.
- Sections 26.1-26.2: you should read & understand both sections completely, except the proof of Thm 26.5 and Lemma 26.3 (McDiarmid).
- Sections 26.3, 26.4 and 26.5 can be skipped.
- Section 27.1: read & understand completely. Section 27.2: read and understand the statement of Lemma 27.4 (bound on Rademacher complexity in terms of covering numbers); no need to read or understand the proof.
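As a reminder while reading Chapter 26 (this restates the standard definition; the notation in [1] may differ slightly): the empirical Rademacher complexity of a function class F on a sample S = (z_1, ..., z_m) is
\[
  \widehat{\mathcal{R}}_S(\mathcal{F})
  \;=\;
  \mathbb{E}_{\sigma}\!\left[\sup_{f \in \mathcal{F}} \frac{1}{m} \sum_{i=1}^{m} \sigma_i\, f(z_i)\right],
  \qquad \sigma_1,\dots,\sigma_m \ \text{i.i.d.\ uniform on } \{-1,+1\}.
\]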
We will not ask you about VC dimension or PAC learning. That was covered by the mid-term. We will also not ask you about the tensor decompositions bonus lecture.
Material
We will make use of the following sources: