**When**

14-02-2023 and 15-02-2023

**Where**

Amsterdam Science Park Congress Centre (Euler room)

Science Park 123, Amsterdam

The Machine Learning Theory semester programme starts with a boot camp to provide a solid foundation for all participants.

The ML Theory semester programme runs in Spring 2023.

This two-day boot camp is intended for PhD students in ML theory. We will have one and a half days of tutorials by researchers, one afternoon of lectures by international keynote speakers, a poster session, a joint dinner, and plenty of time for interaction.

Attending the boot camp is free after registration.

The boot camp takes place at the Amsterdam Science Park Congress Centre. This is located at Science Park 123, to the left of the CWI main entrance. The tutorials are in the *Euler room*, the lectures are in the *Turing room*, and the lunches and Tuesday poster session are in the *Newton room*. The Tuesday dinner is at Maslow, Carolina MacGillavrylaan 3198, within walking distance of CWI.

Tim van Erven

Associate professor at the Korteweg-de Vries Institute for Mathematics at the University of Amsterdam in the Netherlands.

**Abstract:** Since most machine learning systems are not inherently interpretable,
explainable machine learning tries to generate explanations that
communicate relevant aspects of their internal workings. This is a
relatively young subfield, which is generating a lot of excitement, but
it is proving very difficult to lay down proper foundations: What is a
good explanation? When can we trust explanations? Most of the work in
this area is based on empirical evaluation, but recently the first
formal mathematical results have started to appear. In
this tutorial, I will introduce the topic, and then highlight several
formal results of interest.

Johannes Schmidt-Hieber

Professor of statistics at the University of Twente.

**Abstract:** Recently, a lot of progress has been made in the theoretical understanding of deep neural networks. One very promising direction is the statistical approach, which interprets deep learning as a statistical method and builds on existing techniques in mathematical statistics to derive theoretical error bounds and to understand phenomena such as overparametrization. The talk surveys this field and describes future challenges.

Ronald de Wolf

Researcher at the Algorithms and Complexity group of CWI
(Dutch Centre for Mathematics and Computer Science) and part-time full professor at the ILLC of the University of Amsterdam.

**Abstract:** Machine learning can be enhanced by quantum computing, both by allowing quantum data and by having quantum speed-ups for the optimization process that finds a good model for given data. This tutorial will give an introduction to quantum computing (what are quantum algorithms? what do we know about them?) and then examine how they can help machine learning.

Ivo Stoepker

PhD student at TU Eindhoven.

**Abstract:** Anomaly detection when observing a large number of data streams is essential in a variety of applications, ranging from epidemiological studies to monitoring of complex systems. High-dimensional scenarios are usually tackled with scan statistics and related methods, requiring stringent modeling assumptions for proper test calibration. In this tutorial we discuss ways to drop these stringent assumptions, while still ensuring essentially optimal performance. We take a non-parametric stance, and introduce two variants of the higher criticism test that do not require knowledge of the null distribution for proper calibration. In the first variant we calibrate the test by permutation, while in the second variant we use a rank-based approach. Both methodologies result in exact tests in finite samples, and showcase the analytical tools needed for the study of these types of resampling approaches. Our permutation methodology is applicable when observations within null streams are independent and identically distributed, and we show this methodology is asymptotically optimal in the wide class of exponential models. Our rank-based methodology is more flexible, and only requires observations within null streams to be independent. We provide an asymptotic characterization of the power of the test in terms of the probability of mis-ranking null observations, showing that the asymptotic power loss (relative to an oracle test) is minimal for many common models. As the proposed statistics do not rely on asymptotic approximations, they typically perform better than popular variants of higher criticism relying on such approximations. We demonstrate the use of these methodologies when monitoring the daily number of COVID-19 cases in the Netherlands. (Based on joint work with Rui Castro, Ery Arias-Castro, and Edwin van den Heuvel.)
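For readers new to the idea of calibrating a test by permutation, the following is a minimal sketch and not the procedure presented in the tutorial. It assumes a simplified setting: the classical higher criticism statistic is computed from normal-approximation p-values of the stream means, and its null distribution is estimated by permuting all observations across streams. The function names `higher_criticism` and `permutation_p_value`, the clipping constant, and the simulated data are illustrative assumptions.

```python
import math
import numpy as np

# Element-wise complementary error function without requiring SciPy.
_erfc = np.vectorize(math.erfc)


def higher_criticism(streams):
    """Classical higher criticism statistic from per-stream means.

    `streams` has shape (n_streams, n_observations). This is a simplified
    stand-in for the statistics discussed in the abstract.
    """
    n, t = streams.shape
    # Standardise stream means with the pooled mean and standard deviation;
    # both are unchanged when observations are permuted across streams.
    z = math.sqrt(t) * (streams.mean(axis=1) - streams.mean()) / streams.std()
    # One-sided normal-approximation p-values, clipped away from 0 and 1.
    p = np.clip(0.5 * _erfc(z / math.sqrt(2.0)), 1e-12, 1.0 - 1e-12)
    p = np.sort(p)
    i = np.arange(1, n + 1)
    hc = math.sqrt(n) * (i / n - p) / np.sqrt(p * (1.0 - p))
    # Maximise over the smaller half of the ordered p-values.
    return float(hc[: max(1, n // 2)].max())


def permutation_p_value(streams, n_perm=199, seed=0):
    """Calibrate the statistic by permuting observations across streams."""
    rng = np.random.default_rng(seed)
    observed = higher_criticism(streams)
    flat = streams.ravel()
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(flat).reshape(streams.shape)
        if higher_criticism(permuted) >= observed:
            exceed += 1
    # Adding 1 to numerator and denominator keeps the p-value valid.
    return (1 + exceed) / (1 + n_perm)


# Simulated example (assumption): 200 streams, 10 of them with a shifted mean.
rng = np.random.default_rng(1)
data = rng.standard_normal((200, 20))
data[:10] += 3.0
p_value = permutation_p_value(data)
print(p_value)
```

Because the p-value is computed from the permutation distribution itself, the normal approximation inside the statistic affects only power, not validity, as long as the observations are exchangeable under the null.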

Mathias Staudigl

Associate Professor for Multi-Agent Optimization at the Department of Advanced Computing Sciences (DACS) at Maastricht University.

**Abstract:** Game theory is a powerful methodological tool to mathematically formalize and study strategic optimization problems between self-interested agents. Historically, game-theoretic models played a fundamental role in economics and operations research as a qualitative model for economic decision making. However, given its intimate connection with the theory of variational inequalities, a more quantitative line of research quickly emerged in order to numerically compute equilibria in large games. More recently, game theory has come to play a significant role in machine learning and AI in order to generate robust predictions (in the sense of the theory of Min-Max) and deep learning architectures (GANs). In the first part of this lecture, I am going to summarize a unified approach for learning in games based on regularization techniques and variational analysis. We will stress the connection between learning algorithms and dynamical system methods. Recent connections with non-stationary regret measures will also be discussed. The second part will focus on splitting-based algorithms that have been designed for convergence to a class of Nash equilibrium points in more general settings where players' decisions are subject to coupling constraints.

Gabriele Cesa

PhD student at the Amsterdam Machine Learning Lab (AMLab) with Max Welling and a Research Associate at Qualcomm AI Research with Taco Cohen and Arash Behboodi.

**Abstract:** In deep learning and computer vision, it is common for data to present some symmetries. For instance, histopathological scans and satellite images can appear in any rotation. Examples in 3D include protein structures (which have arbitrary orientation) or natural scenes (where objects can freely rotate around their Z axis). Equivariance is becoming an increasingly popular design choice for building data-efficient neural networks by exploiting prior knowledge about the symmetries of the problem at hand.

In this tutorial, we will cover the mathematical foundations of group equivariant neural networks. In addition, we will introduce Steerable CNNs as a general and efficient framework to implement equivariant networks by relying on tools from group representation theory.
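As a small illustration of what equivariance means, the sketch below checks it numerically for the simplest case: 90-degree rotations of a square image. It is not taken from the tutorial and does not implement Steerable CNNs. It assumes periodic boundary conditions and a kernel made rotation-invariant by averaging over its four rotations, and the helper names `circular_correlate` and `symmetrize_c4` are hypothetical.

```python
import numpy as np


def circular_correlate(image, kernel):
    """2D cross-correlation with periodic boundaries (odd, square kernel)."""
    radius = kernel.shape[0] // 2
    out = np.zeros(image.shape, dtype=float)
    for di in range(-radius, radius + 1):
        for dj in range(-radius, radius + 1):
            # After the roll, shifted[i, j] == image[i + di, j + dj] (mod size).
            shifted = np.roll(image, shift=(-di, -dj), axis=(0, 1))
            out += kernel[di + radius, dj + radius] * shifted
    return out


def symmetrize_c4(kernel):
    """Average a kernel over the four 90-degree rotations (the group C4)."""
    return sum(np.rot90(kernel, k) for k in range(4)) / 4.0


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = symmetrize_c4(rng.standard_normal((3, 3)))

# Equivariance: rotating the input and then filtering gives the same result
# as filtering first and then rotating the output.
rotate_then_filter = circular_correlate(np.rot90(image), kernel)
filter_then_rotate = np.rot90(circular_correlate(image, kernel))
is_equivariant = np.allclose(rotate_then_filter, filter_then_rotate)
print(is_equivariant)  # True
```

Restricting the kernel to be rotation-invariant is the most constrained way to obtain equivariance. The steerable framework mentioned in the abstract is more general, allowing richer kernels whose outputs transform in a prescribed way under the group.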

Frans Oliehoek

Associate Professor at the Interactive Intelligence group at TU Delft.

**Abstract:** In recent years, we have seen exciting breakthroughs in the field of 'reinforcement learning'. In this talk, I will give a very basic introduction to this general field, with a focus on clarifying some of the terminology. With this I will cover some of the foundations of RL, the intuition behind the state of the art, and an overview of some of the main challenges for the future.

Jaron Sanders

Development Track Assistant Professor at the Eindhoven University of Technology.

**Abstract:** Motivated by theoretical advancements in dimensionality reduction techniques, we have used a recent model called Block Markov Chains (BMCs) to conduct a practical study of clustering in real-world sequential data. New clustering algorithms for BMCs possess theoretical optimality guarantees and can be deployed in sparse data regimes. We ultimately found that the BMC model assumption can indeed produce meaningful insights in exploratory data analyses despite the complexity and sparsity of real-world data.

Based on the study mentioned above, I will introduce you to the idea of dimensionality reduction via clustering; to methods for determining clusters, in time series in particular; to theoretical properties of clustering that we can compare with; and to tools that can help you evaluate clusters that you may find in data. I will also point you to an efficient implementation of our clustering algorithm and the evaluation tools for BMCs that we have made available. All in all, my talk might just help you discover hidden latent spaces in your own time series of interest.