Program Year
Required introductory courses in Bayesian statistics
-
Introduction to R
Lecturer :
ROBIN RYDER
Total hours : 3
-
Introduction to Bayesian statistics
Lecturer :
CHRISTIAN ROBERT
Total hours : 3
-
A review of probability theory foundations
Ects : 6
Lecturer :
PAUL GASSIAT
Total hours : 15
Overview :
- Random variables, expectations, laws, independence
- Inequalities and limit theorems, uniform integrability
- Conditioning, Gaussian random vectors
- Bounded variation and Lebesgue-Stieltjes integral
- Stochastic processes, stopping times, martingales
- Brownian motion: martingales, trajectories, construction
- Wiener stochastic integral and Cameron-Martin formula
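As a small illustration of the last items above (not part of the official syllabus; the step count and horizon are arbitrary choices), a minimal Python/NumPy sketch of Brownian motion built from independent Gaussian increments:

```python
# A minimal sketch of Brownian motion on [0, T] from Gaussian increments.
import numpy as np

rng = np.random.default_rng(7)
n, T = 10000, 1.0                                  # number of steps, time horizon
dt = T / n
increments = rng.normal(0.0, np.sqrt(dt), size=n)  # independent Gaussian increments
B = np.concatenate([[0.0], np.cumsum(increments)]) # B_0 = 0, B_{t+dt} = B_t + dB

print("B_1 (should look like N(0,1)):", B[-1])
print("quadratic variation (close to 1):", np.sum(increments ** 2))
```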
-
Introduction to Python
Lecturer :
DAVID GONTIER
Total hours : 3
Core courses
-
Optimization for machine learning
Ects : 3
Lecturer :
Clement ROYER
Total hours : 18
Overview :
Optimization is at the heart of most recent advances in machine learning. Indeed, it not only plays a major role in linear regression, SVM and kernel methods, but it is also the key to the recent explosion of deep learning for supervised and unsupervised problems in imaging, vision and natural language processing. This course will review the mathematical foundations, the underlying algorithmic methods and showcase modern applications of a broad range of optimization techniques.
The course will be composed of classical lectures and numerical sessions in Python. It will begin with the basic components of smooth optimization (optimality conditions, gradient-type methods), then move to methods that are particularly relevant in a machine learning setting such as the celebrated stochastic gradient descent algorithm and its variants. More advanced algorithms related to non-smooth and constrained optimization, that encompass known characteristics of learning problems such as the presence of regularizing terms, will also be described. During lab sessions, the various algorithms studied during the lectures will be implemented and tested on real and synthetic datasets: these sessions will also address several practical features of optimization codes such as automatic differentiation, and built-in optimization routines within popular machine learning libraries such as PyTorch.
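As an illustration of the kind of algorithm covered (not course material; the data and step size are made up), a minimal NumPy sketch of stochastic gradient descent for least-squares regression:

```python
# Minimal sketch of SGD for least squares on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

w = np.zeros(d)
step = 0.01
for epoch in range(20):
    for i in rng.permutation(n):           # pass over shuffled samples
        grad_i = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (x_i^T w - y_i)^2
        w -= step * grad_i

print("estimation error:", np.linalg.norm(w - w_true))
```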
Recommended prerequisites :
Fundamentals of linear algebra and real analysis. Experience with Python programming.
Learning outcomes :
- Identify the characteristics of an optimization problem given its formulation.
- Know the theoretical and practical properties of the most popular optimization techniques.
- Find the best optimization algorithm to handle a particular feature of a machine learning problem.
Assessment :
Written exam.
Bibliography-recommended reading
Theory and algorithms:
- Convex Optimization, Boyd and Vandenberghe
- Introduction to Matrix Numerical Analysis and Optimization, Philippe Ciarlet
- Proximal Algorithms, N. Parikh and S. Boyd
- Introduction to Nonlinear Optimization: Theory, Algorithms and Applications, Amir Beck
Numerics:
- Python and Jupyter installation: use only Python 3 with the Anaconda distribution
- The Numerical Tours of Signal Processing, Gabriel Peyré
- Scikit-learn tutorial #1 and Scikit-learn tutorial #2, Fabian Pedregosa and Jake VanderPlas
- Reverse-mode automatic differentiation: a tutorial
- Convolutional Neural Networks for Visual Recognition
- Christopher Olah, blog
-
Optimization
Ects : 6
Lecturer :
ANTONIN CHAMBOLLE
Total hours : 24
Overview :
Optimization is at the heart of most recent advances in machine learning. It underlies the most basic methods (linear regression, SVM and kernel methods) and is also the key to the recent explosion of deep learning, the state-of-the-art approach for supervised and unsupervised problems in imaging, vision and natural language processing. This course will review the mathematical foundations and the underlying algorithmic methods, and showcase some modern applications of a broad range of optimization techniques. The course will be composed of both classical lectures and numerical sessions in Python. The first part covers the basic methods of smooth optimization (gradient descent) and convex optimization (optimality conditions, constrained optimization, duality). The second part features more advanced methods (non-smooth optimization, SDP programming, interior-point and proximal methods). The last part covers large-scale methods (stochastic gradient descent), automatic differentiation (using modern Python frameworks) and their application to neural networks (shallow and deep nets).
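For illustration only (synthetic data, arbitrary regularization parameter, not taken from the course), a minimal NumPy sketch of the proximal gradient method (ISTA) for the lasso, an example of the non-smooth/proximal methods mentioned above:

```python
# Minimal sketch of ISTA for min_w 0.5*||Xw - y||^2 + lam*||w||_1.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 50))
w_true = np.zeros(50)
w_true[:5] = 3.0                                 # sparse ground truth
y = X @ w_true + 0.1 * rng.normal(size=200)

lam = 1.0
L = np.linalg.norm(X, 2) ** 2                    # Lipschitz constant of the gradient
w = np.zeros(50)
for _ in range(500):
    grad = X.T @ (X @ w - y)                     # gradient of the smooth part
    z = w - grad / L                             # gradient step
    w = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-thresholding (prox of l1)

print("non-zero coefficients:", np.sum(np.abs(w) > 1e-6))
```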
Learning outcomes :
The objective of this course is to learn how to recognize, manipulate and solve a relatively large class of emerging convex problems in areas such as learning, finance or signal processing.
-
High-dimensional statistics
Ects : 4
Lecturer :
VINCENT RIVOIRARD
MARC HOFFMANN
Total hours : 18
Overview :
- Curse of dimensionality and the sparsity assumption for Gaussian regression, generalized linear models and count data
- Wavelets and thresholding estimators
- Model choice and variable selection
- Estimation by convex penalization: ridge, lasso, group-lasso procedures, etc.
- Links with the Bayesian approach
- Aggregation methods
- Multiple testing: FDR and FWER procedures
- Matrix estimation
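As an illustration of the multiple-testing topic above (not course material; the simulated p-values are arbitrary), a minimal NumPy sketch of the Benjamini-Hochberg FDR procedure:

```python
# Minimal sketch of the Benjamini-Hochberg procedure on simulated p-values.
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Return a boolean mask of rejected hypotheses at FDR level alpha."""
    m = len(pvals)
    order = np.argsort(pvals)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = pvals[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()           # largest index meeting the threshold
        rejected[order[:k + 1]] = True
    return rejected

# simulated p-values: 20 true signals among 200 hypotheses
rng = np.random.default_rng(2)
p = np.concatenate([rng.uniform(size=180), rng.beta(0.1, 5.0, size=20)])
print("rejections:", benjamini_hochberg(p).sum())
```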
Learning outcomes :
The objective of this statistics course is to present the mathematical tools and methodologies for situations in which the number of parameters to infer is very large, typically much larger than the number of observations.
-
Graphical models
Ects : 4
Lecturer :
FABRICE ROSSI
Total hours : 18
Learning outcomes :
Probabilistic modelling, learning and inference with graphical models. The main topics covered are:
- Maximum likelihood
- Linear regression
- Logistic regression
- Mixture models, clustering
- Graphical models
- Exponential families
- Sum-product algorithm
- Hidden Markov models
- Approximate inference
- Bayesian methods
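For illustration only (made-up parameters, not course material), a minimal NumPy sketch of the forward algorithm for a hidden Markov model, a special case of the sum-product algorithm on a chain:

```python
# Minimal sketch of the HMM forward algorithm (sum-product on a chain).
import numpy as np

A = np.array([[0.9, 0.1],        # transition matrix P(z_t | z_{t-1})
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],        # emission matrix P(x_t | z_t), 2 symbols
              [0.1, 0.9]])
pi = np.array([0.5, 0.5])        # initial distribution

obs = [0, 1, 1, 0, 1]            # observed symbol sequence

alpha = pi * B[:, obs[0]]        # forward message alpha_t(z) = P(x_1..x_t, z_t = z)
for x in obs[1:]:
    alpha = (alpha @ A) * B[:, x]

print("likelihood of the sequence:", alpha.sum())
```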
-
Advanced learning
Ects : 4
Total hours : 23
Overview :
- Typology of learning problems (supervised vs. unsupervised)
- Statistical model for binary classification: generative vs. discriminative approaches
- Classical algorithms: parametric methods, perceptron, partitioning methods
- Performance criteria: classification error, ROC curve, AUC
- Convexification of the risk: boosting-type algorithms and SVM
- Combinatorial complexity measures, geometric metrics
- Model selection and regularization
- Consistency theorems and convergence rates
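As a small illustration of the AUC criterion above (simulated scores, not course material), a minimal NumPy sketch estimating the AUC as the probability that a positive example is scored above a negative one:

```python
# Minimal sketch of the AUC as a pairwise ranking probability.
import numpy as np

rng = np.random.default_rng(3)
scores_pos = rng.normal(1.0, 1.0, size=500)   # classifier scores on positives
scores_neg = rng.normal(0.0, 1.0, size=500)   # classifier scores on negatives

# AUC = P(score_+ > score_-), estimated over all pairs
auc = np.mean(scores_pos[:, None] > scores_neg[None, :])
print("empirical AUC:", auc)
```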
Learning outcomes :
Mathematical foundations for modelling supervised learning problems and for the analysis of high-dimensional classification algorithms.
Elective courses - choose 5 from the following:
-
Optimal transport
Ects : 4
Lecturer :
GABRIEL PEYRE
Total hours : 18
Overview :
Optimal transport (OT) is a fundamental mathematical theory at the interface between optimization, partial differential equations and probability. It has recently emerged as an important tool to tackle a surprisingly large range of problems in data sciences, such as shape registration in medical imaging, structured prediction problems in supervised learning and training deep generative networks. This course will interleave the description of the mathematical theory with the recent developments of scalable numerical solvers. This will highlight the importance of recent advances in regularized approaches for OT which allow one to tackle high dimensional learning problems.
The course will feature numerical sessions using Python.
- Motivations, basics of probabilistic modeling and matching problems.
- Monge problem, 1D case, Gaussian distributions.
- Kantorovitch formulation, linear programming, metric properties.
- Schrödinger problem, Sinkhorn algorithm (see the sketch after this list).
- Duality and c-transforms, Brenier’s theory, W1, generative modeling.
- Semi-discrete OT, quantization, Sinkhorn dual and divergences
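For illustration only (random histograms, arbitrary regularization, not course material), a minimal NumPy sketch of the Sinkhorn algorithm for entropically regularized optimal transport between two discrete distributions:

```python
# Minimal sketch of Sinkhorn iterations for regularized optimal transport.
import numpy as np

rng = np.random.default_rng(4)
n, m, eps = 50, 60, 0.05
x, y = rng.uniform(size=n), rng.uniform(size=m)   # supports on [0, 1]
a = np.full(n, 1.0 / n)                           # source histogram
b = np.full(m, 1.0 / m)                           # target histogram
C = (x[:, None] - y[None, :]) ** 2                # squared-distance cost matrix
K = np.exp(-C / eps)                              # Gibbs kernel

u, v = np.ones(n), np.ones(m)
for _ in range(500):                              # Sinkhorn iterations
    u = a / (K @ v)
    v = b / (K.T @ u)

P = u[:, None] * K * v[None, :]                   # regularized transport plan
print("transport cost:", np.sum(P * C))
```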
-
Computational methods and MCMC
Ects : 4
Lecturer :
CHRISTIAN ROBERT
Total hours : 18
Overview :
- Motivations
- Monte-Carlo methods
- Markov chain reminders
- The Metropolis-Hastings method
- The Gibbs sampler
- Perfect sampling
- Sequential Monte-Carlo methods
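For illustration only (the target and proposal are arbitrary choices, not course material), a minimal NumPy sketch of a random-walk Metropolis-Hastings sampler targeting a standard normal distribution:

```python
# Minimal sketch of random-walk Metropolis-Hastings for a N(0, 1) target.
import numpy as np

rng = np.random.default_rng(5)

def log_target(x):
    return -0.5 * x ** 2          # log-density of N(0, 1), up to a constant

x, samples = 0.0, []
for _ in range(10000):
    prop = x + rng.normal(scale=1.0)              # symmetric random-walk proposal
    if np.log(rng.uniform()) < log_target(prop) - log_target(x):
        x = prop                                  # accept; otherwise keep current state
    samples.append(x)

samples = np.array(samples[2000:])                # discard burn-in
print("mean:", samples.mean(), "variance:", samples.var())
```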
Learning outcomes :
This course aims at presenting the basics and recent developments of simulation methods used in statistics and especially in Bayesian statistics. Methods for computation, maximization and high-dimensional integration have indeed become necessary to deal with the complex models envisaged in the disciplines that make use of statistics, such as econometrics, finance, genetics, ecology or epidemiology (among others!). The main innovation of the last ten years is the introduction of Markovian techniques for the approximation of probability distributions (and the corresponding integrals). These techniques thus form the central part of the course, but we will also cover particle systems and stochastic optimization methods such as simulated annealing.
-
Applied Bayesian statistics
Ects : 4
Lecturer :
ROBIN RYDER
Total hours : 18
Overview :
We shall put in practice classical models for statistical inference in a Bayesian setting, and implement computational methods. Using real data, we shall study various models such as linear regression, capture-recapture, and a hierarchical model. We shall discuss issues of model building and validation, the impact of the choice of prior, and model choice via Bayes factors. The implementation shall use several algorithms: Markov chain Monte Carlo, importance sampling, and Approximate Bayesian Computation. The course is based on the free software R. Practical information: large portions of the course are devoted to students coding. Students should bring their own laptop, which must have R installed before the first session; I strongly suggest installing RStudio (free) as well.
Required prerequisites :
Knowledge of the programming language R is essential, as well as an introduction to Bayesian inference.
Learning outcomes :
Modelling and inference in a Bayesian setting
-
Bayesian nonparametrics and Bayesian machine learning
Ects : 4
Lecturer :
JULYAN ARBEL
Total hours : 18
-
Mixing times of Markov chains
Ects : 6
Lecturer :
JUSTIN SALEZ
Total hours : 24
Overview :
How many times must one shuffle a deck of 52 cards for the resulting random permutation to be approximately uniformly distributed? This course is an introduction, with no prerequisites, to the modern theory of mixing times of Markov chains. Particular attention will be paid to the celebrated "cutoff" phenomenon, a remarkable phase transition in the convergence of certain chains to their stationary distribution. The tools covered will include coupling techniques, spectral analysis, the isoperimetric profile, and functional inequalities of Poincaré type. As illustrations, these methods will be applied to various classical examples from a variety of contexts: card shuffling, random walks on groups, interacting particle systems, Metropolis-Hastings algorithms, etc. Substantial attention will be given to walks on graphs and networks, which are today at the heart of Internet exploration algorithms and are massively used by search engines for data collection and page ranking.
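For illustration only (a toy example, not course material), a minimal NumPy sketch computing the total-variation distance to stationarity, and the resulting mixing time, for the lazy random walk on a cycle:

```python
# Minimal sketch: total-variation mixing of the lazy random walk on a cycle.
import numpy as np

n = 10
P = np.zeros((n, n))
for i in range(n):
    P[i, i] = 0.5                                  # laziness
    P[i, (i - 1) % n] = 0.25
    P[i, (i + 1) % n] = 0.25

pi = np.full(n, 1.0 / n)                           # stationary distribution
mu = np.zeros(n)
mu[0] = 1.0                                        # start at a single state

for t in range(1, 501):
    mu = mu @ P
    tv = 0.5 * np.abs(mu - pi).sum()               # total-variation distance
    if tv < 0.25:                                  # standard mixing threshold
        print("mixing time (tv < 1/4):", t)
        break
```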
Learning outcomes :
For more information: www.ceremade.dauphine.fr/~salez/mix.html
Bibliography-recommended reading
- Lecture notes, 2019 exam and solutions (J. Salez)
- Markov Chains and Mixing Times (D. Levin, Y. Peres & E. Wilmer)
- Mathematical Aspects of Mixing Times in Markov Chains (R. Montenegro & P. Tetali)
- Mixing Times of Markov Chains: Techniques and Examples (N. Berestycki)
- Reversible Markov Chains and Random Walks on Graphs (D. Aldous & J. Fill)
-
Object recognition and computer vision
Ects : 4
-
Journalism and data
Ects : 4
Lecturer :
ROBIN RYDER
Total hours : 18
Overview :
The aim of this course is to set up an interaction between mathematics students and journalism students, in collaboration with the Institut Pratique du Journalisme. After presentations by two professionals, the students will form groups of 2 to 4 (mixing M2 MASH and M2 IPJ students) to analyse large datasets independently. They will have to clean the data, find a research question, propose and validate relevant models, carry out mathematical analyses, choose an angle, design data visualizations, and write a report accessible to the general public in the form of a press article.
-
Natural language processing
Ects : 4
Lecturer :
ALEXANDRE ALLAUZEN
-
Reinforcement learning
Ects : 4
-
Evaluation of public policies
Ects : 4
Lecturer :
BRIGITTE DORMONT
-
Kernel methods for machine learning
Ects : 4
Lecturer :
MICHAEL ARBEL
Total hours : 18
Overview :
- Reproducing kernel Hilbert spaces and the "kernel trick"
- Representer theorem
- Kernel PCA
- Kernel ridge regression
- Support vector machines
- Kernels on semigroups
- Kernels for text, graphs, etc.
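For illustration only (synthetic 1D data, arbitrary bandwidth and regularization, not course material), a minimal NumPy sketch of kernel ridge regression with a Gaussian kernel, where, by the representer theorem, the prediction is a combination of kernels centred at the training points:

```python
# Minimal sketch of kernel ridge regression with a Gaussian kernel.
import numpy as np

rng = np.random.default_rng(6)
X = rng.uniform(-3, 3, size=40)
y = np.sin(X) + 0.1 * rng.normal(size=40)

def gauss_kernel(a, b, bandwidth=1.0):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * bandwidth ** 2))

lam = 0.1
K = gauss_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)   # solve (K + lam I) alpha = y

X_test = np.linspace(-3, 3, 5)
y_pred = gauss_kernel(X_test, X) @ alpha               # f(x) = sum_i alpha_i k(x, x_i)
print(np.round(y_pred, 2))
```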
Learning outcomes :
Present the theoretical foundations and applications of kernel methods in machine learning.
Research thesis
Academic Training Year 2023 - 2024 - subject to modification
Teaching Modalities
The program starts in September and attendance is required.
The program consists of a block of six required core courses in statistical machine learning. Students must pass four electives, including at least one from each unit, as well as a required internship of at least four months in duration in a company or research center.
Course information:
- 24 courses and two required introductory courses in Bayesian statistics are offered: 16 at Paris Dauphine-PSL and 8 at ENS or MINES
- All courses correspond to 4 ECTS credits, except for the two introductory courses, which are 0 credit courses
- A student must pass 10 courses (the equivalent of 40 ECTS credits) including six required core courses and four electives.
- Attendance is required at all courses in which a student is registered, and any absences will negatively affect the final grade.
- If a student receives a final grade of at least 10/20 on the research thesis, they are considered to have passed the thesis and will receive 20 ECTS credits.
Internships and Supervised Projects
Students can choose between an internship suggested by a faculty member, an internship featured at the “Internship Fair,” or a different internship with the prior approval of the Master's program director. The internship must take place after enrollment in the Master's degree program. It should pose a solid research question and present an opportunity for the practical application of one of the themes examined over the course of the Master's program.
It should last at least four months, from April to September of the academic year in which it is to be taken. Except in very rare, preapproved cases, the internship must conclude by the end of September at the latest.
Research-driven Programs
Training courses are developed in close collaboration with Dauphine's world-class research programs, which ensure high standards and innovation.
Research is organized around 6 disciplines all centered on the sciences of organizations and decision making.
Learn more about research at Dauphine