ECTS: 6

Contact hours: 36

Coefficient: 6

Learning outcomes:

After attending the course, students will:

have a good understanding of the algorithmic and statistical foundations of the main machine learning techniques

be able to select machine learning techniques adapted to a particular task (exploratory analysis with clustering methods, predictive analysis, etc.)

be able to design a model selection procedure adapted to a particular task

be able to report the results of a machine learning project with a valid estimate of their model's performance

Assessment:

quizzes and tests during the course

machine learning project

Mandatory prerequisites:

intermediate level in either Python or R. Students are expected to be able to perform standard data management tasks in Python or R, including, but not limited to:

- loading a data set from a CSV file
- recoding and cleaning the data set
- implementing a simple data exploration strategy based on pivot tables and graphical representations

intermediate level in statistics and probability. Students are expected to be familiar with:

- descriptive statistics
- conditional probabilities and conditional expectations
- core results from statistics: bias and variance concepts, strong law of large numbers, central limit theorem, etc.
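
As a rough indication of the expected level, the data management tasks listed above could look like the following minimal Python sketch. It uses only the standard library. The data, the column names and the helper functions (`load_rows`, `clean_rows`, `pivot_mean`) are hypothetical and chosen for illustration; they are not course material. The graphical step is omitted because it would require a plotting library.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical data, inlined so the sketch is self-contained; in practice
# it would be read from a CSV file on disk with open(path, newline="").
RAW_CSV = """city,year,temperature
Paris,2020,12.5
Paris,2021,
Lyon,2020,13.0
Lyon,2021,14.0
paris,2021,13.5
"""


def load_rows(text):
    """Load a data set from CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Recode and clean: normalise city names, convert types, drop missing values."""
    cleaned = []
    for row in rows:
        if not row["temperature"].strip():
            continue  # drop rows with a missing measurement
        cleaned.append({
            "city": row["city"].strip().title(),  # "paris" -> "Paris"
            "year": int(row["year"]),
            "temperature": float(row["temperature"]),
        })
    return cleaned


def pivot_mean(rows, index, columns, values):
    """Build a pivot table of means: {index value: {column value: mean}}."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[index], row[columns])].append(row[values])
    table = defaultdict(dict)
    for (i, c), vals in cells.items():
        table[i][c] = mean(vals)
    return {i: dict(cols) for i, cols in table.items()}


rows = clean_rows(load_rows(RAW_CSV))
table = pivot_mean(rows, index="city", columns="year", values="temperature")
for city in sorted(table):
    print(city, table[city])
```

The same steps are usually shorter with a data frame library in Python or R; the standard library version is used here only to keep the sketch free of external dependencies.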

Course content:

The course gives a thorough presentation of the machine learning field and follows this outline:

general introduction to machine learning and to its focus on predictive performance (running example: k-nearest neighbours algorithm)

machine learning as automated program building from examples (running example: decision trees)

machine learning as optimization:

- empirical risk minimization
- links with maximum likelihood estimation
- surrogate losses and extended machine learning settings
- regularisation and kernel methods (support vector machines)

reliable estimation of performance:

- overfitting
- split samples
- resampling (leave-one-out, cross-validation and bootstrap)
- ROC curve, AUC and other advanced measures

combining models:

- ensemble techniques
- bagging and random forests
- boosting

unsupervised learning:

- clustering (hierarchical clustering, k-means and variants, mixture models, density clustering)
- outlier and anomaly detection
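
To give a flavour of two items from the outline above, the k-nearest neighbours running example and performance estimation by cross-validation, here is a minimal Python sketch using only the standard library. The toy data, the function names (`knn_predict`, `cross_val_accuracy`) and the candidate values of k are assumptions made for illustration and are not taken from the course material.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, point, k):
    """Predict the majority label among the k nearest training points."""
    by_distance = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], point)),
    )
    votes = Counter(train_y[i] for i in by_distance[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(xs, ys, k, n_folds=5, seed=0):
    """Estimate the accuracy of k-NN with n_folds-fold cross-validation."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::n_folds] for f in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        # Train only on points outside the current fold.
        train = [i for i in indices if i not in held_out]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        correct += sum(
            knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in fold
        )
    return correct / len(xs)


# Hypothetical toy data: two Gaussian blobs in the plane.
rng = random.Random(42)
xs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
xs += [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(50)]
ys = [0] * 50 + [1] * 50

# Model selection: compare candidate values of k on held-out folds only.
scores = {k: cross_val_accuracy(xs, ys, k) for k in (1, 3, 5, 7)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Note that the cross-validated score of the selected k is itself optimistically biased, because the same folds were used to choose k. A valid final performance estimate would need data that played no part in the selection, for example a separate test set.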

Course instructor:

- FABRICE ROSSI