Machine Learning
ECTS: 6
Course coordinator:
Teaching hours: 36
Course description:
The course gives a thorough presentation of the machine learning field and follows this outline:
- general introduction to machine learning and to its focus on predictive performance (running example: k-nearest neighbours algorithm)
- machine learning as automated program building from examples (running example: decision trees)
- machine learning as optimization:
  - empirical risk minimization
  - links with maximum likelihood estimation
  - surrogate losses and extended machine learning settings
  - regularisation and kernel methods (support vector machines)
- reliable performance estimation:
  - overfitting
  - sample splitting
  - resampling (leave-one-out, cross-validation and bootstrap)
  - ROC curve, AUC and other advanced measures
- combining models:
  - ensemble techniques
  - bagging and random forests
  - boosting
- unsupervised learning:
  - clustering (hierarchical clustering, k-means and variants, mixture models, density clustering)
  - outlier and anomaly detection
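As an informal illustration of two outline items, the k-nearest neighbours running example and leave-one-out resampling, here is a minimal sketch in plain Python. It is not course material: the function names (`knn_predict`, `leave_one_out_accuracy`) and the toy data are assumptions made for this example only.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k):
    """Estimate accuracy by holding out each point in turn and predicting it from the rest."""
    correct = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        if knn_predict(rest_x, rest_y, xs[i], k) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy data (invented for illustration): two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (6.0, 6.0)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

# Compare several neighbourhood sizes using the leave-one-out estimate.
for k in (1, 3, 7):
    print(k, leave_one_out_accuracy(points, labels, k))
```

On this toy data the estimate depends strongly on k: with k = 7 every held-out point is outvoted by the other group, which is the kind of effect a model selection procedure is meant to detect.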
Mandatory prerequisites:
- intermediate level in either Python or R. Students are expected to be able to perform standard data management tasks in Python or R, including, but not limited to:
  - loading a data set from a CSV file
  - recoding and cleaning the data set
  - implementing a simple data exploration strategy based on pivot tables and graphical representations
- intermediate level in statistics and probability. Students are expected to be familiar with:
  - descriptive statistics
  - conditional probabilities and conditional expectations
  - core results from statistics: bias and variance concepts, strong law of large numbers, central limit theorem, etc.
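For self-assessment, the data management tasks listed above (loading a CSV, recoding and cleaning, building a pivot table) could look roughly like the following sketch. It uses only the Python standard library so that it runs as is; the inlined data, the column names and the helper names (`load_rows`, `clean`, `pivot_sum`) are hypothetical, and the graphical part of the exploration is left out.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV content, inlined so the example runs without an external file.
RAW = """city,segment,amount
Paris,retail,120
paris,retail,80
Lyon,wholesale,
Lyon,retail,50
Paris,wholesale,200
"""


def load_rows(text):
    """Load CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Recode city names to a single spelling and drop rows with a missing amount."""
    cleaned = []
    for row in rows:
        if row["amount"].strip() == "":
            continue  # missing value: drop the row
        cleaned.append({
            "city": row["city"].strip().title(),
            "segment": row["segment"].strip(),
            "amount": float(row["amount"]),
        })
    return cleaned


def pivot_sum(rows, index, column, value):
    """Build a pivot table {index value: {column value: sum of `value`}}."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[index]][row[column]] += row[value]
    return {key: dict(cols) for key, cols in table.items()}


rows = clean(load_rows(RAW))
pivot = pivot_sum(rows, "city", "segment", "amount")
print(pivot)
```

In practice the same steps are often written with a data frame library in Python or R; the standard-library version is used here only to keep the sketch self-contained.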
Learning outcomes:
After attending the course, students will:
- have a good understanding of the algorithmic and statistical foundations of the main machine learning techniques
- be able to select machine learning techniques adapted to a particular task (exploratory analysis with clustering methods, predictive analysis, etc.)
- be able to design a model selection procedure adapted to a particular task
- be able to report the results of a machine learning project with a valid estimate of their model's performance
Assessment:
- quizzes and tests during the course
- machine learning project