
Python for data science

ECTS: 3

Course instructor:

  • MOHAMED KHALIL EL MAHRSI

Contact hours: 18

Course description:

The course is organised as follows.

 

1 - Introduction to Python Programming

 

This first part introduces the fundamentals of Python programming. It covers working with basic built-in types (numbers, strings, booleans, etc.), control flow statements, writing reusable code with functions, handling the errors and exceptions that can occur when Python code runs, and more advanced data structures (lists, sets, dictionaries, etc.).
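
As a rough illustration of how these topics fit together, here is a minimal sketch. It is not taken from the course material: the function name `safe_mean` and the grade data are made up for the example.

```python
def safe_mean(values):
    """Return the arithmetic mean of `values`, or None if the list is empty."""
    try:
        return sum(values) / len(values)
    except ZeroDivisionError:
        # An empty list has length 0, so the division above fails
        return None


# Dictionary mapping (hypothetical) student names to lists of grades
grades = {"alice": [15, 12.5, 17], "bob": [], "carol": [10, 14]}

# Control flow: loop over the dictionary and compute one average per student
averages = {}
for name, marks in grades.items():
    averages[name] = safe_mean(marks)

# Set comprehension: students who have at least one grade
graded = {name for name, avg in averages.items() if avg is not None}

print(averages)
print(sorted(graded))
```

Catching `ZeroDivisionError` is only one possible design; checking `if not values` before dividing would work just as well.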

 

2 - Scientific Computing With NumPy

 

This part focuses on NumPy, a scientific computing package that provides a wide range of highly optimized routines for working with multi-dimensional arrays (matrices, tensors, etc.), linear algebra, statistics, random simulation, and more.
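
The kind of work this part deals with might look like the sketch below. It is illustrative only and assumes synthetic data; the array shapes, the seed, and the small linear system are arbitrary choices, not course content.

```python
import numpy as np

# Seeded random generator so the simulation is reproducible
rng = np.random.default_rng(seed=0)

# Random simulation: 1000 synthetic observations of 3 variables
X = rng.normal(loc=0.0, scale=1.0, size=(1000, 3))

# Statistics: column means, computed along the observation axis
col_means = X.mean(axis=0)

# Broadcasting: subtract the (3,) vector of means from every row of X
X_centered = X - col_means

# Sample covariance matrix via a matrix product
cov = X_centered.T @ X_centered / (X.shape[0] - 1)

# Linear algebra: solve the system A x = b
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

print(cov.shape)
print(x)
```

The hand-computed covariance should agree with `np.cov(X, rowvar=False)`, which is a convenient sanity check.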

 

3 - Processing Tabular Data With pandas

 

The third part of the course is dedicated to pandas, a fundamental Python package for data science and data analysis. pandas provides functionality for efficient manipulation of data frames, i.e., tabular data (stored in CSV files, Excel sheets, etc.). With pandas, you can easily carry out tasks such as data cleaning (filling in missing data, replacing outliers, etc.), reshaping, and merging.
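
The cleaning, merging, and reshaping tasks mentioned above could be sketched as follows. The two small tables and their column names (`store_id`, `revenue`, `city`) are invented for the example, and filling gaps with the median is just one of several reasonable strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records with some missing revenue values
sales = pd.DataFrame({
    "store_id": [1, 1, 2, 2, 3],
    "revenue": [100.0, np.nan, 80.0, 120.0, np.nan],
})

# Hypothetical lookup table describing each store
stores = pd.DataFrame({
    "store_id": [1, 2, 3],
    "city": ["Paris", "Lyon", "Lille"],
})

# Data cleaning: fill missing revenue with the column median
# (median() skips NaN values by default)
median_revenue = sales["revenue"].median()
clean = sales.assign(revenue=sales["revenue"].fillna(median_revenue))

# Merging: attach the city of each store to every sales record
merged = clean.merge(stores, on="store_id", how="left")

# Reshaping: total revenue per city
summary = merged.pivot_table(index="city", values="revenue", aggfunc="sum")

print(summary)
```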

 

4 - Visualizing Data With Matplotlib and seaborn

 

The last part of the course is a quick introduction to data visualization in Python using the Matplotlib and seaborn packages. Data visualization is a powerful tool for making sense of large volumes of data, identifying patterns, and extracting useful insights that can help understand and solve real-world business cases.

Recommended prerequisites:

The course does not assume any prior knowledge of programming in general or of Python in particular. However, familiarity with another programming language can help in understanding the concepts and topics discussed.

Mandatory prerequisites:

You are expected to be familiar with the mathematical tools associated with an economics curriculum (linear algebra, calculus, probability, and statistics) at an undergraduate level.

Coefficient: 1

Learning outcomes:

By the end of this course, you will be able to:

  • Write and understand entry-level to intermediate-level code in the Python programming language
  • Use NumPy for scientific computing and efficient manipulation of multi-dimensional arrays and matrices
  • Use pandas to load, manipulate, and analyze tabular data
  • Use Matplotlib and seaborn to visualize data

Assessment:

You will be evaluated based on a team project (conducted in pairs) in which you will apply the knowledge and skills you acquired during the course. The project takes the form of an exploratory data analysis in which you will work on a tabular data set in order to extract valuable insights that can help solve a business problem. The expected deliverables of the project are:

  • A 5–10 page report;
  • The source code (Jupyter notebooks or Python scripts) of your work, either in a GitHub repository or as a zip file.

You are expected to present your main findings in a 10-minute presentation, which will be followed by approximately 5 minutes of questions.