The profession of data scientist
A data scientist processes and analyzes data collected in bulk by companies or organizations, in order to improve their performance.
The era of Big Data and technological developments enable organizations to study data in more detail, thanks to the talent of data scientists. Giving meaning to data is useful to the leadership, but also to all departments including marketing, human resources, customer service and finance. This expert’s cross-disciplinary position makes them a valuable asset in the company’s overall activity.
They may take on a number of responsibilities. At Université Paris Dauphine – PSL, the Master's degree in Computer Science and the Master's degree in Mathematics train future data scientists through courses in computer science and applied mathematics, with an emphasis on organizational sciences and data.
What’s the difference between a data scientist and a data analyst?
A data scientist extracts raw data to make it usable and designs methods to analyze this data. Then the data analyst uses the processed and segmented data to answer questions or satisfy the needs of their organization.
Words from our alumni
What does a data scientist do in practical terms on a daily basis?
In practice, a data scientist applies advanced statistical and machine learning methods to past data about clients in order to predict the future behavior of these same clients.
Data scientists have many responsibilities over the course of a project. Their work is not limited to modelling a business problem; it is much broader and above all much more interesting, and can be broken down into several steps:
- Frame the problem: a large part of the work is done once the data scientist understands the issues of the project and what they must model. It’s more difficult than it seems, and it requires a lot of discussion with the “client”: why do we want to model this? What are we doing today? What happened in the past that could skew our data? What could happen in the future that could alter our model? What are the legal constraints and risks of this project?
- Data wrangling: once the issue has been understood, the data scientist needs to work with the available data, understand which data to choose and why, transform it, code it... The building of this “basefile” is the most important part of the modelling, since high-quality data must be guaranteed. In Data Science we say “Garbage In Garbage Out”; if the data is not good, the quality of the model cannot change this!
- Modeling: once the basefile is ready, we can model: search for the models best suited to the issue (Linear model? Boosting? Decision tree?) and work on the reliability of the chosen model (limit overfitting, optimize hyperparameters, etc.).
- Validating: for me this is the most important part of the project. When we’re comfortable with the statistics or the code, it’s very easy to build a highly predictive model. The most difficult part is presenting the results to a client who very likely has very limited mathematical knowledge. Being able to present results and convince a client of your work is essential for a data scientist.
- Implementing: the model is complete, but now we need to industrialize it and talk to the right people (such as data engineers) to put it into production. It’s great to make a beautiful model, but it needs to be used!
- Follow-up: the major risk is believing that the work is finished when the model is in production! The work has just begun: we need to implement a monitoring strategy for this model, to be sure that what we predicted is happening, and above all to prevent the model from becoming outdated.
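As an illustration of the follow-up step, here is a minimal Python sketch of one possible monitoring check. It is not taken from the interview: the function name `monitor_predictions`, the 5-point tolerance and the sample data are all hypothetical. It simply compares the average predicted probability with the rate actually observed over a recent period and raises an alert when the two drift apart.

```python
from statistics import mean


def monitor_predictions(predicted, observed, tolerance=0.05):
    """Compare the average predicted probability with the observed event rate.

    `predicted` holds model probabilities, `observed` holds the real outcomes
    (1 = event happened, 0 = it did not). The tolerance is an assumed
    threshold; a real project would set it with the business.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("predicted and observed must be non-empty and the same length")

    predicted_rate = mean(predicted)
    observed_rate = mean(observed)
    gap = abs(predicted_rate - observed_rate)

    return {
        "predicted_rate": predicted_rate,
        "observed_rate": observed_rate,
        "gap": gap,
        "alert": gap > tolerance,  # True when the model may be becoming outdated
    }


# Hypothetical recent data: the model expected few events, but many occurred.
recent_predictions = [0.10, 0.20, 0.15, 0.30, 0.25]
recent_outcomes = [0, 0, 1, 1, 1]

report = monitor_predictions(recent_predictions, recent_outcomes)
print(report)
```

In this made-up example the model predicts an average rate of about 20% while 60% of events actually occurred, so the check flags an alert. A production monitoring strategy would typically track several such indicators over time rather than a single comparison.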
What part of this profession interests you most?
For me the most interesting part is being able to have an impact on all departments within a company. Data Science is useful everywhere, and as you work on more projects you learn an enormous amount about a company’s business.
Why did you choose Université Paris Dauphine – PSL for your training?
The Master's degree in MASH (Mathematics, Machine Learning, and the Humanities) gave me solid technical skills, as well as an openness to the business world. The projects we worked on during the year were relevant enough to talk about during hiring interviews.
What is the ONE crucial skill for working in this sector that you learned during the second year of the Master's degree in MASH at Dauphine and that still serves you today?
The Master's degree in MASH taught me a huge amount about the theory of Machine Learning. When you arrive in a company with solid technical skills, you can quickly learn about the business side! Conversely, a person who needs to acquire the technical skills will take longer to move forward in a company.
AUGUSTIN LEJEUNE
DATA SCIENTIST - L'OLIVIER ASSURANCE
His background:
Master’s degree in MASH
Role and missions of a data scientist
The role and missions of a data scientist vary according to the company for which they work and the status of their position. A data scientist may be an employee of a company, of a consulting firm, or may be self-employed. They work in collaboration with the data engineer and the data analyst within their team.
Daily tasks
The main assignments of a data scientist are:
- Understanding the issues at hand and modelling mathematical/statistical problems in order to solve them
- Choosing tools for collecting, storing and analyzing data
- Selecting relevant and reliable data sources
- Developing algorithms and predictive models to anticipate data trends and changes
- Making data understandable to managers (data visualization)
- Issuing business recommendations to managers and to leadership to improve decision-making
- Monitoring technological trends
Salaries and career development
The median gross annual salary for a junior data scientist graduating from Université Paris Dauphine – PSL is €46,200. Salary ranges take into account the different levels of responsibility and the business sector of the employer.
After 5 years of experience, the gross annual salary of a senior data scientist exceeds €70,000.
At the start of their career, a young graduate begins as a data analyst. With more experience and perspective in their position, they may become a data scientist. As they develop managerial skills, they can apply for a position as Chief Data Scientist and manage data science teams.
Skills required
- Excellent knowledge of technological solutions and computer programming
- Advanced knowledge of applied mathematics and statistics to design algorithms and predictive analyses
- Excellent command of databases and data structures
- Ability to summarize and process information
- Managerial and/or project management skills
What studies are needed to become a data scientist?
Five years of post-secondary education are required to become a data scientist. A university program with a Master's degree in Data Science allows applicants to achieve the academic level required by recruiters.
Training to become a data scientist at Université Paris Dauphine – PSL
The Master's degrees in Computer Science and in Mathematics and Applied Mathematics at Dauphine - PSL are high-level programs that develop all the skills needed by a future data scientist. Several specialization courses offer students the opportunity to acquire an understanding of the fundamentals of computer science, applied mathematics or data science.
- The IASD specialization (Artificial Intelligence, Systems, Data) trains students to design and develop artificial intelligence systems.
- The MIAGE-ID specialization (Computer Methods Applied to Business Management - Computer Science and Decision-making) teaches computer science and decision-making skills.
- The ISF specialization (Statistical and Financial Engineering) trains company executives who know how to apply quantitative methods to solve business issues.
- The MASH specialization (Mathematics, Machine Learning, and the Humanities) offers a high-level program with applications in digital economy and the humanities.