Data Science

Class at Faculty of Mathematics and Physics | NDBI048

Syllabus

What is data science, typical use cases. Data science decathlon (an overview of related methods, algorithms and technologies). Map of follow-up lectures, organization of the course, requirements for credit / exam.

Motivation and problems of data science - a view from industry. Limits of statistical methods, distortion.

Technologies for data science I: overview of popular tools (technology stack), Python and data science.

Phases of a data science project, the CRISP-DM methodology. Business understanding, data understanding.

Methods of data exploration and visualization.

Creating a useful and understandable report.

Data preparation (cleaning, transformation, feature extraction, ...).

Modeling I: basic statistical models and performance evaluation.

Modeling II: applied Bayesianism.

Data science in modern database systems.

Big Data science, MapReduce and data science.

Apache Spark and data science.

Technologies for data science II: MLOps, versioning, documentation, ...

Business view of a data science project.

Annotation

The course provides a practical introduction to data science. The lectures discuss the phases of a data science project and the related technologies and methods. In the practicals, the individual steps are applied to real-world data. Part of the lectures also focuses on the specifics of Big Data. An added value is the practical experience from data science projects at the company Profinit, which is hard to find in textbooks.

The course is intended for students of the Big Data Processing specialization, as well as students of other specializations who want to gain a basic overview of the field of data science.