Data analysis in R and Python

Class at Faculty of Science | MG440P44

Syllabus

1. Introduction to data analysis and algorithmization I. [OL] Reproducible research; Data Analysis in Earth Sciences; Why Python? Let’s install our scientific computing environment

2. Introduction to data analysis and algorithmization II. [VJ] Why R? – a bit of history and its current upswing; How does a computer programme work? Fundamental data types, algorithmization, typical parts of a computer programme, object-oriented programming

3. Fundamentals of the Python language I. [OL] Introduction to Jupyter Notebooks and JupyterLab; Python crash course, basics of Python programming; Variables and simple data types; Advanced data types; Built-in functions and operators; Blocks and loops; User-defined functions; Errors and exceptions

4. Fundamentals of the Python language II. [OL] Scientific Python: Introduction to NumPy; Visualizations with Matplotlib and Seaborn; Data input and output

5. Fundamentals of the R language I. [VJ] Introduction, fundamental data types and basic operations with them: Interactive/batch mode; Help and documentation; Main data types, attributes; Vectors; Matrices and arrays; Factors; Lists

6. Fundamentals of the R language II. [VJ] Programming and graphics: Data import and output from/to files; Graphical functions and their main parameters; Printing and exporting graphics (PDF, PostScript…); Programming in R – conditional execution, loops, user-defined functions; R community, CRAN, mailing lists, useR! conferences; Expanding R by additional packages (libraries)

7. Python applications I. [OL] Calculations and statistics: Advanced NumPy and SciPy; Data analysis and manipulation with Pandas

8. Python applications II. Directional statistics: Basics of directional statistics in 2D and 3D; Advanced analyses of 3D orientational data – APSG

9. R applications I. [VJ] Calculations and statistics: Simple geochemical recalculations; On the usefulness of matrices; Descriptive statistics in R; Working with large and complex datasets

10. R applications II. [VJ] Graphics in R – examples from whole-rock geochemistry: Binary diagrams and Harker plots; Ternary diagrams; Spiderplots; Calculating simple petrogenetic models, including graphical output

Annotation

The course is taught in English when at least one international student is enrolled. This practical course is aimed at senior undergraduate and postgraduate students.

It is intended to: a) explain the fundamentals of data processing and visualization in geology, as well as the functioning of computing algorithms in general; b) present the basics of the R and Python programming languages; c) illustrate the usability and versatility of both languages for everyday calculations, as well as for the production of publication-quality graphics; d) demonstrate examples of using both languages in reproducible research (with a certain bias towards structural geology and whole-rock geochemistry).