Introduction to Data Analysis in R

Class at Faculty of Arts | ASGV00154

Syllabus

Topics:

0. Before starting: install R, RStudio, and the Tidyverse individually at home, following our instructions
1. What you learn in the course (motivation), what you have to accomplish, R as software, RStudio as user interface, materials and where to find help, R-base vs. Tidyverse, examples of working with R-base, data structures in R, built-in functions in R
2. Data import, data file transformations (dplyr package; select, filter, arrange, mutate, summarize functions)
3. Working with multiple variables at once (across function)
4. Data file manipulation (pivot_longer, pivot_wider, *_join, bind_rows, bind_cols functions)
5. Revision of functions from the dplyr and tidyr packages
6. Working with factors (forcats package)
7. Exploring data using visualization (ggplot2 package), 1st class
8. Exploring data using visualization (ggplot2 package), 2nd class
9. Aesthetic and functional editing of graphs (ggplot2 package, scales package)
10. Working with text variables (stringr package)
11. Introduction to RMarkdown and generating analytical outputs in various formats
12. Revision

Annotation

This course is taught in Czech.

The course is an introduction to R, a programming language developed for statistical data analysis. No previous knowledge of R is assumed, but basic knowledge of descriptive statistics and prior experience with data analysis are prerequisites. For particularly motivated students at the Department of Sociology FF UK, the minimum requirement is to have completed the first-year courses Statistics 1 (ASG100117), Statistics 1 Seminar (ASG100118) and Sociological Data Processing (ASG100118).

We base our course on a modern approach to data analysis in R, using the RStudio development environment and the Tidyverse "grammar." This approach is probably the prevailing one in the user community today.

Learning R is a long-term undertaking. It requires a much bigger time investment than mastering GUI software such as SPSS. The reward is much more flexibility and a universal tool for data processing, analysis, visualization, programming and automation. Although we will not get that far in the course, the libraries and tools available for R today also make it possible to create interactive graphical applications, web pages and presentations, and to use machine learning tools in addition to standard statistical analysis. This course makes sense especially for students who want to take a quantitative direction in their sociological studies and who are ready to study on their own and build further on the modest foundations the course offers.

Attending classes requires your own laptop with an Internet connection (Eduroam or other).