Biostatistics 1, 2

Class at First Faculty of Medicine | B83534

Syllabus

(intended topics; each topic is designed to take two to four practicals)

- Descriptive statistics. Averages and variability. Measures of association. Graphical data representation.

- Introduction to probability. Random variable. Selected probability distributions.

- Confidence intervals. Introduction to inferential statistics. Parametric and nonparametric tests of inference. Interpretation of results and appropriate graphical visualization.

- Analysis of variance (ANOVA). Univariate and multivariate linear regression. Interpretation of results and appropriate graphical visualization.

- Binary logistic regression. Multinomial logistic regression. Interpretation of results and appropriate graphical visualization.

- Mixed-effects model. Hierarchical models. Interpretation of results and appropriate graphical visualization.

- Introduction to time series. Introduction to survival analysis. Interpretation of results and appropriate graphical visualization.

- Selected advanced statistical methods in R, both linear and nonlinear. Cluster analysis. Discriminant analysis. Jackknife. Bootstrap. Interpretation of results and appropriate graphical visualization.

- Selected methods of machine learning in R. Naïve Bayes classifier. Support Vector Machine (SVM). Cross-Validation (CV). Principal Component Analysis (PCA). Decision trees. Random forests. Neural networks. Association rules. Interpretation of results and appropriate graphical visualization.

Annotation

Statistics remains the most powerful tool for analyzing data and testing hypotheses in biomedicine. Knowing how to interpret the results of statistical analyses helps to guarantee the quality of scientific proof in medicine. Knowledge of fundamental and applied biostatistics, and the ability to use it to interpret statistical results in publications, has therefore become a necessary part of modern physician education. A large volume of research output containing advanced and complex information is published daily and has to be interpreted using "the language of statistics". Moreover, the COVID-19 era, with its intentional and unintentional sharing of misleading or incorrect information, has tested the statistical and epidemiological knowledge of medical students and professionals and has shown that there is room for reinforced teaching of (bio)statistics, e.g. through an elective course.

The elective course is recommended for all undergraduate students who are considering doing science, part-time or full-time, in their future career. Students thinking about a Ph.D. (and not only them), as well as all interested graduate students (Ph.D. candidates), are more than welcome. The course is designed as a brief introduction (a "crash course") to (bio)statistics for beginners; therefore, no previous knowledge of statistics or statistical software is required. As this is an applied course, the mathematical level is kept as low as possible and relies only on high-school mathematics. Students will be introduced to descriptive statistics, the descriptive characteristics and measures of association used in publications, graphical data visualization, and the plots appropriate for given data. An important topic, emphasized both theoretically and practically, is statistical inference, including the assumptions of the tests, parametric and nonparametric (robust) approaches, and choosing the correct test for given data and hypotheses. The course also covers linear and nonlinear regression techniques for modeling a continuous response variable using explanatory variables, and logistic regression for classifying individuals into classes based on their predictors. Finally, some other, more complex methods will be introduced, including (but not limited to) hierarchical models and survival analysis. In the practical part of the course, which is given adequate emphasis and time, statistical software will be applied to real data: the open-source point-and-click software Jamovi and others, including the R language and environment if requested.

The course concludes with a final seminar project, in which each student will analyze data using statistical methods and interpret the results appropriately. Besides hands-on exercises with statistical concepts and models, the correct interpretation and understanding of statistical results and outputs will be discussed.

Throughout the course, and particularly in the final course project, students will need to keep their data analyses and project results reproducible and transparent. A similar course is usually part of the curriculum at other Prague universities, including medical faculties. The course is already established and taught in a Czech-language version at our faculty.