
Statistics in Biology III - Seminar on Advanced Statistical Methods

Class at Faculty of Science | MB120P174

Syllabus

Schedule of the individual two-day blocks:

Block 1 - Generalized Linear Models (GLM) and introduction to hierarchical designs

Day 1 - morning
Theory (approx. 2 h) - Introduction to GLM, concept of deviance, link functions, etc., introduction to logistic regression
Exercise (approx. 1 h) - logistic regression, its assumptions, interpretation, construction of confidence intervals for the logistic curve

Day 1 - afternoon
Theory (approx. 1 h) - GLM with binomial and Poisson distributions, treatment of overdispersion
Exercises (approx. 3 h) - practical analyses using Poisson and binomial GLM, interpretation of diagnostic plots, detection and treatment of overdispersion

Day 2 - morning
Theory (approx. 1 h) - GLM with gamma distribution, non-canonical link functions
Exercise (approx. 2 h) - practical exercises on GLM covering the whole spectrum of variants discussed so far

Day 2 - afternoon
Theory (approx. 1 h) - Hierarchical data designs and hierarchical ANOVA (split-plot, hierarchical ANOVA sensu stricto)
Exercises (approx. 1.5 h) - identification of individual layers of hierarchical designs, practical implementation of hierarchical ANOVAs, auxiliary linear models for verification of assumptions
Theory (approx. 1.5 h) - Revision of the concept of random effect factors and introduction to linear models with mixed effects

1st graded homework: analysis of two data sets focused on GLM usage
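For orientation, the detection of overdispersion listed in Block 1 can be illustrated by a minimal sketch. It assumes the simplest possible case, an intercept-only Poisson model, where the fitted mean equals the sample mean; the function name and the counts are hypothetical and not taken from the course materials. The Pearson chi-square statistic divided by the residual degrees of freedom should be close to 1 for Poisson data; values well above 1 suggest overdispersion.

```python
from statistics import mean


def pearson_dispersion(counts):
    """Pearson dispersion statistic for an intercept-only Poisson model.

    Assumption: with no predictors, the fitted mean of every observation
    is simply the sample mean of the counts.
    """
    mu = mean(counts)
    if mu <= 0:
        raise ValueError("mean count must be positive")
    # Pearson chi-square: squared residuals scaled by the Poisson variance (= mu)
    pearson_chi2 = sum((y - mu) ** 2 / mu for y in counts)
    residual_df = len(counts) - 1  # n observations minus one estimated parameter
    return pearson_chi2 / residual_df


# Hypothetical counts (e.g. individuals per plot), invented for illustration
counts = [0, 1, 0, 12, 3, 0, 7, 1, 0, 15]
dispersion = pearson_dispersion(counts)
print(f"dispersion = {dispersion:.2f}")
```

With real predictors the fitted means come from the fitted GLM rather than the sample mean, but the ratio is computed in the same way.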

Block 2 - Mixed Effect Models - Linear (LME) and Generalized Linear (GLMM)

Day 1 - morning
Theory (approx. 1 h) - LME - continuation, interpretation of LME, introduction to LME testing
Exercise (approx. 2 h) - LME with one random effect, introduction to testing of fixed effect factors, interpretation of LME results

Day 1 - afternoon
Theory (approx. 1 h) - LME - continuation, differences between random effect and mixed effect factors, testing of random effect factors in LME
Exercise (approx. 3 h) - LME with multiple factors with random and mixed effects

Day 2 - morning
Theory (approx. 1 h) - construction of confidence intervals in LME - model profiling and other CI construction methods, expression of the amount of explained variability within LME (pseudo-R2)
Exercise (approx. 2 h) - construction of confidence intervals for LME, calculation of pseudo-R2

Day 2 - afternoon
Theory (approx. 1 h) - transition from LME to GLMM, common problems when working with mixed effect models and how to deal with them
Exercises (approx. 3 h) - GLMM exercises and revision exercises for mixed effect models

2nd graded homework: analysis of two data sets with hierarchical design

Block 3 - Data with temporal, spatial or phylogenetic correlation between observations - Generalized Least Squares (GLS)

Day 1 - morning
Theory (approx. 1 h) - Introduction to GLS, its use for heteroscedastic data, introduction to temporal and spatial autocorrelation of data, time series analyses, detection of spatial autocorrelation of data (semivariograms), functions useful for approximating the semivariogram
Exercise (approx. 2 h) - GLS with weights (heteroscedasticity), 1st-order autoregressive models, ARIMA models, spatial data autocorrelation

Day 1 - afternoon
Theory (approx. 1 h) - Introduction to work with phylogenetic data, models of character evolution, phylogenetically independent contrasts (PIC)
Exercise (approx. 3 h) - entering and editing phylogenetic data, mapping of characters onto phylogenetic trees, analysis of data using PIC

Day 2 - morning
Theory (approx. 1 h) - Phylogenetic GLS (pGLS) and transformation of a phylogenetic tree into a variance-covariance matrix, phylogenetic RMA (reduced major axis regression)
Exercise (approx. 2 h) - analysis of data sets with available data on phylogeny

Day 2 - afternoon
Theory (approx. 0.5 h) - Phylogenetic principal component analysis (phylPCA)
Exercise (approx. 1.5 h) - continuation of tasks from the morning + phylPCA
Seminar (approx. 2 h) - discussion of model tasks, focusing on identifying the nature of the data and selecting appropriate analytical techniques

3rd graded homework: analysis of two data sets with spatial, temporal or phylogenetic correlation of the response variable
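As an illustration of the transformation of a phylogenetic tree into a variance-covariance matrix mentioned in Block 3, the following is a minimal sketch under an assumed Brownian motion model of character evolution: the covariance between two tips equals the branch length they share on the way from the root, and the variance of a tip equals its total distance from the root. The tree, the node names and the function names are hypothetical and not taken from the course materials.

```python
def root_path(tip, parent):
    """Return the edges (child node, branch length) from a tip up to the root."""
    path = []
    node = tip
    while node in parent:  # the root has no entry in `parent`
        up, length = parent[node]
        path.append((node, length))
        node = up
    return path


def phylo_vcv(tips, parent):
    """Variance-covariance matrix of tips, assuming Brownian motion.

    Each edge is identified by its child node, so the edges shared by two
    tips are the child nodes that occur on both root paths.
    """
    paths = {tip: dict(root_path(tip, parent)) for tip in tips}
    vcv = []
    for a in tips:
        row = []
        for b in tips:
            shared = sum(
                length for node, length in paths[a].items() if node in paths[b]
            )
            row.append(shared)
        vcv.append(row)
    return vcv


# Hypothetical tree ((A:1,B:1)AB:2,C:3), stored as child -> (parent, branch length)
parent = {
    "A": ("AB", 1.0),
    "B": ("AB", 1.0),
    "AB": ("root", 2.0),
    "C": ("root", 3.0),
}
tips = ["A", "B", "C"]
for tip, row in zip(tips, phylo_vcv(tips, parent)):
    print(tip, row)
```

In pGLS a matrix of this kind describes the expected correlation structure of the residuals; other models of character evolution would rescale the branch lengths before the matrix is built.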

Annotation

The course aims to introduce students to a selection of frequently used advanced techniques of statistical data analysis. The course is a sequel to Statistics in biology and design of ecological experiments (MB120P163), which is a prerequisite for this course. In justified cases (e.g. completion of a similar course in data analysis), the teachers will allow enrolment in the course without the prerequisite. The course consists of three two-day teaching blocks of combined talks and practicals – 1) non-normally distributed response variables – generalised linear models (GLM); 2) hierarchical experimental designs – mixed-effect models (LME, GLMM) and nested ANOVAs; 3) models with spatially, temporally or phylogenetically correlated responses – generalised least squares (GLS, PGLS).