Practical Fundamentals of Probability and Statistics for Computer Linguistics

Class at Faculty of Mathematics and Physics | NPFL081

Syllabus

- mathematical probability, its definition and calculation

- random variable (discrete and continuous) and its probability distribution

- distribution function, quantile function, density

- statistical independence

- expected value and variance

- properties of binomial and normal distributions

- random sampling

- parameters of distributions, parameter estimation, t-test

- statistical hypothesis testing, critical values

- contingency tables, hypothesis testing in contingency tables

- chi-square distribution, chi-square tests

- entropy, conditional entropy, mutual information

- basics of programming in the R system (www.r-project.org)

Annotation

ONLY for students in the EM Program in LCT; see http://ufal.mff.cuni.cz/lct.html. The aim of the course is to introduce the elementary probabilistic and statistical principles, techniques, and methods used in solving computational linguistics (natural language processing) tasks.

An essential part of the course is active work with data and an introduction to the R workflow while solving a given task. Part of the course will consist of individual study of selected, mutually agreed materials.