Statistical Methods in Natural Language Processing I

Class at Faculty of Mathematics and Physics | NPFL067

Syllabus

Introduction. Course Overview: Intro to NLP. Main Issues.

The Very Basics on Probability Theory. Elements of Information Theory I. Elements of Information Theory II.

Language Modeling in General and the Noisy Channel Model. Smoothing and the EM algorithm.

Word Classes and Lexicography. Mutual Information (the "pointwise" version). The t-score. The Chi-square test. Word Classes for NLP tasks. Parameter Estimation. The Partitioning Algorithm. Complexity Issues of Word Classes. Programming Tricks & Tips.

Markov models, Hidden Markov Models (HMMs). The Trellis & the Viterbi Algorithms. Estimating the Parameters of HMMs. The Forward-Backward Algorithm. Implementation Issues.

Annotation

Introduction to formal linguistics and the fundamentals of statistical natural language processing, including the basics of Information Theory, Language Modeling, and Markov Models. Continues as Statistical Methods in Natural Language Processing II.