- Motivation for NLP. Probability models and information theory, basic notions.
- Language models, smoothing.
- Hidden Markov models.
- Language data resources, experiments in NLP.
- Morphological tagging.
- Syntactic analysis.
- Overview of machine translation approaches.
- Statistical machine translation.
- Linguistic features in machine translation.
- Information retrieval.
- Term weights.
- Document classification and clustering.
- Word embeddings.
The goal of the course is to give students knowledge of, and hands-on experience with, basic (mostly statistical) methods in Natural Language Processing. Students will become acquainted with fundamental components such as corpora and language models, as well as with complex end-user applications such as machine translation.