1. Introduction - an overview of basic application components.
2. Spelling checker - dictionary-based methods vs. checking for illegal character combinations, string similarity metrics, communication with the user.
3. Grammar checking - error patterns vs. syntactic analysis, types of detectable errors, attitude towards the user, RFODG and LanGR.
4. Machine-assisted human translation - translation memory and its variants in commercial products, controlled language, glossary hierarchies.
5. Machine translation - Google Translate vs. rule-based systems, commercial systems (Systran, PC Translator), quality evaluation methods, evaluation of translation competitions, the Euromatrix project.
6. Localization - differences between translation and localization, commercial localization tools.
7. Generating - text generation from the tectogrammatical layer.
8. Information retrieval and extraction - basic models, evaluation metrics, text similarity metrics, lemmatization, stop words, the role of linguistic tools, the Malach project.
9. Question answering - dialog systems, multimodal communication.
10. Speech synthesis and recognition - basic problems and algorithms.
11. Semantic web - exploitation of linguistic methods for searching for information on the web, the role of the tectogrammatical layer.
The main goal of the course is to introduce the basic types of natural language processing (NLP) applications and to give students a chance to work with some of these applications in seminars.
The course covers machine translation, machine-aided human translation tools, localization tools, information retrieval and extraction, question answering, speech recognition, spelling and grammar checking, text generation, etc.