Topics
The course covers the following topics. Each lecturer has their own approach, so the order of the topics and the emphasis placed on each may vary.
What is a corpus; CNC corpora
Corpus linguistics
Representativeness of written and spoken corpora, register variation
Corpus annotation and structure
Corpus querying and interpretation of a concordance
Frequency analysis
Regular expressions and advanced CQL queries
Collocation, colligation and semantic prosody
Corpus material in research on individual linguistic levels
Foundations of data processing (MS Excel, tables and figures)
Basic statistics for working with corpora
Corpus tools: SyD, Morfio, KWords
Specialized corpora (Diakorp, InterCorp, author corpora)
Designing and carrying out a linguistic research project based on corpus data
The course is typically aimed at students of Czech studies. Students will get to know the language corpora available at the Czech National Corpus and learn how to use them for their own research. They will also learn how to work with the KonText query interface and other web applications to query, find and interpret language phenomena.
Credit requirements: active participation, a test, and an analysis of a language phenomenon using corpus linguistic methods.