This article outlines the methodology of corpus research in the humanities, primarily in linguistics. Corpus linguistics, which emerged in the late 1970s and early 1980s, studies language on the basis of electronic corpora.
A corpus is a collection of texts of various types (written and spoken) stored in a computer database, which makes it possible to search automatically for text units in their natural context. Corpora come in different types, depending on the kind of study they are meant to support.
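The idea of searching for a text unit in its natural context can be illustrated with a keyword-in-context (KWIC) listing, a display commonly used in corpus work. The following is only a minimal sketch and not the method of any particular corpus tool: the function name `kwic`, the `width` parameter, the simple word tokenization, and the sample sentences are all assumptions made for illustration.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, match, right context) for each hit.

    Hypothetical helper: tokenization is a plain word-character split,
    and matching is case-insensitive on whole tokens.
    """
    tokens = re.findall(r"\w+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


# Invented sample text, standing in for a real corpus.
sample = "The corpus is large. A spoken corpus records natural speech."

for left, hit, right in kwic(sample, "corpus", width=2):
    print(f"{left:>20} [{hit}] {right}")
```

Real corpus query systems typically work on annotated and indexed data rather than raw strings, so a sketch like this only conveys the principle of showing each occurrence together with its neighbouring words.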
The first corpora were compiled for English, but a growing number of languages now have national corpora of their own, such as the National Corpus of Polish, the Czech National Corpus, and the Russian National Corpus.