Corpus and representativeness

Publication at Faculty of Arts | 2014

Abstract

This paper discusses the concept of representativeness in corpus linguistics. Representativeness is a concept from empirical, quantitative science: it characterizes the relationship between a sample and the population.

It is argued that the population for the standard, supposedly "representative" corpora of a whole language cannot be defined. The population can be defined reliably only for specialized corpora (e.g. corpora of newspaper texts), hence only corpora of this type can be truly statistically representative.

The paper also discusses the idea that representativeness could be considered from the perspective of particular linguistic items rather than from the perspective of the whole language. The same corpus may be representative of the use of one item and, at the same time, not representative of the use of another.