Comparison of spoken corpora: really just a matter of perspective?

Publication at Faculty of Arts | 2019

Abstract

Recently, more attention has been paid to the issues of corpus design and representativeness. These issues are especially important for general-purpose language corpora such as the spoken corpora developed within the framework of the Czech National Corpus.

This text is a response to Jan Chromý's paper "Comparison of spoken corpora from a sociolinguistic perspective", in which the author compares the general-purpose spoken corpus ORAL2013 with his own dataset collected for the SAUP project. We argue that some of his claims are not justified by the findings presented in the paper and that his understanding of the concept of representativeness is rather misleading.

Therefore, we aim to clarify some fundamental design decisions adopted for the compilation of ORAL2013 by responding to the specific objections raised by Chromý. We also point out some methodological and reasoning inconsistencies in his paper.