Proper Nouns in Czech Corpora

Publication at Faculty of Mathematics and Physics | 2008

Abstract

Although proper nouns are an inseparable part of natural language texts, and thus of text corpora, Czech corpus linguistics at least has paid little attention to this area. We demonstrate that Czech corpora can certainly be used as a rich source for the study of proper nouns.

First, we describe how proper nouns are handled in current Czech corpora. For example, in the Prague Dependency Treebank 2.0, a very simple annotation of proper nouns is included in the morphological annotation, whereas the syntactic annotation scheme of this corpus does not take proper nouns into consideration. We list several reasons why we consider such annotation insufficient and unsuitable.

As for where the annotation of proper nouns belongs within a multi-layered annotation scheme, we conclude that proper nouns could be annotated at a (deep) syntactic layer. The annotation schemes therefore have to be adapted for this purpose.