Comparison of Genres in Corpora on the Basis of Syntactic Functions of Substantives

Publication at Faculty of Arts | 2013

Abstract

The large synchronic text corpora of the Czech National Corpus are designed to be representative: they contain a balanced quantity of texts of various styles, divided into three genre subcorpora (fiction, technical/scientific literature, and journalism). Comparisons of these genres have been performed at the phonological and morphological levels; in this paper, I deal with differences between genres at the surface-syntactic level.

I use an automatic syntactic annotation of the SYN2005 corpus in the formalism of the analytical layer of the Prague Dependency Treebank. I compare the frequencies of syntactic functions of nouns in the three genres represented by the corresponding subcorpora of SYN2005.
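A comparison of this kind could be sketched roughly as follows. This is only a minimal illustration, not the procedure used in the paper: the data are invented toy pairs, the function name `relative_frequencies` is hypothetical, and the labels (Sb, Obj, Atr, Adv) stand in for analytical functions of the kind used in the Prague Dependency Treebank analytical layer.

```python
from collections import Counter

# Invented toy data: (genre, analytical function of a noun) pairs.
# Real input would come from the syntactically annotated subcorpora.
TOY_NOUNS = [
    ("fiction", "Sb"), ("fiction", "Obj"), ("fiction", "Adv"), ("fiction", "Sb"),
    ("technical", "Atr"), ("technical", "Atr"), ("technical", "Sb"), ("technical", "Obj"),
    ("journalism", "Atr"), ("journalism", "Sb"), ("journalism", "Adv"), ("journalism", "Atr"),
]


def relative_frequencies(pairs):
    """Return, for each genre, the share of nouns in each syntactic function."""
    counts = {}
    for genre, afun in pairs:
        counts.setdefault(genre, Counter())[afun] += 1
    return {
        genre: {afun: n / sum(c.values()) for afun, n in c.items()}
        for genre, c in counts.items()
    }


freqs = relative_frequencies(TOY_NOUNS)
for genre, dist in freqs.items():
    # Print the distribution of functions within each genre, sorted by label.
    print(genre, {afun: round(share, 2) for afun, share in sorted(dist.items())})
```

Relative rather than absolute frequencies are used in this sketch so that subcorpora of different sizes remain comparable.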

I also present a more detailed analysis of four syntactic phenomena: subtypes of the attribute function expressed by the non-prepositional genitive; the frequency of groups of the type pan Novák (Mr. Novák); the frequency of the agent function in passive constructions expressed by nouns in the non-prepositional instrumental; and the ratio of nominative to instrumental in expressing the nominal part of a verbal-nominal predicate.

The significant differences found between genres in all the syntactic phenomena analyzed show that, when comparing corpora, one should carefully monitor their genre composition.