Studying Properties of Czech Complex Sentences from an Annotated Corpus

Publication at Faculty of Mathematics and Physics

2011

Abstract

The paper addresses the analysis of complex sentences in Czech on the basis of manually annotated data. A specialized corpus that explicitly describes the mutual relationships between segments and clauses in Czech complex sentences, together with a thoroughly syntactically annotated corpus, the Prague Dependency Treebank, provides a solid background for linguistic investigation.

The paper presents quantitative, linguistic and structural observations that provide a number of clues for building a future algorithm for analyzing the structure of complex sentences.

Keywords

studying properties Czech complex sentences from annotated corpus