Using Czech-English Parallel Corpora in Automatic Identification of It

Publication at Faculty of Mathematics and Physics | 2012

Abstract

This paper has two goals. First, we present the part of the annotation scheme of the recently released Prague Czech-English Dependency Treebank 2.0 that concerns the annotation of the personal pronoun it on the tectogrammatical layer of sentence representation.

Second, we describe experiments on the automatic identification of the English personal pronoun it and its Czech counterpart. We design sets of tree-oriented rules and, on the English side, combine them with a state-of-the-art statistical system; the combination improves identification.

Furthermore, we design and successfully apply rules that exploit information from the other language.