CzEng 0.9

Publication

Abstract

CzEng 0.9 is the third release of a sentence-parallel Czech-English corpus compiled at the Institute of Formal and Applied Linguistics (ÚFAL). It is freely available for non-commercial and research purposes. CzEng 0.9 contains 8.0 million parallel sentences (93 million English and 82 million Czech tokens) from seven different types of sources. The data are automatically annotated at the surface and deep (a- and t-) layers of syntactic representation.

Keywords