A New Corpus of Czech With an Innovated Annotation

Publication at the Faculty of Arts | 2021

Abstract

The paper introduces the SYN2020 corpus. Its design incorporates several substantial new features in segmentation, lemmatization, and morphological tagging, including a new treatment of lemma variants, a new system for identifying the morphological categories of verbs, and a new treatment of multiword tokens.