DeriNet version 2.0

Publication

Abstract

DeriNet is a lexical network that models derivational relations in the lexicon of Czech. Nodes of the network correspond to Czech lexemes, while edges represent derivational or compositional relations between a derived word and its base word or words.
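
The structure described above can be pictured with a small sketch. It is only an illustration, not the DeriNet file format or any official API: the class names `Relation` and `LexicalNetwork`, the method names, and the three toy relations are assumptions chosen for the example and are not taken from the DeriNet data. A derivational link has a single base, while a compositional link has several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical edge type: links a derived lexeme to its base(s)."""
    kind: str      # "derivation" (one base) or "compounding" (several bases)
    bases: tuple   # lemmas of the base word or words
    derived: str   # lemma of the derived word


class LexicalNetwork:
    """Toy network: lexemes are nodes, relations are edges (illustrative only)."""

    def __init__(self):
        self.lexemes = set()
        self.parent_relation = {}  # derived lemma -> Relation

    def add_relation(self, kind, bases, derived):
        # Register all participating lexemes as nodes, then store the edge.
        self.lexemes.update(bases)
        self.lexemes.add(derived)
        self.parent_relation[derived] = Relation(kind, tuple(bases), derived)

    def derivational_chain(self, lemma):
        """Follow derivational links from a lemma back to its underived base."""
        chain = [lemma]
        while True:
            rel = self.parent_relation.get(chain[-1])
            if rel is None or rel.kind != "derivation":
                return chain
            chain.append(rel.bases[0])


# Illustrative examples, not drawn from the DeriNet data files.
net = LexicalNetwork()
net.add_relation("derivation", ["učit"], "učitel")           # teach -> teacher
net.add_relation("derivation", ["učitel"], "učitelka")       # teacher -> female teacher
net.add_relation("compounding", ["země", "psát"], "zeměpis")  # earth + write -> geography

print(net.derivational_chain("učitelka"))  # ['učitelka', 'učitel', 'učit']
```

Storing at most one incoming relation per derived lexeme keeps each derivational family a tree in this sketch; whether the real data enforces the same constraint is not stated in the abstract.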

The present version, DeriNet 2.0, contains 1,027,665 lexemes (sampled from the MorfFlex dictionary) connected by 808,682 derivational and 600 compositional links. Compared to previous versions, version 2.0 uses a new format and contains new types of annotations: compounding, annotation of several morphological categories of lexemes, identification of root morphs of 244,198 lexemes, semantic labelling of 151,005 relations using five labels, and identification of several fictitious lexemes.