Constructing a Lexical Resource of Russian Derivational Morphology

Publication at Faculty of Mathematics and Physics | 2022

Abstract

Words of any language are to some extent related through the ways they are formed. For instance, the verb exempl-ify and the noun example-s are both based on the word example, but the verb is derived from it, while the noun is inflected.

In Natural Language Processing of Russian, inflection is handled satisfactorily; however, only a few machine-tractable resources capture derivation, even though both of these morphological processes are very rich in Russian. Therefore, we devote this paper to improving one of the methods of constructing such resources and to applying the method to a Russian lexicon, which results in the largest lexical resource of Russian derivational relations.

The resulting database, dubbed DeriNet.RU, includes more than 300 thousand lexemes connected by more than 164 thousand binary derivational relations. To create this data, we combined existing machine-learning methods, which we improved for this purpose.

The whole approach is eva