Word Formation Analyzer for Czech: Automatic Parent Retrieval and Classification of Word Formation Processes

Publication at Faculty of Mathematics and Physics | 2022

Abstract

We present a deep-learning tool called Word Formation Analyzer for Czech, which, given an input lexeme, automatically retrieves the lemma or lemmas from which the input lexeme was formed. We call this task parent retrieval.

Furthermore, based on the number of words in the output sequence and on how that sequence compares to the input, the input word is classified into one of three categories: compound, derivative or unmotivated. We call this task word formation classification.
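
The classification step can be illustrated with a minimal sketch. The abstract does not spell out the exact decision rule, so the mapping below is an assumption: two or more retrieved parents indicate a compound, a single parent that differs from the input indicates a derivative, and a single parent identical to the input (or no parent at all) indicates an unmotivated word. The function name `classify_word_formation` and the Czech examples are hypothetical and are not taken from the tool itself.

```python
from typing import List


def classify_word_formation(lexeme: str, parents: List[str]) -> str:
    """Classify a lexeme from its retrieved parent sequence.

    Assumed rule (not confirmed by the source):
      - two or more parents          -> "compound"
      - one parent != input lexeme   -> "derivative"
      - otherwise                    -> "unmotivated"
    """
    # Normalize whitespace and case, and drop empty entries.
    cleaned = [p.strip().lower() for p in parents if p.strip()]
    target = lexeme.strip().lower()

    if len(cleaned) >= 2:
        return "compound"
    if len(cleaned) == 1 and cleaned[0] != target:
        return "derivative"
    # A single parent equal to the input, or an empty output,
    # is treated here as "unmotivated" (an assumption of this sketch).
    return "unmotivated"


if __name__ == "__main__":
    # Illustrative inputs; the parent lists stand in for model output.
    examples = [
        ("velkoměsto", ["velký", "město"]),
        ("učitel", ["učit"]),
        ("pes", ["pes"]),
    ]
    for lexeme, parents in examples:
        print(lexeme, parents, classify_word_formation(lexeme, parents))
```

In this sketch the parent lists are supplied by hand; in the tool described here they would come from the parent retrieval model, so any retrieval error carries over into the classification.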

In the task of parent retrieval, Word Formation Analyzer for Czech achieved an accuracy of 71%. In word formation classification, the tool achieved an accuracy of 87%.