On the efficiency of manual and semi-automatic detection of neologisms

Publication at Faculty of Mathematics and Physics, Faculty of Arts

2019

Abstract

The paper presents a simple semi-automatic neologism detection procedure: a trivial Python script processes a text file, making use of a Czech morphological tagger, and extracts all words unrecognized by the tagger as potential neologisms. The list of these candidates has to be checked by a human (hence semi-automatic).
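The procedure described above can be illustrated with a minimal sketch. It is not the script from the paper: the function name `candidate_neologisms`, the `is_known` callback and the toy lexicon are assumptions introduced here for illustration. In practice `is_known` would wrap whatever Czech morphological tagger is available and return False for word forms the tagger cannot analyze.

```python
import re
from collections import Counter
from typing import Callable, Iterable

# Runs of letters only (Czech diacritics included); digits and underscores are excluded.
WORD_RE = re.compile(r"[^\W\d_]+")


def candidate_neologisms(
    lines: Iterable[str], is_known: Callable[[str], bool]
) -> Counter:
    """Count every word form that the recognizer does not know.

    `is_known` is a hypothetical hook standing in for a morphological
    tagger; it is an assumption of this sketch, not a real tagger API.
    """
    candidates: Counter = Counter()
    for line in lines:
        for token in WORD_RE.findall(line):
            # Lowercasing is a simplification; it merges sentence-initial
            # forms but also hides the capitalization of proper names.
            word = token.lower()
            if not is_known(word):
                candidates[word] += 1
    return candidates


# Toy stand-in for the tagger: a tiny set of "known" word forms (assumption).
TOY_LEXICON = {"to", "je", "nový", "telefon"}


def toy_is_known(word: str) -> bool:
    return word in TOY_LEXICON


sample = ["To je nový chytrofon.", "Chytrofon je telefon."]
result = candidate_neologisms(sample, toy_is_known)

# The candidate list, most frequent first, is what a human would then review.
for word, count in result.most_common():
    print(f"{word}\t{count}")
```

Sorting candidates by frequency is one possible design choice for the manual checking step; the abstract does not say how the candidate list was ordered.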

This method was applied to a set of texts that were also analyzed in a more traditional way, by the "reading and marking" technique (i.e. the current practice). The comparison of the two methods has revealed that the semi-automatic procedure clearly outperforms the current practice both in speed and in efficiency.

Keywords

data collection; manual detection of neologisms; neologisms; Python; semi-automatic detection of neologisms