The paper describes a metrics-based model for assessing the complexity of Russian legal texts. The model relies on 130 metrics divided into the following categories: “basic metrics”, “readability formulas”, “words of different part-of-speech classes”, “n-grams of part-of-speech tags”, “frequency of lemmas”, “word-building patterns”, “grammemes”, “lexical and semantic features, multi-word expressions, hypertext links”, “syntactic features”, and “cohesion assessments”. The paper explains the rationale for choosing these metrics, drawing on studies of linguistic complexity, stylometric studies, and experimental studies of the perception of legal texts.
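To make the notion of “basic metrics” concrete, a minimal sketch is given below; the two metrics shown (average sentence length in words and average word length in characters) and the example sentence are illustrative assumptions, not the authors' actual feature set.

```python
# Illustrative sketch of two "basic" complexity metrics.
# The sentence/word splitting rules and the sample text are simplifications.
import re

def basic_metrics(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    return {
        "avg_sentence_len_words": len(words) / max(len(sentences), 1),
        "avg_word_len_chars": sum(len(w) for w in words) / max(len(words), 1),
    }

print(basic_metrics("Настоящий закон вступает в силу. Он применяется ко всем."))
```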
The authors present the results of testing the model in an experiment on classifying texts by complexity level, with the metrics used as input features. These results are compared with the results of classification using vectors from the USE (Universal Sentence Encoder) language model as input features.
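The experimental comparison can be sketched, under assumptions not specified in the abstract, as two classification runs over the same labelled corpus: one on a matrix of hand-crafted metric values and one on USE sentence vectors. The corpus size, number of complexity levels, classifier, and evaluation scheme below are placeholders, and the feature matrices are synthetic stand-ins.

```python
# Hypothetical sketch: comparing text-complexity classification on
# hand-crafted metric features vs. USE sentence embeddings.
# All data is synthetic; the real study computes 130 metrics per text
# and labels Russian legal texts by complexity level.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_texts = 200                          # placeholder corpus size
y = rng.integers(0, 3, size=n_texts)   # placeholder complexity levels

X_metrics = rng.random((n_texts, 130))  # stand-in for the 130 metric values per text
X_use = rng.random((n_texts, 512))      # stand-in for 512-dim USE sentence vectors

clf = LogisticRegression(max_iter=1000)  # classifier choice is an assumption
for name, X in [("metrics", X_metrics), ("USE vectors", X_use)]:
    scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```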
The authors conclude that using the metrics makes it possible to assess text complexity more accurately than classification based on the language model vectors.