Production and perception are two sides of the same coin: the shape of a language and the properties of its texts are products of human abilities in both. Quantitative linguistic theories (e.g. the Zipfian principle of least effort, the Altmannian theory proposal and the Köhlerian synergetic control circle) treat language as a trade-off between the demands and capabilities of the text receiver and the capabilities of the producer.
In other words, the producer tries not only to minimize their own effort but also to maximize the success of the communication, and therefore adapts their texts to the needs of the receiver. The main objective of this study is to explore how texts meet the demands of their readers.
While corpora are mostly used to study the production of texts, there is no inherent reason why they could not also be used to study perception. We compiled a corpus of Czech blogs to explore how the lexical richness and phonological features of texts relate to their success.
The moving-average type-token ratio (MATTR) and moving-average entropy (MAH) were used as lexical richness metrics, and several phonological features (PF) connected with euphony, such as the vowel/phoneme ratio, the open syllable ratio and the consonant cluster distribution, were taken into account. The number of views and the relative number of likes were used as text success metrics.
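As an illustration of the two lexical richness metrics, the following minimal Python sketch computes both over sliding windows of tokens. It is not the implementation used in the study: the window size of 100 tokens, the base-2 logarithm, the use of word forms as tokens, and the function names `mattr` and `mah` are all assumptions made for this example.

```python
import math
from collections import Counter


def _windows(tokens, window):
    """Yield every contiguous window of `window` tokens (step of one token)."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(tokens) < window:
        raise ValueError("text is shorter than the window")
    for start in range(len(tokens) - window + 1):
        yield tokens[start:start + window]


def mattr(tokens, window=100):
    """Mean type-token ratio over all sliding windows.

    The default window size is an assumption, not a value taken from the study.
    """
    ratios = [len(set(segment)) / window for segment in _windows(tokens, window)]
    return sum(ratios) / len(ratios)


def shannon_entropy(segment):
    """Shannon entropy (in bits, an assumed unit) of the token distribution."""
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in Counter(segment).values())


def mah(tokens, window=100):
    """Mean Shannon entropy over all sliding windows."""
    entropies = [shannon_entropy(segment) for segment in _windows(tokens, window)]
    return sum(entropies) / len(entropies)
```

Because both values are averaged over windows of a fixed size, they do not depend systematically on text length the way a plain type-token ratio does, which makes blog posts of different lengths comparable.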
The following competing hypotheses emerged: H0: there is no relation between lexical richness (or PF) and text success; H1: lexically poorer texts are more popular than lexically richer ones because they are more readable; H2: it is difficult for the text producer, even in L1, to attain the level of lexical richness (or PF values) that is ideal for the receiver, and thus a positive correlation between lexical richness and text success can be observed; H3: text-producing abilities have evolved so that the level of lexical richness (and PF) that is ideal for the producer is the same as the level that is ideal for the receiver. The most frequent values of the lexical richness metrics turned out to correspond to the values of the texts with the highest average number of views, which is in accordance with theories that expect some degree of self-organization in language.
In contrast, lexically rich texts have a higher relative number of likes on average, and as the values of the lexical richness metrics decrease, the average relative number of likes decreases as well. Text success is almost independent of the phonological features of the texts.
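One way to compare the most frequent lexical richness values with the most successful ones is to group texts into bins of a richness metric and summarize each bin. The sketch below is only an assumed procedure, not the analysis reported here: the bin width of 0.05 and the function names are hypothetical.

```python
import math
from collections import defaultdict


def summarize_by_bin(richness, success, bin_width=0.05):
    """Group texts by richness bin; return {bin_index: (n_texts, mean_success)}.

    `bin_width` is an assumed value; bin k covers [k * bin_width, (k + 1) * bin_width).
    """
    if len(richness) != len(success):
        raise ValueError("richness and success must have the same length")
    grouped = defaultdict(list)
    for r, s in zip(richness, success):
        grouped[math.floor(r / bin_width)].append(s)
    return {b: (len(v), sum(v) / len(v)) for b, v in sorted(grouped.items())}


def most_frequent_bin(summary):
    """Bin containing the largest number of texts."""
    return max(summary, key=lambda b: summary[b][0])


def most_successful_bin(summary):
    """Bin with the highest mean success (e.g. average number of views)."""
    return max(summary, key=lambda b: summary[b][1])
```

If `most_frequent_bin` and `most_successful_bin` return the same bin when views are used as the success metric, the pattern matches the one described for H3; a mean relative number of likes that rises steadily with the bin index would instead match the pattern described for H2.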