Rank-frequency Relation & Type-token Relation: Two Sides of the Same Coin

Publication at Filozofická fakulta (Faculty of Arts), 2013

Abstract

This paper shows that the type-token relation, the hapax-token relation and, more generally, the relation between the number of types of a given frequency and the number of tokens can be computed from the rank-frequency relation or from any type frequency distribution, and that the type-token relation can be computed from the hapax-token relation. No approximation or assumption is needed: the formulae can be derived purely algebraically.
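The abstract does not reproduce the paper's formulae, so the following is only an illustrative sketch of how such quantities can be obtained from a rank-frequency list. It uses the standard hypergeometric expectation for a random sample of n tokens drawn without replacement, which is a sampling assumption of this sketch and not necessarily the paper's algebraic derivation. The function names (`summarize`, `expected_with_frequency`, `expected_types`) are hypothetical.

```python
from collections import Counter
from math import comb


def summarize(freqs):
    """Return (tokens, types, spectrum) for a list of type frequencies.

    freqs is the rank-frequency list: freqs[r] is the frequency of the
    type at rank r + 1. spectrum[m] is the number of types occurring
    exactly m times, so spectrum[1] is the number of hapax legomena.
    """
    return sum(freqs), len(freqs), Counter(freqs)


def expected_with_frequency(freqs, n, m):
    """Expected number of types occurring exactly m times in a random
    sample of n tokens drawn without replacement (assumed model)."""
    total = sum(freqs)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the number of tokens")
    if m < 0 or m > n:
        return 0.0
    ways = sum(comb(f, m) * comb(total - f, n - m) for f in freqs)
    return ways / comb(total, n)


def expected_types(freqs, n):
    """Expected number of distinct types in a random sample of n tokens:
    all types minus those expected to be absent (frequency 0)."""
    return len(freqs) - expected_with_frequency(freqs, n, 0)


if __name__ == "__main__":
    freqs = [3, 2, 1, 1]  # toy rank-frequency list, made up for illustration
    tokens, types, spectrum = summarize(freqs)
    print(tokens, types, spectrum[1])
    for n in range(tokens + 1):
        print(n, expected_types(freqs, n), expected_with_frequency(freqs, n, 1))
```

Calling `expected_types` for n from 0 up to the total number of tokens traces a type-token curve, and `expected_with_frequency(freqs, n, 1)` traces the corresponding hapax-token curve, both from the rank-frequency list alone.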

The second part of the paper observes that, for very large corpora, the ratio between the number of hapax legomena and the number of types converges to a constant Z, Z > 0. Under this assumption, an approximation is built that allows the type-token relation and the other aforementioned relations to be predicted from the single parameter Z.

This approximation is valid only for very large corpora. As the last chapter shows, the assumption implies that as the number of tokens increases without bound, so does the number of types.
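The abstract does not give the approximation itself, so the sketch below is not the paper's formula. It only illustrates one way a constant hapax-to-type ratio Z can lead to unbounded vocabulary growth. It assumes random sampling, under which the expected vocabulary growth per added token equals the expected number of hapaxes divided by the number of tokens. With hapaxes ≈ Z · types this gives dV/dN ≈ Z · V / N, whose solution is a power law V(N) = V0 · (N / N0)^Z. The function `predict_types` and the reference values are hypothetical.

```python
def predict_types(n_tokens, ref_tokens, ref_types, z):
    """Extrapolate the number of types to n_tokens from a reference point
    (ref_tokens, ref_types), assuming V(N) = ref_types * (N / ref_tokens) ** z.

    This power law follows from the assumed relation dV/dN = z * V / N and
    is an illustration only, not the approximation derived in the paper.
    """
    if not 0 < z <= 1:
        raise ValueError("z is a ratio of hapaxes to types, so 0 < z <= 1")
    if n_tokens <= 0 or ref_tokens <= 0:
        raise ValueError("token counts must be positive")
    return ref_types * (n_tokens / ref_tokens) ** z


if __name__ == "__main__":
    # Made-up reference corpus: 1,000,000 tokens, 50,000 types, Z = 0.5.
    for n in (1_000_000, 4_000_000, 16_000_000):
        print(n, predict_types(n, 1_000_000, 50_000, 0.5))
```

Because Z > 0, the predicted number of types grows without bound as the number of tokens grows, which is consistent with the implication stated in the abstract.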