This paper attempts to introduce a glottometric approach to the analysis of the ancient Chinese language and its classical texts, something which, to my knowledge, has rarely been undertaken in other publications in this field. This is somewhat surprising given that various methods of quantitative linguistics have long been known and tested on a number of languages.
Lately, they have become especially popular alongside the development of text corpora and tools for electronic text processing, as well as advances in information theory. Sophisticated methods, such as fractal analysis, are available for the glottometry of texts and are extremely useful in present-day research on ancient Chinese texts, which grapples with basic issues of textuality and authorship. Nevertheless, only the basic statistical properties of Classical Chinese texts are presented here, through a glottometric comparison of five books from the classical period: Hánfēizǐ, Mèngzǐ, Xúnzǐ, Zhuāngzǐ, and Shāngjūnshū.
The paper first outlines how Classical Chinese essentially abides by the Zipf-Mandelbrot law, as other languages do, and then introduces the reader to the basics of character/word distribution in a working corpus of ancient texts. It then explores what lexical statistics reveal about the traditionally claimed differences between the books, which represent diverging strains of thought.
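As a rough illustration of what a rank-frequency check of this kind involves, the Zipf-Mandelbrot law is commonly written as f(r) = C / (r + b)^a, where f(r) is the frequency of the item of rank r and C, a, b are fitted parameters. The sketch below is not the procedure used in this paper; the function names, the toy string, and the simplification b = 0 (plain Zipf, fitted by least squares on the log-log values) are all assumptions made for illustration.

```python
from collections import Counter
import math


def rank_frequency(text):
    """Return character frequencies in descending order (rank 1 first).

    Whitespace is ignored; each remaining character counts as one token,
    which is an assumption suited to character-level counts only.
    """
    counts = Counter(ch for ch in text if not ch.isspace())
    return sorted(counts.values(), reverse=True)


def zipf_mandelbrot(rank, c, a, b):
    """Predicted frequency at a given rank: c / (rank + b) ** a."""
    return c / (rank + b) ** a


def fit_zipf_exponent(freqs):
    """Estimate the exponent a under the simplifying assumption b = 0.

    Ordinary least squares on log(frequency) against log(rank);
    the exponent is the negated slope. Needs at least two ranks.
    """
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Toy input only, not drawn from the five books compared in the paper.
toy_text = "之乎者也之乎之"
toy_freqs = rank_frequency(toy_text)
toy_exponent = fit_zipf_exponent(toy_freqs)
```

On a real corpus the full three-parameter form would normally be fitted with a non-linear optimiser, and word-level counts would additionally require a segmentation step that is not shown here.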
The observations, based on measurable criteria, in fact support the more traditional claims, but also allow for a much more fine-grained and objective analysis, at least in some respects. On the other hand, it is clear that quantitative methods have their limitations.
They are not, and cannot be, a substitute for philology; rather, they are an excellent partner for it.