CORPUS LITERACY AS AN UNDERRESEARCHED PROBLEM: THE CONTENT AND SPECIFICITY OF THE CONCEPT

Publication at Faculty of Arts | 2022

Abstract

Background: The rapid development of information technologies and corpus linguistics in the twenty-first century has made language corpora accessible to a wide range of users. Corpora are constantly growing and improving, and many of them are now freely available on the Internet. Nevertheless, courses on corpus linguistics and on working with corpora have still not been universally integrated into the core university curricula of philological disciplines. Under these circumstances, it is relevant to discuss the question of so-called "corpus literacy", which reflects the correspondence (or non-correspondence) between users' competencies and the requirements that must be met when using corpora in research, teaching, translation, or any other work concerned with language data. In the present paper, corpus literacy is considered an essential condition for the quality and effectiveness of the use of language corpora.

Purpose: The article aims a) to summarize and critically analyse the literature on corpus literacy, mainly in English, Russian, and Czech; b) to consider the content of the concept of "corpus literacy", its specifics and boundaries; c) to establish the factors influencing its content; d) to derive a definition of this concept based on the generalizations made. A secondary purpose of the article is to raise the issue of corpus literacy among linguists in fields other than corpus linguistics.

Results: The following conclusions can be drawn from the present theoretical study:

1) Corpus literacy is an underresearched and underdiscussed topic as far as the Russian and Czech contexts are concerned. The issue has been actively discussed in the English-language literature since the early 2000s (apparently for the first time in Mukherjee's works), but mainly in the context of problems related to integrating corpora into the practice of teaching English as a foreign language in different countries;

2) Corpus literacy does not arise by itself; it should be purposefully developed;

3) Corpus literacy should be developed by both future and current language specialists;

4) The issue of corpus literacy should be addressed at universities by integrating corresponding courses into philological core curricula, organizing courses for in-service specialists, compiling corresponding study guides, etc.;

5) Corpus literacy is a generalized concept that has many specific implementations, the scope and content of which are influenced by various interrelated factors, in particular who uses the corpora, for what purpose, and which corpus resources are used;

6) Corpus literacy can be defined as a set of general and specific knowledge, skills, and abilities developed by certain corpus users within a certain educational process in order to effectively use a certain corpus (or corpora) in a certain occupation.

Discussion: Corpus literacy is a vast area for research, and theoretical investigation in this area is intended to lay the foundations for the preparation of textbooks on the use of language corpora by non-specialists (i.e., by non-corpus linguists). The following issues are to be considered in future studies: the sources and ways of developing corpus literacy; the concept of the efficiency of corpus use (what it is and how to evaluate or measure it); the content and boundaries of the general and specific components of corpus literacy; and the state of corpus literacy among linguists in fields other than corpus linguistics (in the Czech context).