Sharing data through specialized corpus-based tools: the case of GramatiKat

Publication at Faculty of Arts | 2021

Abstract

This paper presents GramatiKat, a specialized corpus tool, in the context of Open Science principles, in particular data sharing, which opens opportunities for original research, makes research easier to verify, and makes it easier to build on previous work. The tool is designed primarily for examining grammatical categories from a quantitative point of view.

It offers grammatical profiles of particular lemmas (currently 14 thousand Czech nouns) and the proportions of individual grammatical categories within a part of speech, i.e. the standard behavior of the word class. The data in GramatiKat are pre-processed, statistically evaluated, and presented in charts and tables for clarity; they are available to other linguists, especially those working in morphology and lexicography.
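
To make the notion of a grammatical profile concrete, the following is a minimal Python sketch, not GramatiKat's actual implementation. It assumes a profile is simply the relative frequency of each value of a grammatical category (here, case) among the corpus occurrences of a lemma. The function name `grammatical_profile`, the token layout, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def grammatical_profile(tokens, category, lemma=None):
    """Relative frequency of each value of `category`.

    If `lemma` is given, only tokens of that lemma are counted (the
    profile of one lemma). If `lemma` is None, all tokens are counted
    (a baseline for the whole word class). Both readings are assumptions
    made for this sketch.
    """
    counts = Counter(
        tok[category]
        for tok in tokens
        if lemma is None or tok["lemma"] == lemma
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: n / total for value, n in counts.items()}


# Hypothetical toy sample of tagged noun tokens (not real corpus data).
tokens = [
    {"lemma": "hrad", "case": "nom", "number": "sg"},
    {"lemma": "hrad", "case": "gen", "number": "sg"},
    {"lemma": "hrad", "case": "nom", "number": "pl"},
    {"lemma": "hrad", "case": "loc", "number": "sg"},
    {"lemma": "žena", "case": "acc", "number": "sg"},
]

lemma_profile = grammatical_profile(tokens, "case", lemma="hrad")
class_profile = grammatical_profile(tokens, "case")
```

Comparing `lemma_profile` with `class_profile` is one simple way to see where a lemma departs from the standard behavior of its word class; the statistical evaluation the tool itself applies is not specified here and is not reproduced by this sketch.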

This article aims to inspire and support corpus and non-corpus linguists to use and enhance existing tools and to create new specialized tools that are available to other users.