
ONCO: Compiling an Old Norse Corpus

Publication at Filozofická fakulta (Faculty of Arts) | 2023

Abstract

The field of Old Norse/Icelandic studies would greatly benefit from the existence of a comprehensive corpus of Old Norse/Icelandic, akin to the historical corpora of other languages, such as those currently at the disposal of scholars of Old and Middle English. Such a corpus, containing a wide variety of extant texts, would naturally facilitate broader generalizations and comparative studies, and would allow frequencies of occurrence to be observed. All of these could advance our understanding of the language and perhaps shed additional light on textual transmission and language contact.

A comprehensive corpus of Old Norse/Icelandic should encompass normalized as well as non-normalized texts of different dialectal provenances, not restricted to a particular genre. Morpho-syntactic tagging, apart from enhancing searches across the data set, would also allow for the future incorporation of other tools, such as paradigms generated from the corpus data and other visualisation utilities. In light of these demands, developing a tagger suitable for the task is both a key concern and the main challenge. Focusing on tagger development, this presentation discusses the process of creating a comprehensive corpus of Old Norse/Icelandic.