
Universal Dependencies for Malayalam

Publication at the Faculty of Mathematics and Physics |
2023

Abstract

Treebanks can play a crucial role in developing natural language processing systems, and producing gold-standard treebank data requires adopting a uniform annotation framework. Universal Dependencies (UD) aims to develop cross-linguistically consistent annotations for the world's languages.

This paper presents the essential aspects of the UD-based, syntactically annotated treebank for Malayalam. Sentences extracted from the IndicCorp corpus were manually annotated for morphological features and dependency relations.

Language-specific properties are discussed, shedding light on many areas of Dravidian syntax that need to be examined in depth. The paper also discusses some pertinent issues in UD with respect to the Dravidian languages and provides insights for further improving the existing treebanks.
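
For readers unfamiliar with how such annotations are stored: UD treebanks are distributed in the CoNLL-U format, where each token is a line of ten tab-separated fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The following is a minimal sketch of reading the morphological features and dependency relations from such data. The names `Token`, `parse_feats`, `parse_conllu`, and the toy English sentence in `ROWS` are illustrative assumptions; they are not taken from the paper or from the Malayalam treebank.

```python
from dataclasses import dataclass


@dataclass
class Token:
    id: int        # 1-based position of the word in the sentence
    form: str      # surface form
    upos: str      # universal part-of-speech tag
    feats: dict    # morphological features, e.g. {"Number": "Sing"}
    head: int      # id of the syntactic head (0 marks the root)
    deprel: str    # dependency relation to the head


def parse_feats(raw):
    """Turn 'Case=Nom|Number=Sing' into a dict; '_' means no features."""
    if raw == "_":
        return {}
    return dict(pair.split("=", 1) for pair in raw.split("|"))


def parse_conllu(text):
    """Parse one CoNLL-U sentence into a list of Token objects."""
    tokens = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 fields, got {len(cols)}")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        tokens.append(Token(
            id=int(cols[0]),
            form=cols[1],
            upos=cols[3],
            feats=parse_feats(cols[5]),
            head=int(cols[6]),
            deprel=cols[7],
        ))
    return tokens


# Hypothetical toy sentence (English, not from the Malayalam treebank).
ROWS = [
    ["1", "She", "she", "PRON", "_", "Case=Nom|Number=Sing|Person=3", "2", "nsubj", "_", "_"],
    ["2", "reads", "read", "VERB", "_", "Number=Sing|Person=3|Tense=Pres", "0", "root", "_", "_"],
    ["3", "books", "book", "NOUN", "_", "Number=Plur", "2", "obj", "_", "_"],
]
SAMPLE = "# text = She reads books\n" + "\n".join("\t".join(r) for r in ROWS)

if __name__ == "__main__":
    for tok in parse_conllu(SAMPLE):
        print(tok.id, tok.form, tok.upos, tok.feats, tok.head, tok.deprel)
```

The sketch keeps only the fields relevant to the annotation layers mentioned in the abstract; a production reader would more likely rely on an existing CoNLL-U library than on hand-written parsing.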