Self-Organizing Computational Efficiency in Quranic Grammar

Publication

Abstract

Existing knowledge-based and data-driven systems for Arabic morphological analysis all suffer from three main computational drawbacks: efficiency, domain, and abstraction. Although knowledge-based systems employ heavy lexical databases, they generate highly ambiguous tags.

Moreover, covering a new domain requires costly modification of their lexicon. They also do not provide the linguistic abstraction that is preferred, especially in computational linguistics.

Similarly, systems developed following a data-driven approach ignore the linguistic tractability of Arabic morphology and depend heavily on large amounts of domain-specific training data. The source of these drawbacks may be traced to the morphological approach employed in the knowledge base or in the training data.

This chapter introduces regex morpho-syntax for Arabic, a highly efficient formalism originating from the basic grammatical rules developed for diacritizing the Quran fourteen centuries ago. The formalism is implemented in the knowledge base of the Mobin morpho-syntactic parser and tagger.

The system achieves an F-score of 0.967 for computational effectiveness. This result, together with its significant comparative efficiency measured in terms of Kolmogorov complexity, highlights the computational efficiency inherent in Quranic grammar.
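For readers unfamiliar with the two measures named above, the following is a minimal sketch of how such quantities are commonly computed. It is not the chapter's evaluation procedure. The function names f_score and compressed_size are hypothetical, and using zlib-compressed length as a stand-in for Kolmogorov complexity is an assumption: the true quantity is uncomputable, and compressed size only gives a rough, computable proxy for it.

```python
import zlib


def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input.

    Assumption: used here only as a rough proxy for Kolmogorov
    complexity, which cannot be computed exactly.
    """
    return len(zlib.compress(data, 9))


if __name__ == "__main__":
    # Illustrative inputs only; these are not figures from the chapter.
    print(f_score(0.5, 0.5))
    print(compressed_size(b"ab" * 1000))
```

Under this proxy, a more compact description of the same rules compresses to fewer bytes, which is one way a comparative efficiency claim between knowledge bases could be made concrete.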

Keywords

Arabic language; Grammatical complexity; Computational natural language; Processing systems; Artificial Intelligence; Knowledge-based systems; Quranic grammar; Mobin parser and tagger