Improvements to Dependency Parsing Using Automatic Simplification of Data

Publication at the Faculty of Arts

2014

Abstract

The paper presents a method of improving dependency parsing using automatic simplification of data. Language data are often too complex (and too sparse) for parsers to cope with.

The paper shows that small, reversible simplifications of the text and of the annotation can yield a considerable improvement in parsing accuracy. To facilitate the language modeling task performed by the parser, I reduce the variability of lemmas and word forms in the text.

I modify the system of morphological annotation to make it more suitable for parsing. Finally, the dependency annotation scheme is also partially modified.

All such modifications are automatic and fully reversible: after the parsing is done, the original data and structures are automatically restored. With MaltParser, I achieve an 8.3% error rate reduction.
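
The idea of a reversible simplification can be illustrated with a minimal sketch. It is not the procedure used in the paper: the rule (collapsing numeric tokens into one placeholder), the placeholder string `<NUM>`, and the function names `simplify` and `restore` are all assumptions made for illustration. The point is only that each change is recorded by token position, so the original forms can be put back once parsing is done.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical rule: integers and simple decimals count as "numeric" tokens.
NUM_RE = re.compile(r"\d+(?:[.,]\d+)?")
PLACEHOLDER = "<NUM>"  # assumed placeholder, not taken from the paper


def simplify(tokens: List[str]) -> Tuple[List[str], Dict[int, str]]:
    """Replace numeric tokens with a placeholder and remember the originals."""
    simplified: List[str] = []
    originals: Dict[int, str] = {}
    for i, tok in enumerate(tokens):
        if NUM_RE.fullmatch(tok):
            originals[i] = tok          # record what was replaced, by position
            simplified.append(PLACEHOLDER)
        else:
            simplified.append(tok)
    return simplified, originals


def restore(simplified: List[str], originals: Dict[int, str]) -> List[str]:
    """Undo simplify(): put every recorded original form back in place."""
    return [originals.get(i, tok) for i, tok in enumerate(simplified)]


if __name__ == "__main__":
    sentence = ["In", "2014", "he", "paid", "3,50", "euros"]
    simple, memo = simplify(sentence)
    print(simple)                 # the parser would see this version
    print(restore(simple, memo))  # original forms, restored after parsing
```

Because the simplification never changes the number or order of tokens, a dependency tree produced for the simplified sentence maps onto the original sentence position by position.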

Keywords

dependency parsing, text simplification, parsing accuracy, morphological annotation, sparse data