Mind the Gap: Data Enrichment in Dependency Parsing of Elliptical Constructions

Publication at Faculty of Mathematics and Physics

2018

Abstract

We report on experiments with several approaches to automatically extending the training data for dependency parsers using large crawled web corpora. One set of methods is general: it draws upon self-training and tri-training and adds a novel algorithm that mimics the structural complexity of the original treebank.

The other set of methods focuses specifically on elliptical constructions. We evaluate on five languages: Czech, English, Finnish, Russian, and Slovak.
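
As an illustration of the agreement filter that tri-training relies on, the following is a minimal sketch, not the procedure used in the paper: automatically parsed sentences are kept as extra training data only when two independently trained parsers produce the same tree. The function name `select_agreed`, the tuple-based parse representation, and the toy sentences are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Assumed representation: one (head_index, relation) pair per token; head 0 is the root.
Parse = Tuple[Tuple[int, str], ...]
Sentence = Tuple[str, ...]


def select_agreed(
    sentences: Sequence[Sentence],
    parses_a: Sequence[Parse],
    parses_b: Sequence[Parse],
) -> List[Tuple[Sentence, Parse]]:
    """Return the sentences whose two automatic parses are identical."""
    if not (len(sentences) == len(parses_a) == len(parses_b)):
        raise ValueError("inputs must be aligned one-to-one")
    return [
        (sent, pa)
        for sent, pa, pb in zip(sentences, parses_a, parses_b)
        if pa == pb
    ]


# Toy, hand-made data (hypothetical, not taken from the paper).
sentences = [
    ("She", "reads", "books"),
    ("He", "likes", "tea", "and", "she", "coffee"),  # elliptical: second verb omitted
]
parses_a = [
    ((2, "nsubj"), (0, "root"), (2, "obj")),
    ((2, "nsubj"), (0, "root"), (2, "obj"), (5, "cc"), (2, "conj"), (5, "orphan")),
]
parses_b = [
    ((2, "nsubj"), (0, "root"), (2, "obj")),
    ((2, "nsubj"), (0, "root"), (2, "obj"), (6, "cc"), (6, "nsubj"), (2, "conj")),
]

# Only the first sentence survives: the two parsers disagree on the elliptical one.
extra_training_data = select_agreed(sentences, parses_a, parses_b)
print(len(extra_training_data))
```

In this toy case the elliptical sentence is filtered out because the two parsers disagree on it, which hints at why a purely agreement-based filter may under-represent ellipsis and why methods targeted at such constructions can be useful.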

Keywords

data enrichment, dependency parsing, elliptical constructions