Using a Database of Multiword Expressions in Dependency Parsing

Publication at Faculty of Arts

2019

Abstract

Identifying and correctly handling multiword expressions is critical for understanding a language system and for properly functioning NLP tools. This paper presents a database of multiword expressions (MWEs) that we are building for the Czech language; it currently contains more than 7,000 entries.

It contains detailed information about the properties of MWEs, e.g. about their idiomaticity and variability. The database also contains manually verified dependency structures of MWEs.

We show one of the possible uses of the database: identification and correction of parsing errors in sentences containing MWEs.
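As a rough illustration of that use, the sketch below compares the heads a parser assigned inside an MWE with a verified structure and overwrites any that disagree. It is a minimal sketch, not the method of the paper: the `MweEntry` layout, the function names, the contiguous lemma matching, and the toy structure given for the idiom "mít máslo na hlavě" are all assumptions made for the example and are not taken from the database.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MweEntry:
    """Hypothetical entry: MWE lemmas plus a verified internal structure.

    heads[i] is the index, within the MWE, of the head of token i,
    or None for a token whose head lies outside the MWE.
    """
    lemmas: Tuple[str, ...]
    heads: Tuple[Optional[int], ...]


def find_mwe(sentence_lemmas: List[str], entry: MweEntry) -> Optional[int]:
    """Return the start offset of a contiguous lemma match, or None."""
    n = len(entry.lemmas)
    for start in range(len(sentence_lemmas) - n + 1):
        if tuple(sentence_lemmas[start:start + n]) == entry.lemmas:
            return start
    return None


def check_and_correct(sentence_lemmas: List[str],
                      parsed_heads: List[int],
                      entry: MweEntry) -> Tuple[List[int], List[int]]:
    """Compare parser heads inside the MWE with the verified structure.

    parsed_heads[i] is the sentence-level index of the head of token i
    (-1 for the sentence root). Returns the corrected head list and the
    positions of tokens whose head had to be changed.
    """
    corrected = list(parsed_heads)
    mismatches: List[int] = []
    start = find_mwe(sentence_lemmas, entry)
    if start is None:
        return corrected, mismatches
    for i, rel_head in enumerate(entry.heads):
        if rel_head is None:
            continue  # external head: keep the parser's decision
        expected = start + rel_head
        if corrected[start + i] != expected:
            mismatches.append(start + i)
            corrected[start + i] = expected
    return corrected, mismatches


# Toy entry (assumed structure): "máslo" and "hlava" depend on "mít",
# "na" depends on "hlava".
entry = MweEntry(lemmas=("mít", "máslo", "na", "hlava"),
                 heads=(None, 0, 3, 0))

lemmas = ["Petr", "mít", "máslo", "na", "hlava"]
parser_heads = [1, -1, 1, 4, 2]  # "hlava" wrongly attached to "máslo"

corrected_heads, changed = check_and_correct(lemmas, parser_heads, entry)
print(corrected_heads, changed)
```

The sketch matches only contiguous occurrences in a fixed word order; handling the variability of MWEs that the database records would require a more flexible matcher.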

Keywords

multiword expressions, dependency parsing, database of multiword expressions, Czech