
Using a Database of Multiword Expressions in Dependency Parsing

Publication at Faculty of Arts


Identifying and correctly handling multiword expressions (MWEs) is critical both for understanding a language system and for building NLP tools that function properly. This paper presents a database of MWEs that we are building for the Czech language; it currently contains more than 7,000 entries.

The database records detailed information about the properties of MWEs, such as their idiomaticity and variability. It also contains manually verified dependency structures of MWEs.

We show one of the possible uses of the database: identification and correction of parsing errors in sentences containing MWEs.
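
To make the idea concrete, the following is a minimal sketch of how a verified dependency structure could be used to check and repair a parse. It is not the method described in the paper. The entry layout (`MweEntry`, `lemmas`, `heads`), the token representation, and the dependency structure assigned to the idiom "mít máslo na hlavě" are all assumptions made for illustration, and the matching is deliberately simplified to contiguous occurrences in a fixed order, which ignores the variability the database records.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MweEntry:
    """Hypothetical database entry; the field names are assumptions."""
    lemmas: List[str]  # lemmas of the MWE components, in surface order
    heads: List[int]   # head of each component as an index into `lemmas`;
                       # -1 marks the root of the MWE itself


@dataclass
class Token:
    lemma: str
    head: int          # index of the head token in the sentence; -1 for the root


def find_occurrences(sentence: List[Token], entry: MweEntry) -> List[int]:
    """Return start indices where the MWE lemmas occur contiguously."""
    n = len(entry.lemmas)
    return [i for i in range(len(sentence) - n + 1)
            if [t.lemma for t in sentence[i:i + n]] == entry.lemmas]


def correct_mwe_heads(
    sentence: List[Token], entry: MweEntry
) -> Tuple[List[Token], List[Tuple[int, int, int]]]:
    """Return a corrected copy and a list of (token index, old head, new head)."""
    corrected = [Token(t.lemma, t.head) for t in sentence]
    changes = []
    for start in find_occurrences(sentence, entry):
        for offset, rel_head in enumerate(entry.heads):
            if rel_head == -1:
                continue  # the MWE root attaches outside the expression
            idx = start + offset
            expected = start + rel_head
            if corrected[idx].head != expected:
                changes.append((idx, corrected[idx].head, expected))
                corrected[idx].head = expected
    return corrected, changes


# Illustrative entry: "mít máslo na hlavě"; the head indices are assumed.
entry = MweEntry(lemmas=["mít", "máslo", "na", "hlava"], heads=[-1, 0, 0, 2])

# Lemmatized parse with one error: "na" is attached to "máslo" (index 2).
sentence = [
    Token("Petr", 1),
    Token("mít", -1),
    Token("máslo", 1),
    Token("na", 2),
    Token("hlava", 3),
]

corrected, changes = correct_mwe_heads(sentence, entry)
print(changes)  # [(3, 2, 1)]
```

In this sketch the root of the MWE is left untouched, since its attachment depends on the rest of the sentence rather than on the expression itself. Only heads internal to the expression are compared against the stored structure and rewritten when they differ.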