Maintaining consistency of monolingual verb entries with interannotator agreement

Publication at Faculty of Mathematics and Physics | 2011

Abstract

There is no objectively correct way to create a monolingual entry for a polysemous verb. By structuring a verb into readings, we impose our conception on lexicon users, no matter how large a corpus we use in support.

How do we make sure that our structuring is intelligible to others? We are running an experiment to validate the fully corpus-based Pattern Dictionary of English Verbs (Hanks & Pustejovsky, 2005), created according to the lexical theory of Corpus Pattern Analysis (CPA). The lexicon is interlinked with a large corpus, in which several hundred randomly selected concordances of each processed verb are manually annotated with the numbers of their corresponding lexicon readings (“patterns”).

It would be interesting to confirm (or falsify) the leading assumption of CPA that, given that the patterns are based on a large corpus, individual introspection has been minimized and most people can agree on this particular semantic structuring. We have encoded the guidelines for assigning concordances t