
PyVallex: A Processing System for Valency Lexicon Data

Publication at Faculty of Mathematics and Physics | 2020

Abstract

We present PyVallex, a Python-based system for presenting, searching and filtering, editing and extending, and automatically processing machine-readable lexicon data originally available in a text-based format. The system consists of several components: a parser for the specific lexicon format used in several valency lexicons, a data-validation framework, a regular-expression-based search engine, a map-reduce-style framework for querying the lexicon data, and a web-based interface integrating complex search and some basic editing capabilities.
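To make the search and query components more concrete, the following is a minimal sketch of how a regular-expression search and a map-reduce style query over lexicon entries might look. It does not use PyVallex's actual API: the `LEXICON` data, its frame notation, and the functions `search`, `map_entry`, `reduce_counts` and `functor_counts` are all hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical toy data (not the real lexicon format): each entry has a
# lemma and a list of valency frames written as "FUNCTOR(form)" tokens.
LEXICON = [
    {"lemma": "dát", "frames": ["ACT(1) PAT(4) ADDR(3)", "ACT(1) PAT(4)"]},
    {"lemma": "brát", "frames": ["ACT(1) PAT(4)"]},
    {"lemma": "jít", "frames": ["ACT(1) DIR3()"]},
]


def search(lexicon, pattern):
    """Return the entries with at least one frame matching the regex."""
    rx = re.compile(pattern)
    return [e for e in lexicon if any(rx.search(f) for f in e["frames"])]


def map_entry(entry):
    """Map step: emit a (functor, 1) pair for each functor in each frame."""
    for frame in entry["frames"]:
        # A functor is taken to be the label directly before "(".
        for functor in re.findall(r"[A-Z]+[0-9]*(?=\()", frame):
            yield functor, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def functor_counts(lexicon):
    """Run the map step over all entries, then reduce the results."""
    pairs = (pair for entry in lexicon for pair in map_entry(entry))
    return reduce_counts(pairs)


if __name__ == "__main__":
    print([e["lemma"] for e in search(LEXICON, r"ADDR\(3\)")])
    print(functor_counts(LEXICON))
```

Separating the per-entry map step from the aggregating reduce step keeps each query small and lets the same traversal code serve many different statistics over the lexicon.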

PyVallex provides most of the typical functionality of a Dictionary Writing System (DWS), such as multiple presentation modes for the underlying lexical database, automatic evaluation of consistency tests, and a mechanism for merging updates from multiple sources. Editing is currently limited to the client-side interface and to existing lexical entries, but additional script-based operations on the database are also possible.

The code is pub