A Syntactico-semantic Description of Selected Groups of Verbs Expressing Emotions and Feelings - Methods of Contrastive Study of Valency Based on a Parallel Corpus

Publication at Faculty of Arts | 2016

Abstract

Our goal is to identify factors underlying the choice of equivalents of psych verbs when translating from Czech to Polish, using context and the syntactico-semantic properties of the source lexeme's syntactic arguments. We start with the relation of meaning to context and valency and an overview of options for studying and distinguishing meaning using a parallel corpus.

Then we proceed with a manual valency-based analysis of parallel concordances of a Czech verb that, from the Polish perspective, appears highly polysemous (toužit 'to yearn, to desire'). The results, complemented and verified by a bilingual glossary automatically extracted from the parallel corpus, show that valency is not the only predictor of an appropriate target equivalent for a lexeme.

In the second part of the study, we examine options for formalizing the choice of an equivalent using the source context of the lexeme. First, we focus on collocation profiles and syntactic analysis as methods for aggregating data about the object argument of the source lexeme and evaluate their reliability as cues for predicting the target equivalent.
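As a rough illustration of how an object-based collocation profile could serve as a cue, the sketch below counts which target equivalent co-occurs with each object lemma and measures how often the majority equivalent would be correct. It is only a minimal sketch under stated assumptions: the observations are invented toy data, the Polish equivalents shown are chosen for illustration, and the function names (`build_profile`, `predict`, `cue_reliability`) are hypothetical and not taken from the study.

```python
from collections import Counter, defaultdict

# Hypothetical aligned observations: (object lemma of the Czech verb, Polish equivalent).
# Invented toy data for illustration only; not drawn from the study or from InterCorp.
observations = [
    ("domov", "tęsknić"),
    ("domov", "tęsknić"),
    ("rodina", "tęsknić"),
    ("moc", "pragnąć"),
    ("moc", "pragnąć"),
    ("pomsta", "pragnąć"),
    ("domov", "pragnąć"),
]


def build_profile(pairs):
    """Aggregate, for each object lemma, the counts of observed target equivalents."""
    profile = defaultdict(Counter)
    for object_lemma, equivalent in pairs:
        profile[object_lemma][equivalent] += 1
    return profile


def predict(profile, object_lemma):
    """Return the most frequent equivalent for an object lemma, or None if unseen."""
    counts = profile.get(object_lemma)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def cue_reliability(profile):
    """Share of observations for which the majority equivalent of the object is correct."""
    total = sum(sum(counts.values()) for counts in profile.values())
    hits = sum(max(counts.values()) for counts in profile.values())
    return hits / total if total else 0.0


profile = build_profile(observations)
print(predict(profile, "domov"))
print(round(cue_reliability(profile), 3))
```

The reliability score here is measured on the same data used to build the profile, so it is an optimistic upper bound; a realistic evaluation would hold out part of the concordances.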

Finally, we turn to machine learning methods, using a stochastic classifier to determine equivalents in both linear and syntactically structured contexts. None of the above methods confirmed the hypothesis that valency is the main predictor for the choice of the target equivalent in general.

Nor is it clear that methods based on syntactically structured contexts outperform those based on linear contexts. The study nevertheless yielded several intermediate conclusions: of all syntactic dependents, the object argument is the best predictor, and valency can be the primary factor in specific cases (infinitival complements). In addition, the methods for extracting the bilingual glossary originally used for this research have been successfully applied to build a lexical database for many language pairs in the parallel corpus InterCorp.