Case homonymy in Czech: corpus data and sentence production

Publication at Faculty of Arts

2017

Abstract

Do speakers keep track of frequency asymmetries between two homonymous word-level constructions? Alternatively, do they rely on the type frequencies of the individual cases, or is the ambiguity resolved by context, so that the case forms are represented only as parts of higher-level constructions? A sentence production task was designed to address these questions. The experiment tested the production of instrumental singular and genitive plural word forms from "soft" feminine paradigms ending in -í, e.g. lahv-í 'bottle-ins.sg/gen.pl' (cf. Cvrček et al. 2010). Both the instrumental and the genitive are distinct from direct case forms, and both occur with or without a preposition.

All feminine word forms ending in -í were sampled from a corpus of written Czech (Křen et al. 2015), and genitive-biased lemmas (more than 60 % of tokens in the genitive) and instrumental-biased lemmas (less than 40 % of tokens in the genitive) were selected. Nine test items were chosen from each group. The items denoted objects, processes, and sensations, and they were matched for lemma and word form frequency. Filler items consisted of present tense forms or adjectives ending in -í.
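The bias criterion used for item selection can be illustrated with a minimal sketch. It assumes that each lemma's -í tokens have already been counted as genitive or instrumental, and that the genitive share is computed over the sum of the two. The function names and the token counts are hypothetical and serve only as illustration; they are not corpus data.

```python
from typing import Optional

# Thresholds as stated in the abstract.
GEN_BIAS_MIN = 0.60  # genitive-biased: more than 60 % of tokens in the genitive
INS_BIAS_MAX = 0.40  # instrumental-biased: less than 40 % of tokens in the genitive


def genitive_share(gen_tokens: int, ins_tokens: int) -> float:
    """Proportion of genitive tokens among all -í tokens of a lemma (assumed definition)."""
    total = gen_tokens + ins_tokens
    if total == 0:
        raise ValueError("lemma has no tokens")
    return gen_tokens / total


def bias_group(gen_tokens: int, ins_tokens: int) -> Optional[str]:
    """Return the bias group of a lemma, or None if it falls between the thresholds."""
    share = genitive_share(gen_tokens, ins_tokens)
    if share > GEN_BIAS_MIN:
        return "genitive-biased"
    if share < INS_BIAS_MAX:
        return "instrumental-biased"
    return None


# Made-up counts (genitive tokens, instrumental tokens) for placeholder lemmas.
counts = {"lemma_a": (70, 30), "lemma_b": (25, 75), "lemma_c": (50, 50)}
groups = {lemma: bias_group(gen, ins) for lemma, (gen, ins) in counts.items()}
```

Lemmas with a genitive share between the two thresholds, such as the placeholder `lemma_c`, fall into neither group and would not be candidates for test items under this reading.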

In the web-based experiment, participants (n = 46) were instructed to use the target word form in a sentence. Responses were coded for interpretation (genitive or instrumental) and the percentages of genitive uses were calculated for each item.
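The per-item percentages could be computed along the following lines. This is a sketch under stated assumptions: responses are stored as (item, code) pairs, the codes "gen" and "ins" are hypothetical labels, and responses that could not be coded as either case are excluded from the denominator. The abstract does not say how such responses were handled, and the sample data are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def genitive_percentages(responses: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Percentage of genitive uses per item, over responses coded 'gen' or 'ins'."""
    totals: Dict[str, int] = defaultdict(int)
    genitives: Dict[str, int] = defaultdict(int)
    for item, code in responses:
        if code not in ("gen", "ins"):
            continue  # assumption: uncodable responses are left out
        totals[item] += 1
        if code == "gen":
            genitives[item] += 1
    return {item: 100.0 * genitives[item] / totals[item] for item in totals}


# Invented responses for two placeholder items.
responses = [
    ("item_1", "gen"), ("item_1", "ins"), ("item_1", "gen"), ("item_1", "gen"),
    ("item_2", "ins"), ("item_2", "other"),
]
percentages = genitive_percentages(responses)
```

The resulting per-item percentages are the quantities that can then be set against the genitive proportions obtained from the corpus.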

All the items appeared in both functions, suggesting that both interpretations are possible, acceptable, and accessible. Contrasting the relative frequencies of genitives extracted from the corpus with those obtained in the elicitation task reveals a clear pattern: the elicited proportions mirror the corpus frequencies almost perfectly.

Keywords

case homonymy; Czech; noun morphology; corpus linguistics; sentence production