Background

Several decades of research in linguistics have focussed on building a better understanding of how language and gender interact. Whilst much of this work has centred on usage, for example differences in the types of words used by different gender groups (Schwartz et al., 2013) or the types of words used to describe different gender groups (Kuznetsova, 2015), there is growing interest in how gender is represented psycholinguistically. Here the emphasis is on the association between a word's semantics and the dimension of gender, i.e. the extent to which a word is associated with femininity, neutrality or masculinity (hereafter AG).
This work has started to uncover interesting patterns in how we represent gender associations within the words we learn, use and process in everyday language. For instance, Scott et al. (2019) collected AG ratings for a large set of English words and explored how they correlate with other psycholinguistic variables, such as emotional valence.
Although the collection of AG ratings is providing fruitful insights into how we represent gender semantics, there is still a lack of research on languages other than English. This gap is theoretically important, as AG is likely to be influenced by multiple factors, not only cultural but also linguistic.
This paper aims to explore AG in Czech and how it is modulated by structural properties of the language, namely grammatical gender (GG).

Methods

We explore the SocioLex dataset (Preininger et al., submitted), an ongoing project that aims to quantify the socio-semantics of Czech words. Participants were asked to rate words according to how they associate their meaning with femininity, neutrality or masculinity on a 7-point Likert scale.
We analysed data from 1,161 participants (848 female, 308 male, 5 non-binary; mean age = 21.8 years). Ratings for 2,700 words were collected (1,603 nouns, 766 adjectives, 331 verbs; N participant ratings per word = 42).
Whenever possible, both masculine and feminine variants of a word were included (e.g. pekař/pekařka [masc/fem baker]).

Results

We coded words into 7 different groups, based on part of speech, grammatical gender and animacy (see Table 1).
The distribution of AG ratings for the different groups is shown in Fig. 1. Linear regression models confirm that GG significantly predicts whether an animate noun is associated with femininity or masculinity (p < .001, R² = .90). The same effect is also observed for adjectives (p < .001, R² = .71) and, crucially, for inanimate nouns (p < .001, R² = .13).
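The kind of regression reported above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name `fit_gg_regression`, the 0/1 coding of GG, the assumed scale direction (1 = masculine, 7 = feminine) and the toy ratings are all assumptions introduced for illustration. The sketch regresses a mean AG rating per word on a grammatical gender dummy and reports R²; it does not compute p-values, which would require a statistics package.

```python
import numpy as np

def fit_gg_regression(ag, is_feminine):
    """Ordinary least squares of AG ratings on a 0/1 grammatical-gender dummy.

    Returns (intercept, slope, r_squared). With a single dummy predictor,
    the intercept is the masculine-group mean and the slope is the
    difference between the feminine and masculine group means.
    """
    ag = np.asarray(ag, dtype=float)
    x = np.asarray(is_feminine, dtype=float)
    design = np.column_stack([np.ones_like(x), x])  # intercept + dummy
    coef, *_ = np.linalg.lstsq(design, ag, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((ag - fitted) ** 2)     # unexplained variation
    ss_tot = np.sum((ag - ag.mean()) ** 2)  # total variation
    return coef[0], coef[1], 1.0 - ss_res / ss_tot

# Made-up toy ratings, NOT data from the study.
# Assumed scale direction: 1 = masculine, 4 = neutral, 7 = feminine.
toy_ag = [2.0, 2.5, 1.5, 6.0, 5.5, 6.5]
toy_is_feminine = [0, 0, 0, 1, 1, 1]

intercept, slope, r_squared = fit_gg_regression(toy_ag, toy_is_feminine)
print(f"intercept={intercept:.2f}, slope={slope:.2f}, R2={r_squared:.2f}")
```

Under these assumptions, a high R² means that grammatical gender alone accounts for most of the variation in ratings within a word group, while a low but non-zero R² indicates a weaker but still systematic shift.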
We also tested whether the strength of AG was larger for grammatically feminine or masculine forms by modelling the absolute AG values, i.e. values that are non-directional with respect to femininity/masculinity. This revealed significantly higher scores for feminine forms (p < .001).

Discussion

Our results suggest that the way people associate words with gender is systematically modulated by GG and related language-specific phenomena.
There appears to be a stronger representation of femininity for grammatically feminine forms than of masculinity for grammatically masculine forms, which could be explained by markedness: in Czech, masculine variants are often used even when referring to females (generic masculine, for example Hledáme pekaře. [Baker needed.]). In future work, we plan to explore whether it is possible, despite the differences in language structure, to capture the similarities and differences in the gender with which speakers of different languages associate words with the same meanings.
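The non-directional strength measure used in the Results can be illustrated with a short sketch. How the absolute AG values were actually derived is not stated in this abstract, so the sketch assumes that they are distances from the neutral midpoint of the 7-point scale (4); the name `ag_strength`, the midpoint and the toy ratings are hypothetical.

```python
import numpy as np

SCALE_MIDPOINT = 4.0  # assumption: neutral point of a 1-7 Likert scale

def ag_strength(mean_ratings, midpoint=SCALE_MIDPOINT):
    """Non-directional AG strength: absolute distance from the neutral midpoint."""
    return np.abs(np.asarray(mean_ratings, dtype=float) - midpoint)

# Made-up toy ratings, NOT data from the study.
# Assumed scale direction: 1 = masculine, 7 = feminine.
toy_masculine_forms = [3.0, 2.5, 3.5]
toy_feminine_forms = [6.0, 5.5, 6.5]

masc_strength = ag_strength(toy_masculine_forms).mean()
fem_strength = ag_strength(toy_feminine_forms).mean()
print(f"masculine forms: {masc_strength:.2f}, feminine forms: {fem_strength:.2f}")
```

In this toy example the feminine forms sit further from the midpoint than the masculine forms, which is the pattern the markedness account would predict.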