Phonotactic probability refers to the frequency with which phonological segments and sequences of phonological segments occur in the words of a given language (Vitevitch & Luce, 2004). Phonotactic probability has been shown to play an important role in language processing and language acquisition (Jusczyk, Luce & Charles-Luce, 1994; Mattys & Jusczyk, 2001; Pitt & McQueen, 1998). For example, native speakers recognize words with high phonotactic probability faster in lexical decision tasks (Luce & Large, 2001), and adults judge pseudowords with high phonotactic probability as more word-like (Vitevitch, Luce, Charles-Luce & Kemmerer, 1997). These effects, however, have been tested mainly on English. In this paper, we present two word-likeness rating tasks conducted on Czech.
In Experiment 1, 88 native speakers of Czech listened to recordings of 40 pseudowords with varying phonotactic probability, presented in random order. They were asked to rate the word-likeness of each pseudoword on a seven-point Likert scale. A mixed-effects model revealed that phonotactic probability is a significant predictor of word-likeness ratings (χ²(1) = 16.37, p = 0.000052), although the data show considerable variance. We found no effect of neighborhood density. In the ongoing Experiment 2, participants rate the same pseudowords presented visually. This should show whether the effect of phonotactic probability persists with written stimuli.
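As a quick arithmetic check, the p-value that corresponds to a chi-square statistic of 16.37 with one degree of freedom can be recomputed with the Python standard library. This is only a sketch: it assumes the reported p is the upper-tail probability of a chi-square distribution with 1 degree of freedom, and the function name `chi2_sf_df1` is illustrative, not taken from the study.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For df = 1 the statistic is a squared standard normal variable, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

# Statistic reported for the effect of phonotactic probability
p_value = chi2_sf_df1(16.37)
print(f"p = {p_value:.6f}")
```

Under this assumption, the recomputed value agrees with the reported p = 0.000052 to the reported precision.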
Experiment 1 confirms that phonotactic probability influences the processing of pseudowords in Czech to a similar extent as in English. This finding is important: phonotactic probability serves as a factor in many psycholinguistic experiments on English, and it can now be used in a similar way for Czech. The variance in the data might be caused, among other factors, by morphology (some of the pseudowords ended in possible morphemes, others did not). This effect might be further accentuated by the fact that the standard method of calculating phonotactic probability (Vitevitch & Luce, 2004) puts more weight on the beginnings of words than on their endings. However, when the mean of the standard and reversed calculations is used, the variance diminishes.
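The difference between the standard and the reversed calculation can be illustrated with a minimal sketch. It is not the implementation used in the study: it assumes a simplified measure (positional segment probabilities averaged across positions, counted over word types without frequency weighting), and the toy lexicon and all function names are hypothetical. The standard variant aligns words at their beginnings, the reversed variant aligns them at their endings, and the combined score is the mean of the two.

```python
from collections import Counter

# Hypothetical toy lexicon; each word is a tuple of segments (not real Czech data).
LEXICON = [
    ("p", "e", "s"),
    ("l", "e", "s"),
    ("p", "a", "s"),
    ("p", "o", "t"),
    ("k", "o", "t", "e"),
]

def positional_probs(lexicon):
    """Return {(position, segment): probability}, counting each word type once."""
    seg_counts = Counter()
    pos_totals = Counter()
    for word in lexicon:
        for i, seg in enumerate(word):
            seg_counts[(i, seg)] += 1
            pos_totals[i] += 1
    return {key: n / pos_totals[key[0]] for key, n in seg_counts.items()}

def phonotactic_probability(item, lexicon, reverse=False):
    """Mean positional segment probability of `item`.

    With reverse=True, both the lexicon and the item are reversed, so that
    positions are counted from the word end instead of the word beginning.
    """
    if reverse:
        lexicon = [word[::-1] for word in lexicon]
        item = item[::-1]
    probs = positional_probs(lexicon)
    # Segments never attested in a position contribute a probability of 0.
    return sum(probs.get((i, seg), 0.0) for i, seg in enumerate(item)) / len(item)

def combined_probability(item, lexicon):
    """Mean of the standard (left-aligned) and reversed (right-aligned) scores."""
    standard = phonotactic_probability(item, lexicon)
    reversed_score = phonotactic_probability(item, lexicon, reverse=True)
    return (standard + reversed_score) / 2

pseudoword = ("p", "e", "t")
print(phonotactic_probability(pseudoword, LEXICON))
print(phonotactic_probability(pseudoword, LEXICON, reverse=True))
print(combined_probability(pseudoword, LEXICON))
```

In this toy lexicon the two alignments disagree about the final segment: the four-segment word shifts which segments count as word-final under right alignment, so the pseudoword receives a lower reversed score than standard score. Averaging the two gives word endings the same weight as word beginnings.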