
Information-theoretic locality properties of natural language


Abstract

I present theoretical arguments and new empirical evidence for an information-theoretic principle of word order: information locality, the idea that words that strongly predict each other should be close to each other in linear order. I show that information locality can be derived under the assumption that natural language is a code that enables efficient communication while minimizing information-processing costs involved in online language comprehension, using recent psycholinguistic theories to characterize those processing costs information-theoretically. I argue that information locality subsumes and extends the previously proposed principle of dependency length minimization (DLM), which has shown great explanatory power for predicting word order in many languages. Finally, I show corpus evidence that information locality has improved explanatory power over DLM in two domains: in predicting which dependencies will have shorter and longer lengths across 50 languages, and in predicting the preferred order of adjectives in English.
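
As a rough illustration of the two quantities the abstract contrasts (this is not the paper's code; the toy corpus, the plug-in estimator, and the parse format are hypothetical simplifications), the sketch below estimates mutual information between words at increasing linear distances — information locality predicts this quantity decays with distance — and computes the total dependency length of a toy parse, the quantity DLM minimizes.

from collections import Counter
from math import log2

# Hypothetical toy corpus; a real study would use large treebanks.
corpus = [
    "the old dog barked at the mail carrier".split(),
    "the big old house stood on the quiet hill".split(),
    "a small dog slept in the old house".split(),
]

def mutual_information_at_distance(sentences, d):
    """Plug-in estimate of mutual information between words d positions apart."""
    pair_counts, left_counts, right_counts = Counter(), Counter(), Counter()
    total = 0
    for sent in sentences:
        for i in range(len(sent) - d):
            w1, w2 = sent[i], sent[i + d]
            pair_counts[(w1, w2)] += 1
            left_counts[w1] += 1
            right_counts[w2] += 1
            total += 1
    mi = 0.0
    for (w1, w2), c in pair_counts.items():
        p_xy = c / total
        p_x = left_counts[w1] / total
        p_y = right_counts[w2] / total
        mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi

def total_dependency_length(heads):
    """Sum of |dependent - head| distances; heads are 1-indexed, 0 marks the root."""
    return sum(abs(i + 1 - h) for i, h in enumerate(heads) if h != 0)

# Information locality predicts MI falls off as linear distance grows:
for d in (1, 2, 3):
    print(f"distance {d}: MI ~ {mutual_information_at_distance(corpus, d):.3f}")

# "the old dog barked": the->dog(3), old->dog(3), dog->barked(4), barked=ROOT
print(total_dependency_length([3, 3, 4, 0]))  # -> 2 + 1 + 1 = 4

Under this framing, DLM cares only about the syntactic links counted by total_dependency_length, while information locality generalizes the pressure to any strongly predictive word pair, whether or not it is a dependency.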