The present chapter applies text classification to test how well we can distinguish between texts along two dimensions: a text-production dimension, which separates translations from non-translations (where translations also include interpreted texts), and a mode dimension, which separates spoken from written texts. The chapter also investigates the relationship between these two dimensions.
Moreover, it investigates whether the same linguistic features, derived from variational linguistics, contribute to the prediction of mode in both translations and non-translations. Distributional information about these features is used to statistically model variation along the two dimensions.
The results show that the same feature set can be used to automatically differentiate translations from non-translations, as well as spoken from written texts. However, language variation along the dimension of mode is stronger than that along the dimension of text production, as classification into spoken and written texts delivers better results.
In addition, the linguistic features that contribute to the distinction between spoken and written mode are similar in translated and non-translated language.
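
The general setup described above, one feature set evaluated against two different labellings, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature values are invented toy numbers, the names `TEXTS` and `loo_accuracy` are hypothetical, and the nearest-centroid classifier with leave-one-out evaluation is a simple stand-in rather than the classifier used in the chapter.

```python
from statistics import mean

# Hypothetical toy data: (feature vector, production label, mode label).
# The two features stand in for normalised frequencies of linguistic
# features; the numbers are invented and are NOT results from the chapter.
TEXTS = [
    ([0.80, 0.20], "translation", "spoken"),
    ([0.75, 0.30], "translation", "spoken"),
    ([0.85, 0.35], "original", "spoken"),
    ([0.90, 0.25], "original", "spoken"),
    ([0.20, 0.70], "translation", "written"),
    ([0.25, 0.80], "translation", "written"),
    ([0.15, 0.85], "original", "written"),
    ([0.30, 0.75], "original", "written"),
]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [mean(column) for column in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_accuracy(texts, label_index):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    label_index selects the dimension to predict:
    1 = text production, 2 = mode.
    """
    correct = 0
    for i, held_out in enumerate(texts):
        train = texts[:i] + texts[i + 1:]
        # Group the training vectors by their label on this dimension.
        by_label = {}
        for item in train:
            by_label.setdefault(item[label_index], []).append(item[0])
        centroids = {lab: centroid(vs) for lab, vs in by_label.items()}
        # Predict the label whose centroid is closest to the held-out text.
        predicted = min(
            centroids, key=lambda lab: sq_dist(held_out[0], centroids[lab])
        )
        correct += predicted == held_out[label_index]
    return correct / len(texts)


# Same features, two classification tasks.
production_acc = loo_accuracy(TEXTS, 1)
mode_acc = loo_accuracy(TEXTS, 2)
print(f"text production: {production_acc:.2f}  mode: {mode_acc:.2f}")
```

Comparing the two accuracies is the kind of evidence the comparison of dimensions rests on: the dimension that is classified more reliably from the same features shows the stronger variation. The toy data are built so that mode is cleanly separable, so the output says nothing about the chapter's actual figures.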