Functions and translations of underspecified discourse markers in TED Talks: a parallel corpus study on five languages

Publication at Faculty of Mathematics and Physics | 2019

Abstract

Discourse markers are highly polyfunctional, particularly in spoken settings. Because of their syntactic optionality, they are often omitted in translations, especially in the restricted space of subtitles such as the parallel transcripts of TED Talks.

In this study, we combine discourse annotation and translation spotting to investigate English discourse markers, focusing on their functions, omission and translation equivalents in Czech, French, Hungarian and Lithuanian. In particular, we study them through the lens of underspecification, of which we distinguish one monolingual and two multilingual types.

After making an inventory of all discourse markers in the dataset, we zoom in on the three most frequent ones: "and", "but" and "so". Our small-scale yet fine-grained corpus study suggests that the processes of underspecification are based on the semantics of discourse markers and are therefore shared cross-linguistically.

However, not all discourse marker types nor their functions are equally affected by under