This contribution focuses on spoken Czech in the Romanian Banat. A Czech-speaking minority has lived in the Banat region of south-western Romania since the first half of the nineteenth century, and the community persists there to this day.
The speech of this slowly disappearing language island has been captured in the Banat corpus, a continuously growing collection of informal spoken Czech as used in the Banat villages and, at the same time, one of the largest specialized corpora of spoken Czech. In the first part, we will take a closer look at how the corpus was created: from the collection of the recordings, through their processing and transcription (with a focus on the transcription method used), to the compilation of the corpus itself.
One of the great advantages of the corpus is that it was built with comparison in mind: it is almost fully comparable with the ORAL corpus, a corpus of spoken common Czech created by the Institute of the Czech National Corpus. Both corpora can therefore be used to compare the two language varieties and to verify the divergences between them.
The corpora can also help to re-examine hypotheses from previous case studies, which could rely only on the knowledge, judgment and introspection of the linguists at the time the research was conducted. In the last part of the presentation, we will demonstrate what the corpus makes possible, using a few examples from Czech syntax.
First, we will examine reinforced negation expressed by nic ['nothing'] and nikerak ['in no way'], which is a typical feature of Banat Czech. Second, we will demonstrate the differences in the distribution and use of negative pronouns and adverbs (nikde ['nowhere'], nikdo ['nobody'] and nikdy ['never']) between the two language varieties.