A corpus of Czech aphasic speech: development and possible applications

Publication at the Faculty of Arts

2016

Abstract

In this paper, I present a project which aims to advance the research in this area by developing a corpus of Czech aphasic speech which will serve as a source of data for both linguists and clinicians. The corpus includes 10 hours of structured and semi-spontaneous discourse produced by 11 individuals with aphasia, with different levels of fluency and severity ranging from mild to moderate.

The corpus is lemmatized and morphologically tagged, and its error annotation marks errors typically encountered in aphasic speech, such as paraphasias or agrammatisms. It contains transcripts and time-aligned audio tracks.

The corpus will be integrated within the Czech National Corpus environment. To illustrate possible applications of the corpus, I present an analysis of narrative discourse production using a subcorpus of a story retelling task, in which participants saw and retold a three-minute video clip.

Using several linguistic measures (cf. e.g. Lind et al. 2009), I create preliminary profiles of Czech aphasic discourse, a topic which has not been addressed previously.

These measures include the type-token ratio of verbs and nouns, use of general all-purpose (GAP) verbs and nouns, distribution of case forms, number of pauses, errors, and repairs, frequency profiles of verbs and nouns, number of adjectival and adverbial modifiers, number of clauses per conversational unit, and the proportion of full sentences and sentence fragments.
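As an illustration of the first of these measures, a type-token ratio for one part of speech could be computed from lemmatized, tagged tokens roughly as follows. This is a minimal sketch, not the procedure used in the study: the function name `type_token_ratio`, the `(lemma, tag)` input format, the `NOUN`/`VERB` tag labels, and the sample tokens are all assumptions made for the example, and counting types by lemma rather than by word form is one possible choice among several.

```python
def type_token_ratio(tokens, pos):
    """Return (distinct lemmas) / (total tokens) for one part of speech.

    `tokens` is an iterable of (lemma, tag) pairs. The pair format and the
    tag labels are assumptions for this sketch, not the corpus's own tagset.
    """
    lemmas = [lemma for lemma, tag in tokens if tag == pos]
    if not lemmas:
        # No tokens of this part of speech: report 0.0 instead of dividing by zero.
        return 0.0
    return len(set(lemmas)) / len(lemmas)


# Hypothetical (lemma, tag) pairs, invented for illustration; not corpus data.
sample = [
    ("muž", "NOUN"), ("jít", "VERB"), ("muž", "NOUN"), ("dům", "NOUN"),
    ("být", "VERB"), ("jít", "VERB"), ("pes", "NOUN"), ("být", "VERB"),
]

noun_ttr = type_token_ratio(sample, "NOUN")  # 3 distinct lemmas over 4 noun tokens
verb_ttr = type_token_ratio(sample, "VERB")  # 2 distinct lemmas over 4 verb tokens
print(noun_ttr, verb_ttr)
```

Because the ratio falls as samples grow longer, values are only comparable across speakers when the samples are of similar length or a length-normalized variant is used.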

Keywords

aphasia; aphasic discourse production; corpus of aphasic speech; specialized corpora