In this paper, we present a spoken dialog system used for collecting data for future research in the field of dementia prediction from speech. The dialog system was used to collect the speech data of patients with mild cognitive deficits.
The core task solved by the dialog system was the spoken description of the vivid shore picture for one minute. The patients also performed other simple speech-based tasks.
All utterances were recorded and manually transcribed to obtain a ground-truth reference. We describe the architecture of the dialog system as well as the results of the first speech recognition experiments.
The zero-shot Wav2Vec 2.0 speech recognizer was used and the recognition accuracy on word- and character-level was evaluated.