The malleability of speech production: An examination of sophisticated voice disguise

Publication at Faculty of Arts | 2017

Abstract

The variability of speech production continues to present one of the key challenges to forensic phoneticians relying on the auditory-acoustic method of comparing voices, as well as to automatic methods of speaker recognition. Although a speaker's voice may be regarded as a reflection of the anatomy and physiology of his or her vocal tract, these only impose optimal values of individual parameters and limits; importantly, these limits are extremely generous, and the speech production mechanism is typically described as extremely plastic.

As speakers, we exploit this plasticity in our everyday lives when communicating the various components of communicative intent and indexing the various facets of our identity; short- and long-term segmental and suprasegmental information is combined with countless degrees of freedom. If we want to consider settings relevant for speaker identification, then the plasticity and malleability of speech production are perhaps best illustrated by voice disguise.

Although it appears that, in a majority of actual cases, perpetrators use only one or two ways of modifying their voice, the aim of this presentation is to examine speakers whose voice disguise strategies were sophisticated and, most importantly, resulted in speech which sounded natural and might pass for the given speaker's normal speech production. The analyses are based on the Database of Common Czech, which features 100 male speakers recorded in a number of speaking tasks.

One of the tasks consisted in reading a short phonetically rich text in an ordinary voice, while in another task the speakers were asked to read another short text in a disguised voice. They were given sufficient time to devise a strategy to disguise their voice.

The two texts differed but contained some identical phrases which may be used for detailed comparison. A general mapping of disguise strategies was conducted by Růžičková and Skarnitzl (2017): 3 out of the 100 speakers did not perform any modification at all when disguising their voice, while another 31 changed one characteristic.

At the other end of the spectrum, we identified 15 speakers who were, at least on first listening, rather difficult to recognize when comparing the natural and disguised voice production, and whose speech sounded natural. These speakers were contacted again and asked to record a longer text using the same kind of disguise; the same text, with neutral content, was used for both conditions this time.

The presentation will describe their phonatory and articulatory disguise (and in one case imitation) strategies in greater detail, including selected acoustic analyses and audio comparisons of the speakers' natural and disguised voice.