Forensic Speaker Identification
Forensic speaker identification (FSI), also called forensic speaker
recognition, is the process of determining whether two or more speech
recordings come from the same speaker. FSI is a part of forensic
phonetics, the application of phonetics to forensic questions.
Phonetics gives the forensic audio examiner a framework for studying
how people speak, how speech is perceived and how it is transmitted
acoustically.
The forensic comparison of voice
samples is an extremely complex process and requires expert knowledge
in not just one, but several different specialty areas related to
speech science. In its most common form, speaker identification involves
comparing one or more samples of an offender’s voice with one or more
samples of a suspect’s voice.
When a
forensic phonetician compares and describes voices, he or she usually
does so with respect to linguistic units, especially speech sounds such
as vowels or consonants. A forensic phonetician may observe, for
instance, that the “ee” vowels in two samples are different.
Articulatory phonetics, the study of how speech sounds are produced,
plays an equally important part in the process of identification.
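
One way such an observation about “ee” vowels might be put into numbers
is to compare formant frequencies measured on several vowel tokens from
each recording. The sketch below is only an illustration: the (F1, F2)
values are made-up numbers rather than measurements, the function names
are hypothetical, and a plain Euclidean distance between mean formants
is a simplification that is not claimed here to be how examiners reach
a conclusion.

```python
import math
from statistics import mean

def mean_formants(tokens):
    """Average (F1, F2) in Hz over a list of vowel tokens."""
    f1 = mean(t[0] for t in tokens)
    f2 = mean(t[1] for t in tokens)
    return f1, f2

def formant_distance(a, b):
    """Euclidean distance in Hz between two (F1, F2) points."""
    return math.dist(a, b)

# Made-up (F1, F2) values in Hz for three "ee" tokens per recording.
offender_ee = [(280.0, 2250.0), (300.0, 2300.0), (290.0, 2200.0)]
suspect_ee = [(310.0, 2050.0), (330.0, 2100.0), (320.0, 2000.0)]

offender_mean = mean_formants(offender_ee)
suspect_mean = mean_formants(suspect_ee)
distance = formant_distance(offender_mean, suspect_mean)

print("offender mean (F1, F2):", offender_mean)
print("suspect mean (F1, F2):", suspect_mean)
print("distance in Hz:", round(distance, 1))
```

A larger distance only says that the two sets of tokens differ on these
two measures. Whether that difference is meaningful depends on how much
the same speaker varies from one occasion to the next, which this
sketch does not model.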
Phonemics deals with how speech sounds are functionally
organized in a language into distinctive units called phonemes. For
instance, in English the vowel in the word “beat” and the vowel in the
word “bit” behave as different phonemes; among other things, one vowel
is shorter than the other.
A forensic audio examiner deals with many other areas that are beyond
the scope of this mini introduction to FSI. Voice is more than just a
string of sounds. A great deal of information is transmitted when a
person speaks. On hearing a voice, most people can identify the sex of
the speaker, the language spoken and, in some cases, the emotional
state of the speaker. You don’t even need to speak a particular
language in order to recognize the emotional state of the speaker. For
a forensic audio examiner, however, there is a wealth of further
information hidden in voices, and this data is collected, observed,
documented, compared and processed for FSI.

In this
picture you can see how the word “Go”, spoken by a male, appears in
the waveform (top) and spectrogram (bottom) windows. Formant and pitch
information is also displayed in the spectrogram window.
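
To give a rough idea of where a pitch value like the one displayed can
come from, here is a minimal sketch of pitch estimation by
autocorrelation. It is not the method of any particular analysis
program. The signal is synthetic (a 125 Hz tone with one overtone
standing in for a voiced sound), and the sample rate, search range and
function name are assumptions chosen for the example.

```python
import math

def estimate_pitch_hz(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate pitch as the lag with the largest autocorrelation."""
    min_lag = int(sample_rate / fmax)   # shortest period considered
    max_lag = int(sample_rate / fmin)   # longest period considered
    n = len(samples)
    avg = sum(samples) / n
    x = [s - avg for s in samples]      # remove any constant offset

    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Similarity between the signal and a copy shifted by `lag`.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Synthetic stand-in for a voiced sound: 125 Hz plus its first overtone.
sample_rate = 8000   # samples per second (assumed)
f0 = 125.0           # true pitch of the synthetic signal in Hz
samples = [
    math.sin(2 * math.pi * f0 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * t / sample_rate)
    for t in range(1024)
]

pitch = estimate_pitch_hz(samples, sample_rate)
print("estimated pitch in Hz:", round(pitch, 1))
```

On this clean synthetic signal the estimate should land close to
125 Hz. Real recordings are noisier and change over time, so pitch is
normally tracked over many short frames rather than estimated once.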