UBC Theses and Dissertations

Speech amplitude and zero crossing for automated identification of human speakers

Wasson, Douglas Arnold

Abstract

This thesis investigates the usefulness of speech amplitude, low-pass zero crossing rate (ZCR) and high-pass zero crossing rate for speaker recognition. Speech samples were recorded from ten speakers and preprocessed to yield time-quantized waveforms of SPEECH AMPLITUDE, LOW PASS ZCR and HIGH PASS ZCR. The preprocessed waveforms were averaged for each speaker to form three sets of averaged waveforms, which were then expanded in an n-dimensional feature space, where n is the number of speakers used in testing the system. The orthogonal functions describing the feature space were derived from the averaged preprocessed waveforms using the Gram-Schmidt orthogonalization technique. Expanding each speaker's averaged preprocessed waveform in the feature space provided a reference feature vector for that speaker. Recognition was based on the Euclidean distance between the reference feature vectors and the feature vector derived from a sample preprocessed waveform of the unknown speaker. The speaker whose reference vector was closest to the test vector was chosen as the speaker of that utterance. A variety of combinations of the preprocessed waveforms were tested. In the best case, an average of 3.4% of decisions were incorrect when one of ten speakers was identified on the basis of a single spoken sentence. The results indicate that speech amplitude and zero crossing rate are useful for speaker identification.
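
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the toy four-sample waveforms, the speaker labels and the sign convention used for counting zero crossings are all assumptions made for illustration, and the filtering, time quantization and waveform-combination steps are omitted.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (assumed convention)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(waveforms, tol=1e-12):
    """Build an orthonormal basis from the averaged waveforms, one per speaker."""
    basis = []
    for w in waveforms:
        residual = list(w)
        for b in basis:
            c = dot(residual, b)
            residual = [x - c * y for x, y in zip(residual, b)]
        norm = math.sqrt(dot(residual, residual))
        if norm > tol:  # skip waveforms that are linearly dependent on earlier ones
            basis.append([x / norm for x in residual])
    return basis

def expand(waveform, basis):
    """Feature vector: coefficients of the waveform along each basis function."""
    return [dot(waveform, b) for b in basis]

def identify(test_waveform, basis, references):
    """Return the speaker whose reference vector is nearest in Euclidean distance."""
    test_vec = expand(test_waveform, basis)
    return min(references, key=lambda name: math.dist(test_vec, references[name]))

# Hypothetical averaged preprocessed waveforms for three speakers (toy data).
averaged = {
    "A": [1.0, 0.0, 0.5, 0.2],
    "B": [0.0, 1.0, 0.3, 0.8],
    "C": [0.5, 0.5, 1.0, 0.0],
}
basis = gram_schmidt(list(averaged.values()))
references = {name: expand(w, basis) for name, w in averaged.items()}

# A single test waveform from an "unknown" speaker, here a perturbed copy of A.
guess = identify([0.9, 0.1, 0.5, 0.25], basis, references)
print(guess)
```

With n averaged waveforms that are linearly independent, the basis has n functions, which matches the n-dimensional feature space the abstract describes.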

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.