Open Collections
UBC Theses and Dissertations
Speech synthesis by concatenation of Digital waveform fragments
Chu, Thien-Ke
Abstract
A method for rule-based speech synthesis by concatenating digital waveform fragments at the subphonemic level is presented. The software synthesizer requires no special hardware beyond a D/A converter and an ordinary audio system. Computer software for an on-line analysis-by-synthesis process was developed. Phonetic cues, such as characteristic waveform fragments, the duration of each quasi-steady state, and the transition motion of a certain number of phonemes, were extracted and stored. Classifying the phonetic cues proved both possible and necessary in order to reduce the storage requirement and to obtain rules for synthesis. An interpolation scheme was developed to generate transient waveforms that eliminate the discontinuities at the concatenation junctions. Pitch variation was found to be the most influential factor in creating intonation in polysyllabic utterances, and it was achieved by a pitch modification routine included in the synthesis program. Test procedures and results are reported; in the first test, the vowel recognition rate for synthetic words was 93%, comparable to the 94% obtained for digitized natural words. Further studies are needed to generalize the method to the synthesis of unrestricted text. The findings on phonetic cues could be applied to speech recognition in future work.
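The abstract does not spell out the interpolation scheme or the pitch modification routine, so the following is only a minimal sketch of the general idea, not the thesis's method. It assumes that each fragment is a single pitch period stored as a list of samples, that transient periods are produced by linear sample-by-sample blending between two fragments, and that pitch is changed by linearly resampling a period to a new length. The names `resample`, `transition`, and `concatenate` are hypothetical.

```python
from typing import List

Fragment = List[float]  # assumption: one pitch period of samples


def resample(fragment: Fragment, new_len: int) -> Fragment:
    """Linearly resample one period to new_len samples.

    A shorter period repeats faster and so raises the pitch. This naive
    approach also shifts the spectrum; it is an illustrative stand-in,
    not the routine used in the thesis.
    """
    if new_len < 2 or len(fragment) < 2:
        raise ValueError("need at least two samples")
    scale = (len(fragment) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(fragment) - 1)
        frac = pos - lo
        out.append((1.0 - frac) * fragment[lo] + frac * fragment[hi])
    return out


def transition(a: Fragment, b: Fragment, n_periods: int) -> List[Fragment]:
    """Return n_periods intermediate periods that morph a into b."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length; resample first")
    frames = []
    for k in range(1, n_periods + 1):
        w = k / (n_periods + 1)  # blend weight strictly between 0 and 1
        frames.append([(1.0 - w) * x + w * y for x, y in zip(a, b)])
    return frames


def concatenate(a: Fragment, b: Fragment, repeats_a: int,
                repeats_b: int, n_periods: int) -> List[float]:
    """Repeat a, insert interpolated transient periods, then repeat b."""
    out: List[float] = []
    for _ in range(repeats_a):
        out.extend(a)
    for frame in transition(a, b, n_periods):
        out.extend(frame)
    for _ in range(repeats_b):
        out.extend(b)
    return out


if __name__ == "__main__":
    a = [0.0, 1.0, 0.0, -1.0]
    b = [0.0, 2.0, 0.0, -2.0]
    samples = concatenate(a, b, repeats_a=2, repeats_b=2, n_periods=3)
    print(len(samples), "samples")
```

Repeating a fragment models a quasi-steady state, and the interpolated periods stand in for the transient waveform at the junction, so the amplitude steps gradually from one fragment to the next instead of jumping.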
Item Metadata
Title | Speech synthesis by concatenation of Digital waveform fragments
Creator | Chu, Thien-Ke
Publisher | University of British Columbia
Date Issued | 1978
Genre |
Type |
Language | eng
Date Available | 2010-03-11
Provider | Vancouver : University of British Columbia Library
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI | 10.14288/1.0065436
URI |
Degree |
Program |
Affiliation |
Degree Grantor | University of British Columbia
Campus |
Scholarly Level | Graduate
Aggregated Source Repository | DSpace