Exploiting temporal change in pitch in formant estimation

March 31, 2008

Conference Paper

Author:

Tianyu Tom Wang

…

Thomas F. Quatieri

Published in:

Proc. IEEE Int. Conf. on Acoustic, Speech, and Signal Processes, ICASSP, 31 March - 4 April 2008, pp. 3929-3932.

R&D Area:

Cyber Security and Information Sciences

R&D Group:

Artificial Intelligence Technology and Systems

Exploiting temporal change in pitch in formant estimation

Summary

This paper considers the problem of obtaining an accurate spectral representation of speech formant structure when the voicing source exhibits a high fundamental frequency. Our work is inspired by auditory perception and physiological modeling studies implicating the use of temporal changes in speech by humans. Specifically, we develop and assess signal processing schemes aimed at exploiting temporal change of pitch as a basis for formant estimation. Our methods are cast in a generalized framework of two-dimensional processing of speech and show quantitative improvements under certain conditions over representations derived from traditional and homomorphic linear prediction. We conclude by highlighting potential benefits of our framework in the particular application of speaker recognition with preliminary results indicating a performance gender-gap closure on subsets of the TIMIT corpus.