Publications

Robust text-independent speaker identification using Gaussian mixture speaker models

Published in:
IEEE Trans. Speech Audio Process., Vol. 3, No. 1, January 1995, pp. 72-83.

Summary

This paper introduces and motivates the use of Gaussian mixture models (GMM) for robust text-independent speaker identification. The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are effective for modeling speaker identity. The focus of this work is on applications which require high identification rates using short utterances from unconstrained conversational speech and robustness to degradations produced by transmission over a telephone channel. A complete experimental evaluation of the Gaussian mixture speaker model is conducted on a 49-speaker, conversational telephone speech database. The experiments examine algorithmic issues (initializations, variance limiting, model order selection), spectral variability robustness techniques, large population performance, and comparisons to other speaker modeling techniques (uni-modal Gaussian, VQ codebook, tied Gaussian mixture, and radial basis functions). The Gaussian mixture speaker model attains 96.8% identification accuracy using 5-second clean speech utterances and 80.8% accuracy using 15-second telephone speech utterances with a 49-speaker population, and is shown to outperform the other speaker modeling techniques on an identical 16-speaker telephone speech task.
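As a rough illustration of the idea in this summary, the sketch below scores a short sequence of feature vectors against one Gaussian mixture model per speaker and picks the speaker whose model gives the highest total log-likelihood. It is a minimal sketch, not the paper's implementation: the diagonal covariances, the two-dimensional features, the hand-set model parameters, and the names `log_gauss_diag`, `gmm_log_likelihood`, `identify`, and `SPEAKER_MODELS` are all assumptions made for the example, and model training is omitted entirely.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Sum over frames of log p(x | model) for a mixture of Gaussians.

    gmm is a list of (weight, mean, variance) tuples, one per component.
    """
    total = 0.0
    for x in frames:
        # Log of each weighted component density for this frame.
        comp = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        # Log-sum-exp keeps the mixture sum numerically stable.
        peak = max(comp)
        total += peak + math.log(sum(math.exp(c - peak) for c in comp))
    return total


def identify(frames, speaker_models):
    """Return the name of the speaker model with the highest log-likelihood."""
    return max(
        speaker_models,
        key=lambda name: gmm_log_likelihood(frames, speaker_models[name]),
    )


# Hypothetical hand-set models (assumption): two components each, 2-D features.
SPEAKER_MODELS = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 2.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [-3.0, 1.0], [1.0, 1.0]), (0.5, [-1.0, -2.0], [1.0, 1.0])],
}

# Made-up test frames lying near the components of speaker_a.
test_frames = [[0.1, -0.2], [1.9, 2.2], [0.3, 0.1]]
print(identify(test_frames, SPEAKER_MODELS))
```

Summing per-frame log-likelihoods treats the frames as independent, which is what lets a text-independent system score an utterance without knowing what was said. In practice the models would be estimated from each speaker's training speech rather than set by hand.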

Integrated models of signal and background with application to speaker identification in noise

Published in:
IEEE Trans. Speech Audio Process., Vol. 2, No. 2, April 1994, pp. 245-257.

Summary

This paper is concerned with the problem of robust parametric model estimation and classification in noisy acoustic environments. Characterization and modeling of the external noise sources in these environments is in itself an important issue in noise compensation. The techniques described here provide a mechanism for integrating parametric models of acoustic background with the signal model so that noise compensation is tightly coupled with signal model training and classification. Prior information about the acoustic background process is provided using a maximum likelihood parameter estimation procedure that integrates an a priori model of acoustic background with the signal model. An experimental study is presented in the paper on the application of this approach to text-independent speaker identification in noisy acoustic environments. Considerable improvement in speaker classification performance was obtained for classifying unlabeled sections of conversational speech utterances from a 16-speaker population under cross-environment training and testing conditions.

An integrated speech-background model for robust speaker identification

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 2, 23-26 March 1992, pp. 185-188.

Summary

This paper examines a procedure for text-independent speaker identification in noisy environments where the interfering background signals cannot be characterized using traditional broadband or impulsive noise models. In the procedure, both the speaker and the background processes are modeled using mixtures of Gaussians. Speaker and background models are integrated into a unified statistical framework, allowing the underlying speech process to be decoupled from the noise-corrupted observations via the expectation-maximization algorithm. Using this formalism, speaker model parameters are estimated in the presence of the background process, and a scoring procedure is implemented for computing the speaker likelihood in the noise-corrupted environment. Performance is evaluated using a 16-speaker conversational speech database with both "speech babble" and white noise background processes.
