Publications
Speaker recognition using G.729 speech codec parameters
Summary
Summary
Experiments in Gaussian-mixture-model speaker recognition from mel-filter bank energies (MFBs) of the G.729 codec all-pole spectral envelope, showed significant performance loss relative to the standard mel-cepstral coefficients of G.729 synthesized (coded) speech. In this paper, we investigate two approaches to recover speaker recognition performance from G.729 parameters, rather than deriving...
The NIST Speaker Recognition Evaluation - overview, methodology, systems, results, perspective
Summary
Summary
This paper, based on three presentations made in 1998 at the RLA2C Workshop in Avignon, discusses the evaluation of speaker recognition systems from several perspectives. A general discussion of the speaker recognition task and the challenges and issues involved in its evaluation is offered. The NIST evaluations in this area...
Approaches to speaker detection and tracking in conversational speech
Summary
Summary
Two approaches to detecting and tracking speakers in multispeaker audio are described. Both approaches use an adapted Gaussian mixture model, universal background model (GMM-UBM) speaker detection system as the core speaker recognition engine. In one approach, the individual log-likelihood ratio scores, which are produced on a frame-by-frame basis by the...
Speaker verification using adapted Gaussian mixture models
Summary
Summary
In this paper we describe the major elements of MIT Lincoln Laboratory's Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs). The system is built around the likelihood ratio test for verification, using simple but effective GMMs for likelihood functions, a universal background...
A study of computation speed-ups of the GMM-UBM speaker recognition system
Summary
Summary
The Gaussian Mixture Model Universal Background Model (GMM-UBM) speaker recognition system has demonstrated very high performance in several NIST evaluations. Such evaluations, however, are concerned only with classification accuracy. In many applications, system effectiveness must be evaluated in light of both accuracy and execution speed. We present here a number...
Evaluation of confidence measures for language identification
Summary
Summary
In this paper we examine various ways to derive confidence measures for a language identification system, using phone recognition followed by language models, and describe the application of an evaluation metric for measuring the "goodness" of the different confidence measures. Experiments are conducted on the 1996 NIST Language Identification Evaluation...
Speaker and language recognition using speech codec parameters
Summary
Summary
In this paper, we investigate the effect of speech coding on speaker and language recognition tasks. Three coders were selected to cover a wide range of quality and bit rates: GSM at 12.2 kb/s, G.729 at 8 kb/s, and G.723.1 at 5.3 kb/s. Our objective is to measure recognition performance...
Modeling of the glottal flow derivative waveform with application to speaker identification
Summary
Summary
An automatic technique for estimating and modeling the glottal flow derivative source waveform from speech, and applying the model parameters to speaker identification, is presented. The estimate of the glottal flow derivative is decomposed into coarse structure, representing the general flow shape, and fine structure, comprising aspiration and other perturbations...
Automatic speaker clustering from multi-speaker utterances
Summary
Summary
Blind clustering of multi-person utterances by speaker is complicated by the fact that each utterance has at least two talkers. In the case of a two-person conversation, one can simply split each conversation into its respective speaker halves, but this introduces error which ultimately hurts clustering. We propose a clustering...
Corpora for the evaluation of speaker recognition systems
Summary
Summary
Using standard speech corpora for development and evaluation has proven to be very valuable in promoting progress in speech and speaker recognition research. In this paper, we present an overview of current publicly available corpora intended for speaker recognition research and evaluation. We outline the corpora's salient features with respect...