Publications
SVM based speaker verification using a GMM supervector kernel and NAP variability compensation
Summary
Summary
Gaussian mixture models with universal backgrounds (UBMs) have become the standard method for speaker recognition. Typically, a speaker model is constructed by MAP adaptation of the means of the UBM. A GMM supervector is constructed by stacking the means of the adapted mixture components. A recent discovery is that latent...
Support vector machines using GMM supervectors for speaker verification
Summary
Summary
Gaussian mixture models (GMMs) have proven extremely successful for text-independent speaker recognition. The standard training method for GMMmodels is to use MAP adaptation of the means of the mixture components based on speech from a target speaker. Recent methods in compensation for speaker and channel variability have proposed the idea...
Support vector machines for speaker and language recognition
Summary
Summary
Support vector machines (SVMs) have proven to be a powerful technique for pattern classification. SVMs map inputs into a high-dimensional space and then separate classes with a hyperplane. A critical aspect of using SVMs successfully is the design of the inner product, the kernel, induced by the high dimensional mapping...
Speaker adaptive cohort selection for Tnorm in text-independent speaker verification
Summary
Summary
In this paper we discuss an extension to the widely used score normalization technique of test normalization (Tnorm) for text-independent speaker verification. A new method of speaker Adaptive-Tnorm that offers advantages over the standard Tnorm by adjusting the speaker set to the target model is presented. Examples of this improvement...
Measuring human readability of machine generated text: three case studies in speech recognition and machine translation
Summary
Summary
We present highlights from three experiments that test the readability of current state-of-the art system output from (1) an automated English speech-to-text system (2) a text-based Arabic-to-English machine translation system and (3) an audio-based Arabic-to-English MT process. We measure readability in terms of reaction time and passage comprehension in each...
The 2004 MIT Lincoln Laboratory speaker recognition system
Summary
Summary
The MIT Lincoln Laboratory submission for the 2004 NIST Speaker Recognition Evaluation (SRE) was built upon seven core systems using speaker information from short-term acoustics, pitch and duration prosodic behavior, and phoneme and word usage. These different levels of information were modeled and classified using Gaussian Mixture Models, Support Vector...
Estimating and evaluating confidence for forensic speaker recognition
Summary
Summary
Estimating and evaluating confidence has become a key aspect of the speaker recognition problem because of the increased use of this technology in forensic applications. We discuss evaluation measures for speaker recognition and some of their properties. We then propose a framework for confidence estimation based upon scores and metainformation...
The MIT Lincoln Laboratory RT-04F diarization systems: applications to broadcast audio and telephone conversations
Summary
Summary
Audio diarization is the process of annotating an input audio channel with information that attributes (possibly overlapping) temporal regions of signal energy to their specific sources. These sources can include particular speakers, music, background noise sources, and other signal source/channel characteristics. Diarization has utility in making automatic transcripts more readable...
Fusing discriminative and generative methods for speaker recognition: experiments on switchboard and NFI/TNO field data
Summary
Summary
Discriminatively trained support vector machines have recently been introduced as a novel approach to speaker recognition. Support vector machines (SVMs) have a distinctly different modeling strategy in the speaker recognition problem. The standard Gaussian mixture model (GMM) approach focuses on modeling the probability density of the speaker and the background...
Dialect identification using Gaussian mixture models
Summary
Summary
Recent results in the area of language identification have shown a significant improvement over previous systems. In this paper, we evaluate the related problem of dialect identification using one of the techniques recently developed for language identification, the Gaussian mixture models with shifted-delta-cepstral features. The system shown is developed using...