Publications
Compensating for mismatch in high-level speaker recognition
Summary
Speaker recognition using high-level features has been a successful area of exploration. Features obtained from many different levels (phones, words, prosodic events, etc.) are used to characterize the speaker. A good modeling technique for these features is the support vector machine (SVM). SVMs model the n-gram frequencies from speaker utterances...
Experiments with lattice-based PPRLM language identification
Summary
In this paper we describe experiments conducted during the development of a lattice-based PPRLM language identification system as part of the NIST 2005 language recognition evaluation campaign. In experiments following LRE05 the PPRLM-lattice sub-system presented here achieved a 30s/primary condition EER of 4.87%, making it the single best performing recognizer...
Understanding scores in forensic speaker recognition
Summary
Recent work in forensic speaker recognition has introduced many new scoring methodologies. First, confidence scores (posterior probabilities) have become a useful method of presenting results to an analyst. The introduction of an objective measure of confidence score quality, the normalized cross entropy, has resulted in a systematic manner of evaluating...
Nonlinear equalization for RF receivers
Summary
This paper describes the need for High Performance Computing (HPC) to facilitate the development and implementation of a nonlinear equalizer that is capable of mitigating and/or eliminating nonlinear distortion to extend the dynamic range of radar front-end receivers decades beyond the analog state-of-the-art. The search space for the optimal nonlinear...
The mixer and transcript reading corpora: resources for multilingual, crosschannel speaker recognition research
Summary
This paper describes the planning and creation of the Mixer and Transcript Reading corpora, their properties and yields, and reports on the lessons learned during their development.
A scalable phonetic vocoder framework using joint predictive vector quantization of MELP parameters
Summary
We present the framework for a Scalable Phonetic Vocoder (SPV) capable of operating at bit rates from 300 to 1100 bps. The underlying system uses an HMM-based phonetic speech recognizer to estimate the parameters for MELP speech synthesis. We extend this baseline technique in three ways. First, we introduce the...
SVM based speaker verification using a GMM supervector kernel and NAP variability compensation
Summary
Gaussian mixture models with universal backgrounds (UBMs) have become the standard method for speaker recognition. Typically, a speaker model is constructed by MAP adaptation of the means of the UBM. A GMM supervector is constructed by stacking the means of the adapted mixture components. A recent discovery is that latent...
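The supervector construction described in this summary can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a diagonal-covariance UBM, the common relevance-factor form of mean-only MAP adaptation, and a hypothetical default relevance value of 16. The function names `map_adapt_means` and `gmm_supervector` are made up for this example.

```python
import numpy as np

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal-covariance UBM (illustrative).

    X: (N, D) feature vectors from one speaker.
    weights: (K,), means: (K, D), variances: (K, D) are the UBM parameters.
    relevance: assumed relevance factor; not taken from the summary.
    """
    diff = X[:, None, :] - means[None, :, :]                      # (N, K, D)
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_gauss                        # (N, K)
    log_post -= log_post.max(axis=1, keepdims=True)               # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                       # responsibilities

    n_k = post.sum(axis=0)                                        # soft counts, (K,)
    e_k = (post.T @ X) / np.maximum(n_k, 1e-10)[:, None]          # data means, (K, D)
    alpha = (n_k / (n_k + relevance))[:, None]                    # adaptation weights
    return alpha * e_k + (1.0 - alpha) * means

def gmm_supervector(adapted_means):
    """Stack the K adapted mean vectors into one vector of length K*D."""
    return adapted_means.reshape(-1)

# Toy usage: a 2-component, 2-dimensional UBM and 16 identical frames.
ubm_w = np.array([0.5, 0.5])
ubm_mu = np.array([[0.0, 0.0], [100.0, 100.0]])
ubm_var = np.ones((2, 2))
frames = np.ones((16, 2))
sv = gmm_supervector(map_adapt_means(frames, ubm_w, ubm_mu, ubm_var))
```

In the toy usage, only the first component receives data, so its mean moves partway toward the frames while the second component keeps its UBM mean.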
Support vector machines using GMM supervectors for speaker verification
Summary
Gaussian mixture models (GMMs) have proven extremely successful for text-independent speaker recognition. The standard training method for GMM models is to use MAP adaptation of the means of the mixture components based on speech from a target speaker. Recent methods in compensation for speaker and channel variability have proposed the idea...
Support vector machines for speaker and language recognition
Summary
Support vector machines (SVMs) have proven to be a powerful technique for pattern classification. SVMs map inputs into a high-dimensional space and then separate classes with a hyperplane. A critical aspect of using SVMs successfully is the design of the inner product, the kernel, induced by the high dimensional mapping...
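The relationship between a high-dimensional mapping and its kernel can be illustrated with a small worked example. The sketch below uses a degree-2 homogeneous polynomial kernel on 2-dimensional inputs, chosen only because its feature map is easy to write out; it is an assumption for illustration and not one of the kernels the paper designs. The names `phi` and `poly2_kernel` are hypothetical.

```python
import numpy as np

def phi(x):
    """Explicit feature map for the degree-2 homogeneous polynomial kernel in 2-D."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def poly2_kernel(x, y):
    """Kernel value computed directly in the input space: (x . y)^2."""
    return float(np.dot(x, y)) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

explicit = float(np.dot(phi(x), phi(y)))   # inner product in the mapped space
implicit = poly2_kernel(x, y)              # same value without forming phi
```

Both quantities agree, which is why an SVM can work with the kernel alone and never needs to form the mapped vectors explicitly.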
Exploiting nonacoustic sensors for speech encoding
Summary
The intelligibility of speech transmitted through low-rate coders is severely degraded when high levels of acoustic noise are present in the acoustic environment. Recent advances in nonacoustic sensors, including microwave radar, skin vibration, and bone conduction sensors, provide the exciting possibility of both glottal excitation and, more generally, vocal tract...