Publications
Experiments with lattice-based PPRLM language identification
Summary
In this paper we describe experiments conducted during the development of a lattice-based PPRLM language identification system as part of the NIST 2005 language recognition evaluation campaign. In experiments following LRE05 the PPRLM-lattice sub-system presented here achieved a 30s/primary condition EER of 4.87%, making it the single best performing recognizer...
Nonlinear equalization for RF receivers
Summary
This paper describes the need for High Performance Computing (HPC) to facilitate the development and implementation of a nonlinear equalizer that is capable of mitigating and/or eliminating nonlinear distortion to extend the dynamic range of radar front-end receivers decades beyond the analog state-of-the-art. The search space for the optimal nonlinear...
The mixer and transcript reading corpora: resources for multilingual, crosschannel speaker recognition research
Summary
This paper describes the planning and creation of the Mixer and Transcript Reading corpora, their properties and yields, and reports on the lessons learned during their development.
A scalable phonetic vocoder framework using joint predictive vector quantization of MELP parameters
Summary
We present the framework for a Scalable Phonetic Vocoder (SPV) capable of operating at bit rates from 300 to 1100 bps. The underlying system uses an HMM-based phonetic speech recognizer to estimate the parameters for MELP speech synthesis. We extend this baseline technique in three ways. First, we introduce the...
SVM based speaker verification using a GMM supervector kernel and NAP variability compensation
Summary
Gaussian mixture models with universal backgrounds (UBMs) have become the standard method for speaker recognition. Typically, a speaker model is constructed by MAP adaptation of the means of the UBM. A GMM supervector is constructed by stacking the means of the adapted mixture components. A recent discovery is that latent...
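The supervector construction mentioned in this summary can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a diagonal-covariance UBM, the commonly used relevance-factor form of mean-only MAP adaptation, and a relevance factor of 16, and the function names (`map_adapt_means`, `supervector`) are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a UBM (relevance factor is an assumption)."""
    k, dim = len(weights), len(means[0])
    counts = [0.0] * k                         # soft frame counts per component
    sums = [[0.0] * dim for _ in range(k)]     # posterior-weighted frame sums

    for x in frames:
        logp = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        top = max(logp)                        # subtract max for stability
        post = [math.exp(lp - top) for lp in logp]
        norm = sum(post)
        for j in range(k):
            g = post[j] / norm                 # posterior of component j
            counts[j] += g
            for d in range(dim):
                sums[j][d] += g * x[d]

    adapted = []
    for j in range(k):
        alpha = counts[j] / (counts[j] + relevance)
        if counts[j] > 0.0:
            data_mean = [s / counts[j] for s in sums[j]]
        else:
            data_mean = list(means[j])         # no data: keep the UBM mean
        adapted.append(
            [alpha * dm + (1.0 - alpha) * m for dm, m in zip(data_mean, means[j])]
        )
    return adapted


def supervector(adapted_means):
    """Stack the adapted component means into one flat vector."""
    return [value for mean in adapted_means for value in mean]
```

For a UBM with K components of dimension D, the resulting supervector has K·D entries. Components that receive little speaker data stay close to their UBM means.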
Support vector machines using GMM supervectors for speaker verification
Summary
Gaussian mixture models (GMMs) have proven extremely successful for text-independent speaker recognition. The standard training method for GMM models is to use MAP adaptation of the means of the mixture components based on speech from a target speaker. Recent methods in compensation for speaker and channel variability have proposed the idea...
Support vector machines for speaker and language recognition
Summary
Support vector machines (SVMs) have proven to be a powerful technique for pattern classification. SVMs map inputs into a high-dimensional space and then separate classes with a hyperplane. A critical aspect of using SVMs successfully is the design of the inner product, the kernel, induced by the high dimensional mapping...
Exploiting nonacoustic sensors for speech encoding
Summary
The intelligibility of speech transmitted through low-rate coders is severely degraded when high levels of acoustic noise are present in the acoustic environment. Recent advances in nonacoustic sensors, including microwave radar, skin vibration, and bone conduction sensors, provide the exciting possibility of both glottal excitation and, more generally, vocal tract...
The MIT-LL/AFRL MT System
Summary
The MITLL/AFRL MT system is a statistical phrase-based translation system that implements many modern SMT training and decoding techniques. Our system was designed with the long term goal of dealing with corrupted ASR input for Speech-to-Speech MT applications. This paper will discuss the architecture of the MITLL/AFRL MT system, and...
Synthesis, analysis, and pitch modification of the breathy vowel
Summary
Breathiness is an aspect of voice quality that is difficult to analyze and synthesize, especially since its periodic and noise components are typically overlapping in frequency. The decomposition and manipulation of these two components is of importance in a variety of speech application areas such as text-to-speech synthesis, speech encoding...