Publications
Language recognition with discriminative keyword selection
Summary
Summary
One commonly used approach for language recognition is to convert the input speech into a sequence of tokens such as words or phones and then to use these token sequences to determine the target language. The language classification is typically performed by extracting N-gram statistics from the token sequences and...
Classification methods for speaker recognition
Summary
Summary
Automatic speaker recognition systems have a foundation built on ideas and techniques from the areas of speech science for speaker characterization, pattern recognition and engineering. In this chapter we provide an overview of the features, models, and classifiers derived from these areas that are the basis for modern automatic speaker...
Speaker verification using support vector machines and high-level features
Summary
Summary
High-level characteristics such as word usage, pronunciation, phonotactics, prosody, etc., have seen a resurgence for automatic speaker recognition over the last several years. With the availability of many conversation sides per speaker in current corpora, high-level systems now have the amount of data needed to sufficiently characterize a speaker. Although...
A new kernel for SVM MLLR based speaker recognition
Summary
Summary
Speaker recognition using support vector machines (SVMs) with features derived from generative models has been shown to perform well. Typically, a universal background model (UBM) is adapted to each utterance yielding a set of features that are used in an SVM. We consider the case where the UBM is a...
Nuisance attribute projection
Summary
Summary
Cross-channel degradation is one of the significant challenges facing speaker recognition systems. We study this problem in the support vector machine (SVM) context and nuisance variable compensation in high-dimensional spaces more generally. We present an approach to nuisance variable compensation by removing nuisance attribute-related dimensions in the SVM expansion space...
Text-independent speaker recognition
Summary
Summary
In this chapter, we focus on the area of text-independent speaker verification, with an emphasis on unconstrained telephone conversational speech. We begin by providing a general likelihood ratio detection task framework to describe the various components in modern text-independent speaker verification systems. We next describe the general hierarchy of speaker...
Language recognition with word lattices and support vector machines
Summary
Summary
Language recognition is typically performed with methods that exploit phonotactics--a phone recognition language modeling (PRLM) system. A PRLM system converts speech to a lattice of phones and then scores a language model. A standard extension to this scheme is to use multiple parallel phone recognizers (PPRLM). In this paper, we...
Robust speaker recognition with cross-channel data: MIT-LL results on the 2006 NIST SRE auxiliary microphone task
Summary
Summary
One particularly difficult challenge for cross-channel speaker verification is the auxiliary microphone task introduced in the 2005 and 2006 NIST Speaker Recognition Evaluations, where training uses telephone speech and verification uses speech from multiple auxiliary microphones. This paper presents two approaches to compensate for the effects of auxiliary microphones on...
The MIT-LL/IBM 2006 speaker recognition system: high-performance reduced-complexity recognition
Summary
Summary
Many powerful methods for speaker recognition have been introduced in recent years--high-level features, novel classifiers, and channel compensation methods. A common arena for evaluating these methods has been the NIST speaker recognition evaluation (SRE). In the NIST SRE from 2002-2005, a popular approach was to fuse multiple systems based upon...
Automatic language recognition via spectral and token based approaches
Summary
Summary
Automatic language recognition from speech consists of algorithms and techniques that model and classify the language being spoken. Current state-of-the-art language recognition systems fall into two broad categories: spectral- and token-sequence-based approaches. In this chapter, we describe algorithms for extracting features and models representing these types of language cues and...