Publications
Graph embedding for speaker recognition
Summary
This chapter presents applications of graph embedding to the problem of text-independent speaker recognition. Speaker recognition is a general term encompassing multiple applications. At the core is the problem of speaker comparison: given two speech recordings (utterances), produce a score that measures speaker similarity. Using speaker comparison, other applications can be...
Graph relational features for speaker recognition and mining
Summary
Recent advances in the field of speaker recognition have resulted in highly efficient speaker comparison algorithms. The advent of these algorithms allows for leveraging a background set, consisting of a large number of unlabeled recordings, to improve recognition. In this work, a relational graph, where nodes represent utterances and links represent...
The MIT LL 2010 speaker recognition evaluation system: scalable language-independent speaker recognition
Summary
Research in the speaker recognition community has continued to address methods of mitigating variational nuisances. Telephone and auxiliary-microphone recorded speech emphasize the need for a robust way of dealing with unwanted variation. The design of the recent 2010 NIST Speaker Recognition Evaluation (SRE) reflects this research emphasis. In this paper, we...
Towards reduced false-alarms using cohorts
Summary
The focus of the 2010 NIST Speaker Recognition Evaluation (SRE) was the low-false-alarm regime of the detection error trade-off (DET) curve. This paper presents several approaches that specifically target this issue. It begins by highlighting the main problem with operating in the low-false-alarm regime. Two sets of...
Graph-embedding for speaker recognition
Summary
Popular methods for speaker classification perform speaker comparison in a high-dimensional space; however, recent work has shown that most of the speaker variability is captured by a low-dimensional subspace of that space. In this paper we examine whether additional structure in terms of nonlinear manifolds exists within the high-dimensional space...
Simple and efficient speaker comparison using approximate KL divergence
Summary
We describe a simple, novel, and efficient system for speaker comparison with two main components. First, the system uses a new approximate KL divergence distance extending earlier GMM parameter vector SVM kernels. The approximate distance incorporates data-dependent mixture weights as well as the standard MAP-adapted GMM mean parameters. Second, the...
Speaker comparison with inner product discriminant functions
Summary
Speaker comparison, the process of finding the speaker similarity between two speech signals, occupies a central role in a variety of applications: speaker verification, clustering, and identification. Speaker comparison can be placed in a geometric framework by casting the problem as a model comparison process. For a given speech...
A framework for discriminative SVM/GMM systems for language recognition
Summary
Language recognition with support vector machines and shifted-delta cepstral features has been an excellent performer in NIST-sponsored language evaluation for many years. A novel improvement of this method has been the introduction of hybrid SVM/GMM systems. These systems use GMM supervectors as an SVM expansion for classification. In prior work...
The MIT Lincoln Laboratory 2008 speaker recognition system
Summary
In recent years, methods for modeling and mitigating variational nuisances have been introduced and refined. A primary emphasis in this year's NIST 2008 Speaker Recognition Evaluation (SRE) was to greatly expand the use of auxiliary microphones. This offered the additional channel variations which have been a historical challenge to speaker...
Variability compensated support vector machines applied to speaker verification
Summary
Speaker verification using SVMs has proven successful, specifically using the GSV Kernel [1] with nuisance attribute projection (NAP) [2]. Also, the recent popularity and success of joint factor analysis [3] has led to promising attempts to use speaker factors directly as SVM features [4]. NAP projection and the use of...