Publications

Refine Results

(Filters Applied) Clear All

R&D Areas

R&D Groups

Year

Items per page

By

William M. Campbell Clear filter

Towards reduced false-alarms using cohorts

May 22, 2011

Conference Paper

Author:

Zahi N. Karam

…

Published in:

Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 22-27 May 2011, pp. 4512-4515.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

The focus of the 2010 NIST Speaker Recognition Evaluation (SRE) was the low false alarm regime of the detection error trade-off (DET) curve. This paper presents several approaches that specifically target this issue. It begins by highlighting the main problem with operating in the low-false alarm regime. Two sets of methods to tackle this issue are presented that require a large and diverse impostor set: the first set penalizes trials whose enrollment and test utterances are not nearest neighbors of each other while the second takes an adaptive score normalization approach similar to TopNorm and ATNorm.

READ LESS

Summary

Towards reduced false-alarms using cohorts

The MIT LL 2010 speaker recognition evaluation system: scalable language-independent speaker recognition

May 22, 2011

Conference Paper

Author:

Douglas E. Sturim

…

Published in:

Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 22-27 May 2011, pp. 5272-5275.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Research in the speaker recognition community has continued to address methods of mitigating variational nuisances. Telephone and auxiliary-microphone recorded speech emphasize the need for a robust way of dealing with unwanted variation. The design of recent 2010 NIST-SRE Speaker Recognition Evaluation (SRE) reflects this research emphasis. In this paper, we present the MIT submission applied to the tasks of the 2010 NIST-SRE with two main goals--language-independent scalable modeling and robust nuisance mitigation. For modeling, exclusive use of inner product-based and cepstral systems produced a language-independent computationally-scalable system. For robustness, systems that captured spectral and prosodic information, modeled nuisance subspaces using multiple novel methods, and fused scores of multiple systems were implemented. The performance of the system is presented on a subset of the NIST SRE 2010 core tasks.

READ LESS

Summary

The MIT LL 2010 speaker recognition evaluation system: scalable language-independent speaker recognition

USSS-MITLL 2010 human assisted speaker recognition

January 2, 2011

Conference Paper

Author:

Reva Schwartz

…

Published in:

Proc. IEEE ICASSP, 26 May 2011, pp. 5904-7.

Topic:

biometrics

R&D area:

Cyber Security and Information Sciences

R&D group:

Summary

The United States Secret Service (USSS) teamed with MIT Lincoln Laboratory (MIT/LL) in the US National Institute of Standards and Technology's 2010 Speaker Recognition Evaluation of Human Assisted Speaker Recognition (HASR). We describe our qualitative and automatic speaker comparison processes and our fusion of these processes, which are adapted from USSS casework. The USSS-MIT/LL 2010 HASR results are presented. We also present post-evaluation results. The results are encouraging within the resolving power of the evaluation, which was limited to enable reasonable levels of human effort. Future ideas and efforts are discussed, including new features and capitalizing on naive listeners.

READ LESS

Summary

USSS-MITLL 2010 human assisted speaker recognition

Graph-embedding for speaker recognition

September 30, 2010

Conference Paper

Author:

Zahi N. Karam

…

William M. Campbell

Published in:

INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, 26-30 September 2010, pp. 2742-2745.

Topic:

social network

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Popular methods for speaker classification perform speaker comparison in a high-dimensional space, however, recent work has shown that most of the speaker variability is captured by a low-dimensional subspace of that space. In this paper we examine whether additional structure in terms of nonlinear manifolds exist within the high-dimensional space. We will use graph embedding as a proxy to the manifold and show the use of the embedding in data visualization and exploration. ISOMAP will be used to explore the existence and dimension of the space. We also examine whether the manifold assumption can help in two classification tasks: data-mining and standard NIST speaker recognition evaluations (SRE). Our results show that the data lives on a manifold and that exploiting this structure can yield significant improvements on the data-mining task. The improvement in preliminary experiments on all trials of the NIST SRE Eval-06 core task are less but significant.

READ LESS

Summary

Graph-embedding for speaker recognition

Simple and efficient speaker comparison using approximate KL divergence

September 26, 2010

Conference Paper

Author:

William M. Campbell

…

Zahi N. Karam

Published in:

INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, 26-30 September 2010, pp. 362-365.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

We describe a simple, novel, and efficient system for speaker comparison with two main components. First, the system uses a new approximate KL divergence distance extending earlier GMM parameter vector SVM kernels. The approximate distance incorporates data-dependent mixture weights as well as the standard MAP-adapted GMM mean parameters. Second, the system applies a weighted nuisance projection method for channel compensation. A simple eigenvector method of training is presented. The resulting speaker comparison system is straightforward to implement and is computationally simple? only two low-rank matrix multiplies and an inner product are needed for comparison of two GMM parameter vectors. We demonstrate the approach on a NIST 2008 speaker recognition evaluation task. We provide insight into what methods, parameters, and features are critical for good performance.

READ LESS

Summary

Simple and efficient speaker comparison using approximate KL divergence

Weighted nuisance attribute projection

June 28, 2010

Conference Paper

Author:

William M. Campbell

Published in:

Odyssey 2010, the Speaker and Language Recognition Workshop, 28 June - 1 July 2010.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Nuisance attribute projection (NAP) has become a common method for compensation of channel effects, session variation, speaker variation, and general mismatch in speaker recognition. NAP uses an orthogonal projection to remove a nuisance subspace from a larger expansion space that contains the speaker information. Training the NAP subspace is based on optimizing pairwise distances to reduce intraspeaker variability and retain interspeaker variability. In this paper, we introduce a novel form of NAP called weighted NAP (WNAP) which significantly extends the current methodology. For WNAP, we propose a training criterion that incorporates two critical extensions to NAP variable metrics and instance-weighted training. Both an eigenvector and iterative method are proposed for solving the resulting optimization problem. The effectiveness of WNAP is shown on a NIST speaker recognition evaluation task where error rates are reduced by over 20%.

READ LESS

Summary

Weighted nuisance attribute projection

The application of statistical relational learning to a database of criminal and terrorist activity

April 29, 2010

Conference Paper

Author:

Brian W. Delaney

…

Published in:

SIAM Conf. on Data Mining, 29 April - 1 May 2010.

Topic:

social network

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

We apply statistical relational learning to a database of criminal and terrorist activity to predict attributes and event outcomes. The database stems from a collection of news articles and court records which are carefully annotated with a variety of variables, including categorical and continuous fields. Manual analysis of this data can help inform decision makers seeking to curb violent activity within a region. We use this data to build relational models from historical data to predict attributes of groups, individuals, or events. Our first example involves predicting social network roles within a group under a variety of different data conditions. Collective classification can be used to boost the accuracy under data poor conditions. Additionally, we were able to predict the outcome of hostage negotiations using models trained on previous kidnapping events. The overall framework and techniques described here are flexible enough to be used to predict a variety of variables. Such predictions could be used as input to a more complex system to recognize intent of terrorist groups or as input to inform human decision makers.

READ LESS

Summary

The application of statistical relational learning to a database of criminal and terrorist activity

Detection and simulation of scenarios with hidden Markov models and event dependency graphs

March 15, 2010

Conference Paper

Author:

William M. Campbell

…

Published in:

Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 15 March 2010, pp. 5434-5437.

Topic:

social network

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

The wide availability of signal processing and language tools to extract structured data from raw content has created a new opportunity for the processing of structured signals. In this work, we explore models for the simulation and recognition of scenarios - i.e., time sequences of structured data. For simulation, we construct two models - hidden Markov models (HMMs) and event dependency graphs. Combined, these two simulation methods allow the specification of dependencies in event ordering, simultaneous execution of multiple scenarios, and evolving networks of data. For scenario recognition, we consider the application of multi-grained HMMs. We explore, in detail, mismatch between training scenarios and simulated test scenarios. The methods are applied to terrorist scenario detection with a simulation coded by a subject matter expert.

READ LESS

Summary

Detection and simulation of scenarios with hidden Markov models and event dependency graphs

Speaker comparison with inner product discriminant functions

December 7, 2009

Conference Paper

Author:

William M. Campbell

…

Published in:

Neural Information Processing Symp., 7 December 2009.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Speaker comparison, the process of finding the speaker similarity between two speech signals, occupies a central role in a variety of applications - speaker verification, clustering, and identification. Speaker comparison can be placed in a geometric framework by casting the problem as a model comparison process. For a given speech signal, feature vectors are produced and used to adapt a Gaussian mixture model (GMM). Speaker comparison can then be viewed as the process of compensating and finding metrics on the space of adapted models. We propose a framework, inner product discriminant functions (IPDFs), which extends many common techniques for speaker comparison - support vector machines, joint factor analysis, and linear scoring. The framework uses inner products between the parameter vectors of GMM models motivated by several statistical methods. Compensation of nuisances is performed via linear transforms on GMM parameter vectors. Using the IPDF framework, we show that many current techniques are simple variations of each other. We demonstrate, on a 2006 NIST speaker recognition evaluation task, new scoring methods using IPDFs which produce excellent error rates and require significantly less computation than current techniques.

READ LESS

Summary

Speaker comparison with inner product discriminant functions

Variability compensated support vector machines applied to speaker verification

September 6, 2009

Conference Paper

Author:

Zahi N. Karam

…

William M. Campbell

Published in:

INTERSPEECH 2009, Proc. of the 10th Annual Conf. of the Internatinoal Speech Communication Association, 6-9 September 2009, pp. 1555-1558.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Speaker verification using SVMs has proven successful, specifically using the GSV Kernel [1] with nuisance attribute projection (NAP) [2]. Also, the recent popularity and success of joint factor analysis [3] has led to promising attempts to use speaker factors directly as SVM features [4]. NAP projection and the use of speaker factors with SVMs are methods of handling variability in SVM speaker verification: NAP by removing undesirable nuisance variability, and using the speaker factors by forcing the discrimination to be performed based on inter-speaker variability. These successes have led us to propose a new method we call variability compensated SVM (VCSVM) to handle both inter and intra-speaker variability directly in the SVM optimization. This is done by adding a regularized penalty to the optimization that biases the normal to the hyperplane to be orthogonal to the nuisance subspace or alternatively to the complement of the subspace containing the inter-speaker variability. This bias will attempt to ensure that inter-speaker variability is used in the recognition while intra-speaker variability is ignored. In this paper we present the theory and promising results on nuisance compensation.

READ LESS

Summary

Variability compensated support vector machines applied to speaker verification

Publications

Refine Results

By

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Showing Results