Publications

USSS-MITLL 2010 human assisted speaker recognition

January 2, 2011

Conference Paper

Author:

Reva Schwartz

…

Published in:

Proc. IEEE ICASSP, 26 May 2011, pp. 5904-7.

Topic:

biometrics

R&D area:

Cyber Security and Information Sciences

Summary

The United States Secret Service (USSS) teamed with MIT Lincoln Laboratory (MIT/LL) in the US National Institute of Standards and Technology's 2010 Speaker Recognition Evaluation of Human Assisted Speaker Recognition (HASR). We describe our qualitative and automatic speaker comparison processes and our fusion of these processes, which are adapted from USSS casework. The USSS-MIT/LL 2010 HASR results are presented. We also present post-evaluation results. The results are encouraging within the resolving power of the evaluation, which was limited to enable reasonable levels of human effort. Future ideas and efforts are discussed, including new features and capitalizing on naive listeners.

Forensic speaker recognition: a need for caution

March 1, 2009

Journal Article

Author:

Joseph P. Campbell Jr

…

Published in:

IEEE Signal Process. Mag., Vol. 26, No. 2, March 2009, pp. 95-103.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

Summary

There has long been a desire to be able to identify a person on the basis of his or her voice. For many years, judges, lawyers, detectives, and law enforcement agencies have wanted to use forensic voice authentication to investigate a suspect or to confirm a judgment of guilt or innocence. Challenges, realities, and cautions regarding the use of speaker recognition applied to forensic-quality samples are presented.

Beyond frame independence: parametric modelling of time duration in speaker and language recognition

September 22, 2008

Conference Paper

Author:

Alan V. McCree

…

Published in:

INTERSPEECH 2008, 22-26 September 2008, pp. 767-770.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Advanced RF Techniques and Systems

Summary

In this work, we address the question of generating accurate likelihood estimates from multi-frame observations in speaker and language recognition. Using a simple theoretical model, we extend the basic assumption of independent frames to include two refinements: a local correlation model across neighboring frames, and a global uncertainty due to train/test channel mismatch. We present an algorithm for discriminative training of the resulting duration model based on logistic regression combined with a bisection search. We show that using this model we can achieve state-of-the-art performance for the NIST LRE07 task. Finally, we show that these more accurate class likelihood estimates can be combined to solve multiple problems using Bayes' rule, so that we can expand our single parametric backend to replace all six separate back-ends used in our NIST LRE submission for both closed and open sets.

MIT Lincoln Laboratory multimodal person identification system in the CLEAR 2007 Evaluation

May 8, 2007

Conference Paper

Author:

Kevin Brady

Published in:

2nd Annual Classification of Events, Activities and Relationships/Rich Transcription Evaluations, 8-11 May 2007, pp. 240-247.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Advanced RF Techniques and Systems

Summary

This paper describes the MIT Lincoln Laboratory system used in the person identification task of the CLEAR 2007 Evaluation. The task is broken into audio, visual, and multimodal subtasks. The audio identification system uses both a GMM and an SVM subsystem, while the visual (face) identification system uses an appearance-based [Kernel] approach. The audio channels, originating from a microphone array, were preprocessed with beamforming and noise preprocessing.
