Publications

Refine Results

(Filters Applied) Clear All

R&D Areas

R&D Groups

Year

Items per page

The MIT-LL/IBM 2006 speaker recognition system: high-performance reduced-complexity recognition

April 1, 2007

Conference Paper

Author:

William M. Campbell

…

Published in:

Proc. 32nd IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, April 2007, pp. IV-217 - IV-220.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Many powerful methods for speaker recognition have been introduced in recent years--high-level features, novel classifiers, and channel compensation methods. A common arena for evaluating these methods has been the NIST speaker recognition evaluation (SRE). In the NIST SRE from 2002-2005, a popular approach was to fuse multiple systems based upon cepstral features and different linguistic tiers of high-level features. With enough enrollment data, this approach produced dramatic error rate reductions and showed conceptually that better performance was attainable. A drawback in this approach is that many high-level systems were being run independently requiring significant computational complexity and resources. In 2006, MIT Lincoln Laboratory focused on a new system architecture which emphasized reduced complexity. This system was a carefully selected mixture of high-level techniques, new classifier methods, and novel channel compensation techniques. This new system has excellent accuracy and has substantially reduced complexity. The performance and computational aspects of the system are detailed on a NIST 2006 SRE task.

READ LESS

Summary

The MIT-LL/IBM 2006 speaker recognition system: high-performance reduced-complexity recognition

Robust speaker recognition with cross-channel data: MIT-LL results on the 2006 NIST SRE auxiliary microphone task

April 1, 2007

Conference Paper

Author:

Douglas E. Sturim

…

Published in:

Proc. 32nd IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, April 2007, pp. IV-49 - IV-52.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

One particularly difficult challenge for cross-channel speaker verification is the auxiliary microphone task introduced in the 2005 and 2006 NIST Speaker Recognition Evaluations, where training uses telephone speech and verification uses speech from multiple auxiliary microphones. This paper presents two approaches to compensate for the effects of auxiliary microphones on the speech signal. The first compensation method mitigates session effects through Latent Factor Analysis (LFA) and Nuisance Attribute Projection (NAP). The second approach operates directly on the recorded signal with noise reduction techniques. Results are presented that show a reduction in the performance gap between telephone and auxiliary microphone data.

READ LESS

Summary

Robust speaker recognition with cross-channel data: MIT-LL results on the 2006 NIST SRE auxiliary microphone task

Auditory modeling as a basis for spectral modulation analysis with application to speaker recognition

January 31, 2007

Technical Report

Author:

Tianyu Tom Wang

…

Thomas F. Quatieri

Published in:

MIT Lincoln Laboratory Report TR-1119

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

This report explores auditory modeling as a basis for robust automatic speaker verification. Specifically, we have developed feature-extraction front-ends that incorporate (1) time-varying, level-dependent filtering, (2) variations in analysis filterbank size,and (3) nonlinear adaptation. Our methods are motivated both by a desire to better mimic auditory processing relative to traditional front-ends (e.g., the mel-cepstrum) as well as by reported gains in automatic speech recognition robustness exploiting similar principles. Traditional mel-cepstral features in automatic speaker recognition are derived from ~20 invariant band-pass filter weights, thereby discarding temporal structure from phase. In contrast, cochlear frequency decomposition can be more precisely modeled as the output of ~3500 time-varying, level-dependent filters. Auditory signal processing is therefore more resolved in frequency than mel-cepstral analysis and also derives temporal information. Furthermore, loss of level-dependence has been suggested to reduce human speech reception in adverse acoustic environments. We were thus motivated to employ a recently proposed level-dependent compressed gammachirp filterbank in feature extraction as well as vary the number of filters or filter weights to improve frequency resolution. We are also simulating nonlinear adaptation models of inner hair cell function along the basilar membrane that presumably mimic temporal masking effects. Auditory-based front-ends are being evaluated with the Lincoln Laboratory Gaussian mixture model recognizer on the TIMIT database under clean and noisy (additive Gaussian white noise) conditions. Preliminary results of features derived from our auditory models suggest that they provide complementary information to the mel-cepstrum under clean and noisy conditions, resulting in speaker recognition performance improvements.

READ LESS

Summary

Auditory modeling as a basis for spectral modulation analysis with application to speaker recognition

Automatic language recognition via spectral and token based approaches

January 1, 2007

Book Chapter

Author:

Douglas A. Reynolds

…

Published in:

Chapter 41 in Springer Handbook of Speech Processing and Communication, 2007, pp. 811-24.

Topic:

language recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Automatic language recognition from speech consists of algorithms and techniques that model and classify the language being spoken. Current state-of-the-art language recognition systems fall into two broad categories: spectral- and token-sequence-based approaches. In this chapter, we describe algorithms for extracting features and models representing these types of language cues and systems for making recognition decisions using one or more of these language cues. A performance assessment of these systems is also provided, in terms of both accuracy and computation considerations, using the National Institute of Science and Technology (NIST) language recognition evaluation benchmarks.

READ LESS

Summary

Automatic language recognition via spectral and token based approaches

Practical attack graph generation for network defense

December 11, 2006

Conference Paper

Author:

Kyle W. Ingols

…

Published in:

Proc. of the 22nd Annual Computer Security Applications Conf., IEEE, 11-15 December 2006, pp.121-130.

Topic:

attack graphs

R&D area:

Cyber Security and Information Sciences

R&D group:

Summary

Attack graphs are a valuable tool to network defenders, illustrating paths an attacker can use to gain access to a targeted network. Defenders can then focus their efforts on patching the vulnerabilities and configuration errors that allow the attackers the greatest amount of access. We have created a new type of attack graph, the multiple-prerequisite graph, that scales nearly linearly as the size of a typical network increases. We have built a prototype system using this graph type. The prototype uses readily available source data to automatically compute network reachability, classify vulnerabilities, build the graph, and recommend actions to improve network security. We have tested the prototype on an operational network with over 250 hosts, where it helped to discover a previously unknown configuration error. It has processed complex simulated networks with over 50,000 hosts in under four minutes.

READ LESS

Summary

Practical attack graph generation for network defense

Experimental facility for measuring the impact of environmental noise and speaker variation on speech-to-speech translation devices

December 10, 2006

Conference Paper

Author:

Douglas A. Jones

…

Published in:

Proc. IEEE Spoken Language Technology Workshop, 10-13 December 2006, pp. 250-253.

Topic:

machine translation

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

We describe the construction and use of a laboratory facility for testing the performance of speech-to-speech translation devices. Approximately 1500 English phrases from various military domains were recorded as spoken by each of 30 male and 12 female English speakers with variation in speaker accent, for a total of approximately 60,000 phrases available for experimentation. We describe an initial experiment using the facility which shows the impact of environmental noise and speaker variability on phrase recognition accuracy for two commercially available oneway speech-to-speech translation devices configured for English-to-Arabic.

READ LESS

Summary

Experimental facility for measuring the impact of environmental noise and speaker variation on speech-to-speech translation devices

The security of OpenBSD: milk or wine?

December 1, 2006

Journal Article

Author:

James A. Ozment

…

Stuart E. Schechter

Published in:

;login:, Vol. 31, No. 6, December 2006, pp. 26-32.

Topic:

software

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Purchase a fine wine, place it in a cellar, and wait a few years: The aging will have resulted in a delightful beverage, a product far better than the original. Purchase a gallon of milk, place it in a cellar, and wait a few years. You will be sorry. We know how the passing of time affects milk and wine, but how does aging affect the security of software? Many in the security research community have criticized software developers both for releasing software with so many vulnerabilities and for the lack of any apparent improvement in this software over time. However, critics have lacked quantitative evidence that applying effort over time will result in software with fewer vulnerabilities. In short, we don't know whether software security is destined to age like milk or has the potential to become wine. We thus investigated whether or not the rate at which vulnerabilities are reported in OpenBSD is decreasing over time.

READ LESS

Summary

The security of OpenBSD: milk or wine?

An efficient graph search decoder for phrase-based statistical machine translation

November 28, 2006

Conference Paper

Author:

Brian W. Delaney

…

Published in:

Int. Workshop on Spoken Language Translation, 28 November 2006.

Topic:

machine translation

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

In this paper we describe an efficient implementation of a graph search algorithm for phrase-based statistical machine translation. Our goal was to create a decoder that could be used for both our research system and a real-time speech-to-speech machine translation demonstration system. The search algorithm is based on a Viterbi graph search with an A* heuristic. We were able to increase the speed of our decoder substantially through the use of on-the-fly beam pruning and other algorithmic enhancements. The decoder supports a variety of reordering constraints as well as arbitrary n-gram decoding. In addition, we have implemented disk based translation models and a messaging interface to communicate with other components for use in our real-time speech translation system.

READ LESS

Summary

An efficient graph search decoder for phrase-based statistical machine translation

The JHU Workshop 2006 IWSLT System

November 27, 2006

Conference Paper

Author:

Wade Shen

…

Published in:

Int. Workshop on Spoken Language Translation, IWSLT, 27-28 November 2006.

Topic:

machine translation

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

This paper describes the SMT we built during the 2006 JHU Summer Workshop for the IWSLT 2006 evaluation. Our effort focuses on two parts of the speech translation problem: 1) efficient decoding of word lattices and 2) novel applications of factored translation models to IWSLT-specific problems. In this paper, we present results from the open-track Chinese-to-English condition. Improvements of 5-10% relative BLEU are obtained over a high performing baseline. We introduce a new open-source decoder that implements the state-of-the-art in statistical machine translation.

READ LESS

Summary

The JHU Workshop 2006 IWSLT System

The MIT-LL/AFRL IWSLT-2006 MT system

November 27, 2006

Conference Paper

Author:

Wade Shen

…

Published in:

Proc. Int. Workshop on Spoken Language Translation, IWSLT, 27-28 November 2006.

Topic:

machine translation

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

The MIT-LL/AFRL MT system is a statistical phrase-based translation system that implements many modern SMT training and decoding techniques. Our system was designed with the long-term goal of dealing with corrupted ASR input and limited amounts of training data for speech-to-speech MT applications. This paper will discuss the architecture of the MIT-LL/AFRL MT system, improvements over our 2005 system, and experiments with manual and ASR transcription data that were run as part of the IWSLT-2006 evaluation campaign.

READ LESS

Summary

The MIT-LL/AFRL IWSLT-2006 MT system

Publications

Refine Results

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Showing Results