Publications


Large-scale analysis of formant frequency estimation variability in conversational telephone speech

Published in:
INTERSPEECH 2009, 6-10 September 2009.

Summary

We quantify how the telephone channel and regional dialect influence formant estimates extracted with Wavesurfer from spontaneous conversational speech from over 3,600 native American English speakers. To the best of our knowledge, this is the largest-scale study on this topic. We found that F1 estimates are higher over cellular channels than over landline channels, while F2 generally shows the opposite trend. We also characterized vowel shift trends in the northern United States and compared them with the Northern Cities Chain Shift (NCCS). Our analysis is useful in forensic applications, where it is important to distinguish among speaker, dialect, and channel characteristics.

The MIT Lincoln Laboratory 2008 speaker recognition system

Summary

In recent years, methods for modeling and mitigating variational nuisances have been introduced and refined. A primary emphasis of the NIST 2008 Speaker Recognition Evaluation (SRE) was to greatly expand the use of auxiliary microphones. This introduced additional channel variation, which has historically been a challenge for speaker verification systems. In this paper we present the MIT Lincoln Laboratory speaker recognition system applied to the task in the NIST 2008 SRE. Our approach during the evaluation was two-fold: 1) utilize recent advances in variational nuisance modeling (latent factor analysis and nuisance attribute projection) to allow our spectral speaker verification systems to better compensate for the channel variation introduced, and 2) fuse systems targeting the different linguistic tiers of information, high and low. The performance of the system is presented on the NIST 2008 SRE task. Post-evaluation analysis is conducted on the subtask in which interview microphones are present.

Automatic registration of LIDAR and optical images of urban scenes

Published in:
CVPR 2009, IEEE Conf. on Computer Vision and Pattern Recognition, 20-25 June 2009, pp. 2639-2646.

Summary

Fusion of 3D laser radar (LIDAR) imagery and aerial optical imagery is an efficient method for constructing 3D virtual reality models. One difficult aspect of creating such models is registering the optical image with the LIDAR point cloud, which is characterized as a camera pose estimation problem. We propose a novel application of mutual information registration methods, which exploits the statistical dependency, in urban scenes, between optical appearance and measured LIDAR elevation. We utilize the well-known downhill simplex optimization to infer camera pose parameters. We discuss three methods for measuring mutual information between LIDAR imagery and optical imagery. Utilization of OpenGL and graphics hardware in the optimization process yields registration times dramatically lower than previous methods. Using an initial registration comparable to GPS/INS accuracy, we demonstrate the utility of our algorithm with a collection of urban images and present 3D models created with the fused imagery.
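The mutual-information criterion at the heart of this registration approach can be sketched in a few lines. The following is a toy illustration, not the paper's implementation: it estimates MI (in bits) from a joint histogram of two quantized 1-D sequences standing in for co-registered LIDAR elevation and optical intensity samples.

```python
import math
from collections import Counter

def mutual_information(a, b, bins=8):
    """Estimate MI (in bits) between two equal-length sequences by
    quantizing each into `bins` levels and forming a joint histogram."""
    def quantize(xs):
        lo, hi = min(xs), max(xs)
        span = (hi - lo) or 1.0
        return [min(int((x - lo) / span * bins), bins - 1) for x in xs]
    qa, qb = quantize(a), quantize(b)
    n = len(qa)
    joint = Counter(zip(qa, qb))    # joint histogram counts
    pa, pb = Counter(qa), Counter(qb)  # marginal histogram counts
    mi = 0.0
    for (i, j), c in joint.items():
        pij = c / n
        # p(i,j) * log2( p(i,j) / (p(i) p(j)) ), with counts rescaled by n
        mi += pij * math.log2(pij * n * n / (pa[i] * pb[j]))
    return mi

# A perfectly dependent pair has higher MI than a pair with a constant partner.
dep = mutual_information([1, 2, 3, 4] * 10, [2, 4, 6, 8] * 10)
ind = mutual_information([1, 2, 3, 4] * 10, [5, 5, 5, 5] * 10)
```

Registration would then maximize such a score over camera pose parameters (in the paper, via downhill simplex); here the dependent pair scores 2 bits while the constant pair scores 0.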

Compressed sensing arrays for frequency-sparse signal detection and geolocation

Published in:
Proc. of the 2009 DoD High Performance Computing Modernization Program Users Group Conf., HPCMP-UGC, 15 June 2009, pp. 297-301.

Summary

Compressed sensing (CS) can be used to monitor very wide bands when the received signals are sparse in some basis. We have developed a compressed sensing receiver architecture with the ability to detect, demodulate, and geolocate signals that are sparse in frequency. In this paper, we evaluate detection, reconstruction, and angle of arrival (AoA) estimation via Monte Carlo simulation and find that, using a linear 4-sensor array and undersampling by a factor of 8, we achieve near-perfect detection when the received signals occupy up to 5% of the bandwidth being monitored and have an SNR of 20 dB or higher. The signals in our band of interest include frequency-hopping signals, which are detected due to their consistent AoA. We compare CS array performance using sensor-frequency and space-frequency bases, and determine that using the sensor-frequency basis is more practical for monitoring wide bands. Though it requires that the received signals be sparse in frequency, the sensor-frequency basis still provides spatial information and is not affected by correlation between uncompressed basis vectors.
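As a toy illustration of what "sparse in frequency" buys the detection step (this is not the CS receiver itself, which works from undersampled measurements), the sketch below flags frequency-sparse tones by thresholding a naive DFT magnitude spectrum; the tone bins, amplitudes, and threshold are all invented for the example.

```python
import cmath
import math

def dft_magnitudes(x):
    """Naive DFT magnitude spectrum (O(n^2), fine for a toy example)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]

def detect_sparse_tones(x, rel_threshold=0.5):
    """Report DFT bins whose magnitude exceeds rel_threshold * peak magnitude."""
    mags = dft_magnitudes(x)
    peak = max(mags)
    return [k for k, m in enumerate(mags) if m >= rel_threshold * peak]

n = 64
# Two real tones centered on bins 5 and 12 -- a frequency-sparse signal.
sig = [math.cos(2 * math.pi * 5 * t / n) + 0.8 * math.cos(2 * math.pi * 12 * t / n)
       for t in range(n)]
# For a real signal each tone also appears in its conjugate-mirror bin (n - k).
bins = detect_sparse_tones(sig)
```

Both tones (and their mirror bins 52 and 59) stand out cleanly against an otherwise empty spectrum, which is the sparsity property the CS receiver exploits.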

Polyphase nonlinear equalization of time-interleaved analog-to-digital converters

Published in:
IEEE J. Sel. Top. Sig. Process., Vol. 3, No. 3, June 2009, pp. 362-373.

Summary

As the demand for higher data rates increases, commercial analog-to-digital converters (ADCs) are more commonly being implemented with multiple on-chip converters whose outputs are time-interleaved. The distortion generated by time-interleaved ADCs is now not only a function of the nonlinear behavior of the constituent circuitry, but also mismatches associated with interleaving multiple output streams. To mitigate distortion generated by time-interleaved ADCs, we have developed a polyphase NonLinear EQualizer (pNLEQ) which is capable of simultaneously mitigating distortion generated by both the on-chip circuitry and mismatches due to time interleaving. In this paper, we describe the pNLEQ architecture and present measurements of its performance.
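The interleaving-plus-distortion problem, and the per-phase correction idea behind a polyphase equalizer, can be sketched with a toy 2-way interleaved converter whose sub-ADCs add different quadratic distortions. The distortion coefficients here are invented, and the "equalizer" is a first-order inverse that assumes those coefficients are known; a real pNLEQ estimates its correction from training data.

```python
import math

def interleaved_adc(x, coeffs=(0.05, 0.08)):
    """Toy 2-way time-interleaved ADC: even samples pass through sub-ADC 0,
    odd samples through sub-ADC 1, each adding its own quadratic distortion."""
    return [v + coeffs[i % 2] * v * v for i, v in enumerate(x)]

def pnleq(y, coeffs=(0.05, 0.08)):
    """Polyphase equalizer sketch: subtract each phase's estimated quadratic
    distortion term (first-order inverse with known coefficients)."""
    return [v - coeffs[i % 2] * v * v for i, v in enumerate(y)]

x = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
y = interleaved_adc(x)          # distorted, phase-dependent output
z = pnleq(y)                    # per-phase corrected output
err_before = max(abs(a - b) for a, b in zip(x, y))
err_after = max(abs(a - b) for a, b in zip(x, z))
```

Because the correction is applied per interleave phase, each sub-ADC's distinct distortion is addressed separately, which is the structural point of the polyphase architecture; the residual error after correction is an order of magnitude smaller than before.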

Machine translation for government applications

Published in:
Lincoln Laboratory Journal, Vol. 18, No. 1, June 2009, pp. 41-53.

Summary

The idea of a mechanical process for converting one human language into another can be traced to a letter written by René Descartes in 1629, and after nearly 400 years, this vision has not been fully realized. Machine translation (MT) using digital computers has been a grand challenge for computer scientists, mathematicians, and linguists since the first international conference on MT was held at the Massachusetts Institute of Technology in 1952. Currently, Lincoln Laboratory is achieving success in a highly focused research program that specializes in developing speech translation technology for limited language resource domains and in adapting foreign-language proficiency standards for MT evaluation. Our specialized research program is situated within a general framework for multilingual speech and text processing for government applications.

Advocate: a distributed architecture for speech-to-speech translation

Published in:
Lincoln Laboratory Journal, Vol. 18, No. 1, June 2009, pp. 54-65.

Summary

Advocate is a set of communications application programming interfaces and service wrappers that serve as a framework for creating complex and scalable real-time software applications from component processing algorithms. Advocate can be used for a variety of distributed processing applications, but was initially designed to use existing speech processing and machine translation components in the rapid construction of large-scale speech-to-speech translation systems. Many such speech-to-speech translation applications require real-time processing, and Advocate provides this speed with low-latency communication between services.

Advocate: a distributed voice-oriented computing architecture

Published in:
North American Chapter of the Association for Computational Linguistics - Human Language Technologies Conf. (NAACL HLT 2009), 31 May - 5 June 2009.

Summary

Advocate is a lightweight and easy-to-use computing architecture that supports real-time, voice-oriented computing. It is designed to allow the combination of multiple speech and language processing components to create cohesive distributed applications. It is scalable, supporting local processing of all NLP/speech components when sufficient processing resources are available on one machine, or fully distributed/networked processing over an arbitrarily large compute structure when more compute resources are needed. Advocate is designed to operate in a large distributed test-bed in which an arbitrary number of NLP/speech services interface with an arbitrary number of Advocate client applications. In this configuration, each Advocate client application employs automatic service discovery, calling services as required.
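The service-discovery pattern the abstract describes can be sketched as a minimal in-process registry. This is an illustrative stand-in, not Advocate's actual interface: the registry API and the service names `asr` and `mt` are invented for the example.

```python
class ServiceRegistry:
    """Toy stand-in for automatic service discovery: services register
    under a capability name; clients look them up and call them."""
    def __init__(self):
        self._services = {}

    def register(self, name, handler):
        self._services.setdefault(name, []).append(handler)

    def discover(self, name):
        if name not in self._services:
            raise LookupError(f"no service provides {name!r}")
        return self._services[name][0]  # naive policy: first registered wins

registry = ServiceRegistry()
registry.register("asr", lambda audio: f"transcript of {audio}")
registry.register("mt", lambda text: f"translation of {text}")

# A client chains discovered services, as the front half of a
# speech-to-speech pipeline might: recognize, then translate.
asr = registry.discover("asr")
mt = registry.discover("mt")
result = mt(asr("utterance.wav"))
```

In a real deployment the registry and handlers would live on different machines behind the low-latency communication layer the abstract describes; the lookup-then-call structure is the same.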

Modeling and detection techniques for counter-terror social network analysis and intent recognition

Summary

In this paper, we describe our approach and initial results on modeling, detection, and tracking of terrorist groups and their intents based on multimedia data. While research on automated information extraction from multimedia data has yielded significant progress in areas such as the extraction of entities, links, and events, less progress has been made in the development of automated tools for analyzing the results of information extraction to "connect the dots." Hence, our Counter-Terror Social Network Analysis and Intent Recognition (CT-SNAIR) work focuses on development of automated techniques and tools for detection and tracking of dynamically-changing terrorist networks as well as recognition of capability and potential intent. In addition to obtaining and working with real data for algorithm development and test, we have a major focus on modeling and simulation of terrorist attacks based on real information about past attacks. We describe the development and application of a new Terror Attack Description Language (TADL), which is used as a basis for modeling and simulation of terrorist attacks. Examples are shown which illustrate the use of TADL and a companion simulator based on a Hidden Markov Model (HMM) structure to generate transactions for attack scenarios drawn from real events. We also describe our techniques for generating realistic background clutter traffic to enable experiments to estimate performance in the presence of a mix of data. An important part of our effort is to produce scenarios and corpora for use in our own research, which can be shared with a community of researchers in this area. We describe our scenario and corpus development, including specific examples from the September 2004 bombing of the Australian embassy in Jakarta and a fictitious scenario which was developed in a prior project for research in social network analysis. The scenarios can be created by subject matter experts using a graphical editing tool.
Given a set of time ordered transactions between actors, we employ social network analysis (SNA) algorithms as a filtering step to divide the actors into distinct communities before determining intent. This helps reduce clutter and enhances the ability to determine activities within a specific group. For modeling and simulation purposes, we generate random networks with structures and properties similar to real-world social networks. Modeling of background traffic is an important step in generating classifiers that can separate harmless activities from suspicious activity. An algorithm for recognition of simulated potential attack scenarios in clutter based on Support Vector Machine (SVM) techniques is presented. We show performance examples, including probability of detection versus probability of false alarm tradeoffs, for a range of system parameters.
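The HMM-style transaction simulation described above can be sketched as follows. All states, transaction types, and probabilities here are invented for illustration and are not drawn from TADL: hidden attack-phase states emit observable transaction types, and the simulator walks the chain to generate a time-ordered sequence.

```python
import random

# Hypothetical attack-phase states and emissions, for illustration only.
TRANS = {  # state -> [(next_state, prob), ...]
    "planning":  [("planning", 0.6), ("financing", 0.4)],
    "financing": [("financing", 0.5), ("execution", 0.5)],
    "execution": [("execution", 1.0)],
}
EMIT = {  # state -> [(transaction_type, prob), ...]
    "planning":  [("meeting", 0.7), ("phone_call", 0.3)],
    "financing": [("wire_transfer", 0.6), ("purchase", 0.4)],
    "execution": [("travel", 0.5), ("rental", 0.5)],
}

def sample(pairs, rng):
    """Draw one item from a list of (item, probability) pairs."""
    r, acc = rng.random(), 0.0
    for item, p in pairs:
        acc += p
        if r <= acc:
            return item
    return pairs[-1][0]

def simulate(n_steps, seed=0):
    """Generate a time-ordered transaction sequence from the toy HMM."""
    rng = random.Random(seed)
    state, seq = "planning", []
    for _ in range(n_steps):
        seq.append(sample(EMIT[state], rng))   # emit an observable transaction
        state = sample(TRANS[state], rng)      # advance the hidden phase
    return seq

transactions = simulate(10)
```

Sequences generated this way, mixed with simulated background clutter, are the kind of input on which an SVM-based detector such as the one described here would be trained and scored.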

Forensic speaker recognition: a need for caution

Summary

There has long been a desire to be able to identify a person on the basis of his or her voice. For many years, judges, lawyers, detectives, and law enforcement agencies have wanted to use forensic voice authentication to investigate a suspect or to confirm a judgment of guilt or innocence. Challenges, realities, and cautions regarding the use of speaker recognition applied to forensic-quality samples are presented.