Publications

Multisensor dynamic waveform fusion

Published in:
Proc. 32nd Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, April 2007, pp. IV-577 - IV-580.

Summary

Speech communication is significantly more difficult in severe acoustic background noise environments, especially when low-rate speech coders are used. Non-acoustic sensors, such as radar sensors, vibrometers, and bone-conduction microphones, offer significant potential in these situations. We extend previous work on fixed waveform fusion from multiple sensors to an optimal dynamic waveform fusion algorithm that minimizes both additive noise and signal distortion in the estimated speech signal. We show that a minimum mean squared error (MMSE) waveform matching criterion results in a generalized multichannel Wiener filter, and that this filter will simultaneously perform waveform fusion, noise suppression, and crosschannel noise cancellation. Formal intelligibility and quality testing demonstrate significant improvement from this approach.
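The MMSE criterion mentioned in the summary can be illustrated with the textbook multichannel Wiener solution. This is a generic sketch, not the paper's exact formulation: the stacked sensor vector $\mathbf{y}$, the desired speech sample $s$, and the weight vector $\mathbf{w}$ are notation assumed here, and the dynamic (time-varying) aspect of the published algorithm is not shown.

```latex
% Assumed notation: y = stacked observations from all sensors,
% s = desired clean-speech sample, w = fusion filter weights.
\begin{align}
  \hat{s} &= \mathbf{w}^{H}\mathbf{y}, \\
  J(\mathbf{w}) &= E\!\left[\,\lvert s - \mathbf{w}^{H}\mathbf{y}\rvert^{2}\right], \\
  \nabla_{\mathbf{w}^{*}} J = \mathbf{0}
  \;&\Longrightarrow\;
  \mathbf{R}_{yy}\,\mathbf{w} = \mathbf{r}_{ys}, \\
  \mathbf{w}_{\mathrm{MMSE}} &= \mathbf{R}_{yy}^{-1}\,\mathbf{r}_{ys},
  \qquad
  \mathbf{R}_{yy} = E\!\left[\mathbf{y}\mathbf{y}^{H}\right],\quad
  \mathbf{r}_{ys} = E\!\left[\mathbf{y}\,s^{*}\right].
\end{align}
```

Because $\mathbf{R}_{yy}$ contains both the per-sensor noise powers and the cross-sensor correlations, a single weight vector of this form weights the sensors against each other, attenuates noisy channels, and exploits noise that is correlated across channels, which is consistent with the three effects the summary attributes to the filter.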

The MIT-LL/IBM 2006 speaker recognition system: high-performance reduced-complexity recognition

Published in:
Proc. 32nd IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, April 2007, pp. IV-217 - IV-220.

Summary

Many powerful methods for speaker recognition have been introduced in recent years--high-level features, novel classifiers, and channel compensation methods. A common arena for evaluating these methods has been the NIST speaker recognition evaluation (SRE). In the NIST SRE from 2002-2005, a popular approach was to fuse multiple systems based upon cepstral features and different linguistic tiers of high-level features. With enough enrollment data, this approach produced dramatic error rate reductions and showed conceptually that better performance was attainable. A drawback of this approach is that many high-level systems were being run independently, requiring significant computational complexity and resources. In 2006, MIT Lincoln Laboratory focused on a new system architecture that emphasized reduced complexity. This system was a carefully selected mixture of high-level techniques, new classifier methods, and novel channel compensation techniques. This new system has excellent accuracy and substantially reduced complexity. The performance and computational aspects of the system are detailed on a NIST 2006 SRE task.

Triage framework for resource conservation in a speaker identification system

Published in:
Proc. 32nd IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, April 2007, pp. IV-69 - IV-72.

Summary

We present a novel framework for triaging (prioritizing and discarding) data to conserve resources for a speaker identification (SID) system. Our work is motivated by applications that require a SID system to process an overwhelming volume of audio data. We design a triage filter whose goal is to conserve recognizer resources while preserving relevant content. We propose triage methods that use signal quality assessment tools, a scaled-down version of the main recognizer itself, and a fusion of these measures. We define a new precision-based measure of effectiveness for our triage framework. Our experimental results with the 35-speaker tactical SID corpus bear out the validity of our approach.

pMatlab: parallel Matlab library for signal processing applications

Published in:
ICASSP, 32nd IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, April 2007, pp. IV-1189 - IV-1192.

Summary

MATLAB is one of the most commonly used languages for scientific computing, with approximately one million users worldwide. At MIT Lincoln Laboratory, MATLAB is used by technical staff to develop sensor processing algorithms. MATLAB's popularity is based on the availability of high-level abstractions that reduce code development time. Due to the compute-intensive nature of scientific computing, these applications often require long running times and would benefit greatly from the increased performance offered by parallel computing. pMatlab implements partitioned global address space (PGAS) support via standard operator overloading techniques. The core data structures in pMatlab are distributed arrays and maps, which simplify parallel programming by removing the need for explicit message passing. This paper presents the pMatlab design and results for the HPC Challenge benchmark suite. Additionally, two case studies of pMatlab use are described.
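The idea of a map plus a distributed array behind overloaded indexing can be sketched in a few lines. The example below is written in Python, not MATLAB, and does not reproduce the pMatlab API: `block_map`, `DistributedArray`, and the block distribution itself are hypothetical names and choices made for illustration, and all "processes" are simulated inside one interpreter.

```python
def block_map(n, num_procs):
    """Assumed block distribution: return one (start, stop) global index
    range per process, spreading any remainder over the first ranks."""
    base, extra = divmod(n, num_procs)
    ranges, start = [], 0
    for rank in range(num_procs):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


class DistributedArray:
    """Toy 1-D array whose storage is split into per-process local parts.

    Indexing is overloaded so callers use global indices and never
    name a process or send a message themselves.
    """

    def __init__(self, n, num_procs):
        self.ranges = block_map(n, num_procs)
        # One local buffer per simulated process.
        self.local = [[0] * (stop - start) for start, stop in self.ranges]

    def owner(self, i):
        """Return the rank that holds global index i."""
        for rank, (start, stop) in enumerate(self.ranges):
            if start <= i < stop:
                return rank
        raise IndexError(i)

    def __getitem__(self, i):
        rank = self.owner(i)
        return self.local[rank][i - self.ranges[rank][0]]

    def __setitem__(self, i, value):
        rank = self.owner(i)
        self.local[rank][i - self.ranges[rank][0]] = value


arr = DistributedArray(10, 4)
arr[7] = 42          # global index; the map decides which part is touched
```

In a real PGAS library the lookup in `owner` would decide whether an access is local or needs communication; here it only selects a Python list.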

Coverage maximization using dynamic taint tracing

Published in:
MIT Lincoln Laboratory Report TR-1112

Summary

We present COMET, a system that automatically assembles a test suite for a C program to improve line coverage, and give initial results for a prototype implementation. COMET works dynamically, running the program under a variety of instrumentations in a feedback loop that adds new inputs to an initial corpus with each iteration. One instrumentation in particular is crucial to the success of this approach: dynamic taint tracing. Inputs are labeled as tainted at the byte level and all read/write pairs in the program are augmented to track the flow of taint between memory objects. This allows COMET to determine from which bytes of which inputs the variables in conditions derive, thereby dramatically narrowing the search over inputs necessary to expose new code. On a test set of 13 example programs, COMET improves upon the level of coverage reached in random testing by an average of 23% relative, takes only about twice the time, and requires a tiny fraction of the number of inputs to do so.
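Byte-level taint tracing of the kind described above can be illustrated with a toy model. This is a sketch under stated assumptions, not COMET itself: COMET instruments C programs, whereas this example is Python, and `read_input`, `add`, `const`, `branch`, and `program` are hypothetical helpers invented for the illustration. Each value is paired with the set of input byte offsets it derives from, and each condition records that set.

```python
def read_input(data):
    """Label every input byte with its own offset as its taint."""
    return [(b, frozenset([i])) for i, b in enumerate(data)]


def const(value):
    """A constant carries no taint."""
    return (value, frozenset())


def add(a, b):
    """Arithmetic propagates the union of the operands' taint sets."""
    return (a[0] + b[0], a[1] | b[1])


def branch(cond_log, name, lhs, rhs):
    """Evaluate lhs == rhs and log which input bytes the condition depends on."""
    taken = lhs[0] == rhs[0]
    cond_log.append((name, taken, sorted(lhs[1] | rhs[1])))
    return taken


def program(data, cond_log):
    """Toy program under test (hypothetical), with two nested conditions."""
    inp = read_input(data)
    checksum = add(inp[0], inp[1])
    if branch(cond_log, "magic", inp[3], const(0x7F)):
        if branch(cond_log, "sum", checksum, const(10)):
            return "deep"
        return "shallow"
    return "rejected"


log = []
outcome = program(bytes([1, 2, 3, 4]), log)
# log now says the untaken "magic" branch depends only on input byte 3,
# so a search for new coverage needs to mutate that byte alone.
```

The narrowing effect is visible in the log: instead of mutating all four bytes at random, a feedback loop could target byte 3 first, and then bytes 0 and 1 once the "sum" condition is reached.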

Analysis of operational alternatives to the Terminal Doppler Weather Radar (TDWR)

Published in:
MIT Lincoln Laboratory Report ATC-332

Summary

Possible alternatives to the Terminal Doppler Weather Radar (TDWR) are assessed. We consider both the low altitude wind shear detection service provided by TDWR and its role in reducing weather-related airport delays through its input to the Integrated Terminal Weather System (ITWS). Airborne predictive wind shear (PWS) radars do not provide the broad area situational awareness needed to proactively reroute aircraft away from the affected runways. We considered in detail the alternative of using the ASR-9 Weather Systems Processor (WSP) and NEXRAD in lieu of TDWR. An objective metric for wind shear detection capability was calculated for each of these radars at all TDWR equipped airports. TDWR was uniformly superior by this metric, and at a number of the airports, the ASR-9/NEXRAD alternative scored so low as to raise questions whether it would be operationally acceptable. To assess airport weather delay reduction impact, we compared the accuracy of the high-benefit ITWS "Terminal Winds" product with and without TDWR input. Removal of the TDWR data would have increased the mean estimate error by a factor of 3 near the surface.

The digital focal plane array (DFPA) architecture for data processing "on-chip"

Published in:
2007 Meeting of the Military Sensing Symposia (MSS) Specialty Group on Camouflage, Concealment & Deception; Passive Sensors; Detectors; and Materials, 5-9 February 2007.

Summary

The digital focal plane array (DFPA) project seeks to develop readout integrated circuits (ROICs) utilizing aggressively scaled and commercially available CMOS. Along with focal plane scaling and readout robustness benefits, the DFPA architecture provides a very simple way to implement processing algorithms directly on image data, in real-time, and prior to read-out of the data to an external digitizer or computer. In principle, almost any linear image processing filter kernel can be convolved with the scene image prior to readout. The useful size of the filter kernel is only limited by the size of the DFPA. Time domain filters can also be implemented on the ROIC to accomplish digital time domain integration (TDI) or change detection algorithms. The unique architecture can achieve the processing capability without the use of traditional digital adders or multipliers, like those used in most signal processors. Instead, a DFPA manipulates sequential digital counters under every pixel in a unique way to achieve the desired functionality. A non-addressable readout scheme is used for data transfer in four possible directions across the array. Although we are currently targeting longwave infrared (LWIR) applications, the approach can be potentially applied to any imaging application in any band.
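One way a counter-per-pixel array could apply a filter kernel without multipliers is to count photo-pulses up or down for a time proportional to each kernel weight and to shift the counter contents between integrations. The sketch below models that idea in Python. It is an assumption-laden illustration, not the DFPA design: the summary does not give the actual sequencing, and `shift`, `integrate`, `filter_on_array`, integer kernel weights, and wrap-around at the array edges are all choices made here for simplicity.

```python
def shift(counters, dr, dc):
    """Move every counter value by (dr, dc) pixels.

    Stands in for repeated one-pixel transfers in one of four directions.
    Wrap-around at the edges is a simplifying assumption.
    """
    rows, cols = len(counters), len(counters[0])
    return [[counters[(r - dr) % rows][(c - dc) % cols] for c in range(cols)]
            for r in range(rows)]


def integrate(counters, scene, weight):
    """Count up (weight > 0) or down (weight < 0), one step per photo-pulse,
    for |weight| integration periods. Only counting is used, no multiplier."""
    step = 1 if weight > 0 else -1
    for _ in range(abs(weight)):
        for r, row in enumerate(scene):
            for c, pulses in enumerate(row):
                for _ in range(pulses):
                    counters[r][c] += step


def filter_on_array(scene, taps):
    """Apply integer kernel taps {(dr, dc): weight} using shifts and counting.

    Result: out[r][c] = sum over taps of weight * scene[r + dr][c + dc],
    with indices taken modulo the array size.
    """
    rows, cols = len(scene), len(scene[0])
    counters = [[0] * cols for _ in range(rows)]
    pos = (0, 0)
    for (dr, dc), weight in taps.items():
        # Park each running sum under the scene pixel this tap should sample.
        counters = shift(counters, dr - pos[0], dc - pos[1])
        pos = (dr, dc)
        integrate(counters, scene, weight)
    # Return every running sum to its home pixel before readout.
    return shift(counters, -pos[0], -pos[1])


scene = [[0, 0, 0],
         [0, 5, 0],
         [0, 0, 0]]
taps = {(0, 0): 2, (0, 1): -1, (0, -1): -1}   # horizontal second difference
filtered = filter_on_array(scene, taps)
```

In this model the kernel weight appears only as an integration time and a count direction, which matches the summary's statement that no traditional adders or multipliers are needed; everything else about the timing is hypothetical.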

Auditory modeling as a basis for spectral modulation analysis with application to speaker recognition

Published in:
MIT Lincoln Laboratory Report TR-1119

Summary

This report explores auditory modeling as a basis for robust automatic speaker verification. Specifically, we have developed feature-extraction front-ends that incorporate (1) time-varying, level-dependent filtering, (2) variations in analysis filterbank size, and (3) nonlinear adaptation. Our methods are motivated both by a desire to better mimic auditory processing relative to traditional front-ends (e.g., the mel-cepstrum) and by reported gains in automatic speech recognition robustness exploiting similar principles. Traditional mel-cepstral features in automatic speaker recognition are derived from ~20 invariant band-pass filter weights, thereby discarding temporal structure from phase. In contrast, cochlear frequency decomposition can be more precisely modeled as the output of ~3500 time-varying, level-dependent filters. Auditory signal processing is therefore more resolved in frequency than mel-cepstral analysis and also derives temporal information. Furthermore, loss of level-dependence has been suggested to reduce human speech reception in adverse acoustic environments. We were thus motivated to employ a recently proposed level-dependent compressed gammachirp filterbank in feature extraction as well as vary the number of filters or filter weights to improve frequency resolution. We are also simulating nonlinear adaptation models of inner hair cell function along the basilar membrane that presumably mimic temporal masking effects. Auditory-based front-ends are being evaluated with the Lincoln Laboratory Gaussian mixture model recognizer on the TIMIT database under clean and noisy (additive Gaussian white noise) conditions. Preliminary results of features derived from our auditory models suggest that they provide complementary information to the mel-cepstrum under clean and noisy conditions, resulting in speaker recognition performance improvements.

High-power, slab-coupled optical waveguide laser array packaging for beam combining

Published in:
SPIE Vol. 6478, Photonics Packaging, Integration, and Interconnects VII, 23-25 January 2007, pp. 647806-1 - 647806-12.

Summary

Linear arrays of slab-coupled optical waveguide lasers (SCOWL) are ideal sources for beam combining of array elements using techniques such as wavelength beam combining (WBC) and possibly coherent beam combining (CBC). SCOWL array elements have very high-brightness, low-divergence, nearly diffraction-limited output beams. Arrays of up to 1.2 cm in width containing as many as 240 elements have been demonstrated. In this presentation, the packaging techniques developed to ensure proper performance of SCOWL arrays will be described, with particular emphasis on the application to beam combining. A commercial high-performance micro impingement cooler (MIC) was used to provide thermal management for these arrays. Based on performance data for this cooler, a numerical thermal model was constructed and used to investigate the thermal performance for several packaging schemes. In order to promote uniform optical performance of SCOWL array elements, assembly procedures, which included fluxless soldering using In and AuSn solder alloys, along with the use of thermal-expansion-matching materials, were investigated. These techniques resulted in minimal contraction (approximately 2 µm) and smile (approximately 1 µm) of the laser bar during the packaging procedure. Precise control of these parameters is required in order to minimize any detrimental impact on the resultant WBC beam quality. CBC of SCOWL arrays requires phase control of the array elements. Array packaging providing for individual electrical addressability of the array elements has been developed and demonstrated, allowing for phase control by current adjustment.

Multifunction phased array radar: technical synopsis, cost implications, and operational capabilities

Published in:
87th Annual American Meteorological Society Meeting, 14-18 January 2007.

Summary

Current U.S. weather and aircraft surveillance radar networks vary in age from 10 to more than 40 years. Ongoing sustainment and upgrade programs can keep these operating in the near to mid term, but the responsible agencies (FAA, NWS and DoD/DHS) recognize that large-scale replacement activities must begin during the next decade. In addition, these agencies are re-evaluating their operational requirements for radar surveillance. FAA has announced that next generation air traffic control (ATC) will be based on Automatic Dependent Surveillance - Broadcast (ADS-B) (Scardina, 2002) rather than current primary and secondary radars. ADS-B, however, requires verification and back-up services which could be provided by retaining or replacing primary ATC radars.