Publications

Improved Monte Carlo sampling for conflict probability estimation

Published in:
51st AIAA/ASME/AHS/ACS Structures, Structural Dynamics, and Materials Conf., 12-15 April 2010.

Summary

Probabilistic alerting systems for airborne collision avoidance often depend upon accurate estimates of the probability of conflict. Analytical, numerical approximation, and Monte Carlo methods have been applied to conflict probability estimation. The advantage of a Monte Carlo approach is the greater flexibility afforded in modeling the stochastic behavior of aircraft encounters, but typically many samples are required to provide an adequate conflict probability estimate. One approach to improve accuracy with fewer samples is to use importance sampling, where trajectories are sampled according to a proposal distribution that is different from the one specified by the model. This paper suggests several different sample proposal distributions and demonstrates how they result in significantly improved estimates.
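As a rough illustration of the importance-sampling idea described in this summary, the sketch below estimates a rare-event probability in a deliberately simple setting. It is not the paper's encounter model or any of its proposal distributions: the one-dimensional Gaussian target, the shifted Gaussian proposal, and the function names are all assumptions made for illustration. The "conflict" is the event that a standard normal variable exceeds a threshold; samples are drawn from a proposal centered on the threshold, and each hit is reweighted by the likelihood ratio between the model and the proposal.

```python
import math
import random


def tail_prob_importance(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Samples come from the proposal N(threshold, 1) (an illustrative choice),
    and each sample that lands in the event is weighted by p(x) / q(x).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the proposal distribution
        if x > threshold:
            # Likelihood ratio of N(0, 1) to N(threshold, 1) evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


def tail_prob_exact(threshold):
    """Closed-form tail probability of a standard normal, for comparison."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    t = 4.0
    print("importance sampling:", tail_prob_importance(t, 20000))
    print("exact:              ", tail_prob_exact(t))
```

With a threshold of 4, direct sampling from the model would see the event only a few times in 100,000 draws, whereas roughly half of the proposal's samples fall in the event region, which is why the weighted estimate settles with far fewer samples.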

Hybridization process for back-illuminated silicon Geiger-mode avalanche photodiode arrays

Published in:
SPIE Vol. 7681, Advanced Photon Counting Techniques IV, 5 April 2010, 76810P.

Summary

We present a unique hybridization process that permits high-performance back-illuminated silicon Geiger-mode avalanche photodiodes (GM-APDs) to be bonded to custom CMOS readout integrated circuits (ROICs) - a hybridization approach that enables independent optimization of the GM-APD arrays and the ROICs. The process includes oxide bonding of silicon GM-APD arrays to a transparent support substrate followed by indium bump bonding of this layer to a signal-processing ROIC. This hybrid detector approach can be used to fabricate imagers with high-fill-factor pixels and enhanced quantum efficiency in the near infrared as well as large-pixel-count, small-pixel-pitch arrays with pixel-level signal processing. In addition, the oxide bonding is compatible with high-temperature processing steps that can be used to lower dark current and improve optical response in the ultraviolet.

Noncontact detection of homemade explosive constituents via photodissociation followed by laser-induced fluorescence

Published in:
Opt. Express, Vol. 18, No. 6, 15 March 2010, pp. 5399-5406.

Summary

Noncontact detection of the homemade explosive constituents urea nitrate, nitromethane and ammonium nitrate is achieved using photodissociation followed by laser-induced fluorescence (PD-LIF). Our technique utilizes a single ultraviolet laser pulse (~7 ns) to vaporize and photodissociate the condensed-phase materials, and then to detect the resulting vibrationally-excited NO fragments via laser-induced fluorescence. PD-LIF excitation and emission spectra indicate the creation of NO in vibrationally-excited states with significant rotational energy, useful for low-background detection of the parent compound. The results for homemade explosives are compared to one another and to 2,6-dinitrotoluene, a component present in many military explosives.

Preserving the character of perturbations in scaled pitch contours

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 5 March 2010, pp. 417-420.

Summary

The global and fine dynamic components of a pitch contour in voice production, as in the speaking and singing voice, are important for both the meaning and character of an utterance. In speech, for example, slow pitch inflections, rapid pitch accents, and irregular regions all comprise the pitch contour. In applications where all components of a pitch contour are stretched or compressed in the same way, as for example in time-scale modification, an unnatural scaled contour may result. In this paper, we develop a framework for scaling pitch contours, motivated by the goal of maintaining naturalness in time-scale modification of voice. Specifically, we develop a multi-band algorithm to independently modify the slow trajectory and fast perturbation components of a contour for a more natural synthesis, and we present examples where pitch contours representative of speaking and singing voice are lengthened. In the speaking voice, the frequency content of flutter or irregularity is maintained, while slow pitch inflection is simply stretched or compressed. In the singing voice, rapid vibrato is preserved while slower note-to-note variation is scaled as desired.
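The two-band idea in this summary can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the paper's multi-band algorithm: the moving-average split, the linear interpolation, the wrap-around reuse of the fast residual, and every function name here are hypothetical choices. The slow band is stretched by the requested factor, while the fast band is replayed sample by sample so that its oscillation rate is unchanged.

```python
def moving_average(values, window):
    """Centered moving average; the window is truncated at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def resample_linear(values, new_len):
    """Linearly interpolate values onto new_len evenly spaced points."""
    if new_len == 1 or len(values) == 1:
        return [values[0]] * new_len
    step = (len(values) - 1) / (new_len - 1)
    out = []
    for k in range(new_len):
        pos = k * step
        i = min(int(pos), len(values) - 2)
        frac = pos - i
        out.append(values[i] + (values[i + 1] - values[i]) * frac)
    return out


def stretch_contour(contour, factor, window=11):
    """Stretch the slow band by `factor`; keep the fast band at its own rate."""
    slow = moving_average(contour, window)             # slow trajectory
    fast = [c - s for c, s in zip(contour, slow)]      # fast perturbation
    new_len = int(round(len(contour) * factor))
    slow_stretched = resample_linear(slow, new_len)
    # Reuse the fast residual one sample at a time, wrapping around when it
    # runs out (a crude stand-in for a proper perturbation model).
    return [slow_stretched[k] + fast[k % len(fast)] for k in range(new_len)]
```

Stretching the whole contour with `resample_linear` alone would slow any vibrato or flutter by the same factor, which is the unnatural result the summary describes; separating the bands avoids that at the cost of a seam wherever the residual wraps.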

Detection and simulation of scenarios with hidden Markov models and event dependency graphs

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 15 March 2010, pp. 5434-5437.

Summary

The wide availability of signal processing and language tools to extract structured data from raw content has created a new opportunity for the processing of structured signals. In this work, we explore models for the simulation and recognition of scenarios - i.e., time sequences of structured data. For simulation, we construct two models - hidden Markov models (HMMs) and event dependency graphs. Combined, these two simulation methods allow the specification of dependencies in event ordering, simultaneous execution of multiple scenarios, and evolving networks of data. For scenario recognition, we consider the application of multi-grained HMMs. We explore, in detail, mismatch between training scenarios and simulated test scenarios. The methods are applied to terrorist scenario detection with a simulation coded by a subject matter expert.
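To make the recognition step concrete, the sketch below scores a short event sequence against two competing discrete HMMs with the standard forward algorithm and picks the more likely one. It uses plain single-level HMMs rather than the multi-grained HMMs the summary mentions, and the two models, their probabilities, and the event codes are entirely hypothetical values chosen for illustration.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs) under a discrete HMM using the forward algorithm."""
    n_states = len(start)
    # Initialise with the first observed event.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    # Propagate through the remaining events.
    for symbol in obs[1:]:
        alpha = [
            emit[s][symbol] * sum(alpha[r] * trans[r][s] for r in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state scenario models over three event types (0, 1, 2).
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.2, 0.8]]
EMIT_A = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]  # scenario A: event 0, then event 2
EMIT_B = [[0.1, 0.8, 0.1], [0.3, 0.4, 0.3]]  # scenario B: mostly event 1


def classify(obs):
    """Label an event sequence with the scenario model that explains it best."""
    score_a = forward_likelihood(obs, START, TRANS, EMIT_A)
    score_b = forward_likelihood(obs, START, TRANS, EMIT_B)
    return "A" if score_a >= score_b else "B"


if __name__ == "__main__":
    print(classify([0, 0, 2]))
    print(classify([1, 1, 1]))
```

For long sequences the raw probabilities underflow, so a practical version would work with scaled or log-domain values.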

Multi-class SVM optimization using MCE training with application to topic identification

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 15 March 2010, pp. 5350-5353.

Summary

This paper presents a minimum classification error (MCE) training approach for improving the accuracy of multi-class support vector machine (SVM) classifiers. We have applied this approach to topic identification (topic ID) for human-human telephone conversations from the Fisher corpus using ASR lattice output. The new approach yields improved performance over the traditional techniques for training multi-class SVM classifiers on this task.

Kalman filter based speech synthesis

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 15 March 2010, pp. 4618-4621.

Summary

Preliminary results are reported from a very simple speech-synthesis system based on clustered-diphone Kalman-filter modeling of line-spectral-frequency features. Parameters were estimated using maximum-likelihood EM training, with a constraint that prevented eigenvalue magnitudes in the transition matrix from exceeding 1. Frames of training data were assigned diphone unit labels by forced alignment with an HMM recognition system. The HMM cluster tree was also used for Kalman-filter unit cluster assignments. The result is a simple synthesis system that has few parameters, synthesizes intelligible speech without audible discontinuities, and can be adapted using MLLR techniques to support synthesis of a broad range of speakers from a single base model with small amounts of training data. The result is interesting for embedded synthesis applications.

The MITLL NIST LRE 2009 language recognition system

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 15 March 2010, pp. 4994-4997.

Summary

This paper presents a description of the MIT Lincoln Laboratory language recognition system submitted to the NIST 2009 Language Recognition Evaluation (LRE). This system consists of a fusion of three core recognizers, two based on spectral similarity and one based on tokenization. The 2009 LRE differed from previous ones in that test data included narrowband segments from worldwide Voice of America broadcasts as well as conventional recorded conversational telephone speech. Results are presented for the 23-language closed-set and open-set detection tasks at the 30-, 10-, and 3-second durations along with a discussion of the language-pair task. On the 30-second, 23-language closed-set detection task, the system achieved a 1.64 average error rate.

Toward signal processing theory for graphs and non-Euclidean data

Published in:
ICASSP 2010, IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 15 March 2010, pp. 5415-5417.

Summary

Graphs are canonical examples of high-dimensional non-Euclidean data sets, and are emerging as a common data structure in many fields. While there are many algorithms to analyze such data, a signal processing theory for evaluating these techniques akin to detection and estimation in the classical Euclidean setting remains to be developed. In this paper we show the conceptual advantages gained by formulating graph analysis problems in a signal processing framework by way of a practical example: detection of a subgraph embedded in a background graph. We describe an approach based on detection theory and provide empirical results indicating that the test statistic proposed has reasonable power to detect dense subgraphs in large random graphs.

A linguistically-informative approach to dialect recognition using dialect-discriminating context-dependent phonetic models

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 15 March 2010, pp. 5014-5017.

Summary

We propose supervised and unsupervised learning algorithms to extract dialect-discriminating phonetic rules and use these rules to adapt biphones to identify dialects. Despite many challenges (e.g., sub-dialect issues and no word transcriptions), we discovered dialect-discriminating biphones compatible with the linguistic literature, while outperforming a baseline monophone system by 7.5% (relative). Our proposed dialect-discriminating biphone system achieves similar performance to a baseline all-biphone system despite using 25% fewer biphone models. In addition, our system complements PRLM (Phone Recognition followed by Language Modeling), verified by obtaining relative gains of 15-29% when fused with PRLM. Our work is an encouraging first step towards a linguistically-informative dialect recognition system, with potential applications in forensic phonetics, accent training, and language learning.