Publications

Exploiting temporal change in pitch in formant estimation

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 31 March - 4 April 2008, pp. 3929-3932.

Summary

This paper considers the problem of obtaining an accurate spectral representation of speech formant structure when the voicing source exhibits a high fundamental frequency. Our work is inspired by auditory perception and physiological modeling studies implicating the use of temporal changes in speech by humans. Specifically, we develop and assess signal processing schemes aimed at exploiting temporal change of pitch as a basis for formant estimation. Our methods are cast in a generalized framework of two-dimensional processing of speech and show quantitative improvements under certain conditions over representations derived from traditional and homomorphic linear prediction. We conclude by highlighting potential benefits of our framework in the particular application of speaker recognition with preliminary results indicating a performance gender-gap closure on subsets of the TIMIT corpus.

Language recognition with discriminative keyword selection

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, 31 March - 4 April 2008, pp. 4145-4148.

Summary

One commonly used approach for language recognition is to convert the input speech into a sequence of tokens such as words or phones and then to use these token sequences to determine the target language. The language classification is typically performed by extracting N-gram statistics from the token sequences and then using an N-gram language model or support vector machine (SVM) to perform the classification. One problem with these approaches is that the number of N-grams grows exponentially as the order N is increased. This is especially problematic for an SVM classifier as each utterance is represented as a distinct N-gram vector. In this paper we propose a novel approach for modeling higher-order N-grams using an SVM via an alternating filter-wrapper feature selection method. We demonstrate the effectiveness of this technique on the NIST 2007 language recognition task.
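
The N-gram statistics and the exponential growth mentioned in this summary can be illustrated with a minimal sketch. It is not the authors' system: the function names `ngram_counts` and `max_ngram_types` and the toy token sequence are hypothetical, chosen only for illustration.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every contiguous run of n tokens in one token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def max_ngram_types(vocab_size, n):
    """Upper bound on distinct N-grams: vocab_size ** n, exponential in n."""
    return vocab_size ** n

# Toy phone sequence (hypothetical data).
phones = ["a", "b", "a", "b"]
bigrams = ngram_counts(phones, 2)
```

With a phone inventory of 50 symbols (an assumed figure), the bound is 2,500 possible bigrams but 125,000 possible trigrams, which is why some form of feature selection becomes attractive at higher orders.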

Multisensor very low bit rate speech coding using segment quantization

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, 31 March - 4 April 2008, pp. 3997-4000.

Summary

We present two approaches to noise-robust very low bit rate speech coding using wideband MELP analysis/synthesis. Both methods exploit multiple acoustic and non-acoustic input sensors, using our previously presented dynamic waveform fusion algorithm to simultaneously perform waveform fusion, noise suppression, and cross-channel noise cancellation. One coder uses a 600 bps scalable phonetic vocoder, with a phonetic speech recognizer followed by joint predictive vector quantization of the error in wideband MELP parameters. The second coder operates at 300 bps with fixed 80 ms segments, using novel variable-rate multistage matrix quantization techniques. Formal test results show that both coders achieve equivalent intelligibility to the 2.4 kbps NATO standard MELPe coder in harsh acoustic noise environments, at much lower bit rates, with only modest quality loss.

Improved GMM-based language recognition using constrained MLLR transforms

Published in:
Proc. 33rd IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 30 March - 4 April 2008, pp. 4149-4152.

Summary

In this paper we describe the application of a feature-space transform based on constrained maximum likelihood linear regression (CMLLR) for unsupervised compensation of channel and speaker variability to the language recognition problem. We show that use of such transforms can improve baseline GMM-based language recognition performance on the 2005 NIST Language Recognition Evaluation (LRE05) task by 38%. Furthermore, gains from CMLLR are additive with other modeling enhancements such as vocal tract length normalization (VTLN). Further improvement is obtained using discriminative training, and it is shown that a system using only CMLLR adaptation produces state-of-the-art accuracy at lower test-time computational cost than systems using VTLN.

Experimental demonstration of remote optical detection of trace explosives

Published in:
SPIE Vol. 6954, Chemical, Biological, Radiological, Nuclear and Explosives (CBRNE) Sensing IX, 18-20 March 2008, 695407.

Summary

MIT Lincoln Laboratory has developed a concept that could enable remote (10s of meters) detection of trace explosives' residues via a field-portable laser system. The technique relies upon laser-induced photodissociation of nitro-bearing explosives into vibrationally excited nitric oxide (NO) fragments. Subsequent optical probing of the first vibrationally excited state at 236 nm yields narrowband fluorescence at the shorter wavelength of 226 nm. With proper optical filtering, these photons provide a highly sensitive explosives signature that is not susceptible to interference from traditional optical clutter sources (e.g., red-shifted fluorescence). Quantitative measurements of trace residues of TNT have been performed demonstrating this technique using a breadboard system, which relies upon a pulsed optical parametric oscillator (OPO) based laser. Based on these results, performance projections for a fieldable system are made.

Analytic theory of power law graphs

Published in:
SIAM Conference on Parallel Processing for Scientific Computing

Summary

An analytical theory of power law graphs is presented based on the Kronecker graph generation technique. The analysis uses Kronecker exponentials of complete bipartite graphs to formulate the sub-structure of such graphs. This allows various high level quantities (e.g. degree distribution, betweenness centrality, diameter, eigenvalues, and isoparametric ratio) to be computed directly from the model parameters. The implications of this work on “clustering” and “dendrogram” heuristics are also discussed.
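
As a rough illustration of the Kronecker construction named in this summary, the sketch below builds the adjacency matrix of a complete bipartite graph, raises it to a Kronecker power, and tabulates vertex degrees. It is an assumption-laden toy, not the paper's derivation: the helper names (`complete_bipartite`, `kron`, `kron_power`, `degree_histogram`) are hypothetical, and the matrices are plain nested lists to keep the example dependency-free.

```python
from collections import Counter

def complete_bipartite(n, m):
    """Adjacency matrix of B(n, m): vertices 0..n-1 connect to n..n+m-1."""
    size = n + m
    return [[1 if (i < n) != (j < n) else 0 for j in range(size)]
            for i in range(size)]

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    rb, cb = len(b), len(b[0])
    return [[a[i // rb][j // cb] * b[i % rb][j % cb]
             for j in range(len(a[0]) * cb)]
            for i in range(len(a) * rb)]

def kron_power(a, k):
    """k-fold Kronecker product of a with itself (k >= 1)."""
    result = a
    for _ in range(k - 1):
        result = kron(result, a)
    return result

def degree_histogram(adj):
    """Map each degree (row sum) to the number of vertices having it."""
    return Counter(sum(row) for row in adj)

# Second Kronecker power of the star graph B(1, 3): 16 vertices.
hist = degree_histogram(kron_power(complete_bipartite(1, 3), 2))
```

Because row sums multiply under the Kronecker product, the degrees in the k-th power of B(n, m) are products of n and m, so the histogram follows from n, m, and k alone. For B(1, 3) squared this gives nine vertices of degree 1, six of degree 3, and one of degree 9.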

Integration of high-speed surface-channel charge coupled devices into an SOI CMOS process using strong phase shift lithography

Published in:
SPIE Vol. 6924, Optical Microlithography XXI, 26-27 February 2008, 69244R.

Summary

To enable development of novel signal processing circuits, a high-speed surface-channel charge coupled device (CCD) process has been co-integrated with the Lincoln Laboratory 180-nm RF fully depleted silicon-on-insulator (FDSOI) CMOS technology. The CCDs support charge transfer clock speeds in excess of 1 GHz while maintaining high charge transfer efficiency (CTE). Both the CCD and CMOS gates are formed using a single-poly process, with CCD gates isolated by a narrow phase-shift-defined gap. CTE is strongly dependent on tight control of the gap critical dimension (CD). In this paper we review the tradeoffs encountered in the co-integration of the CCD and CMOS technologies. The effect of partial coherence on gap resolution and pattern fidelity is discussed. The impact of asymmetric bias due to phase error and phase shift mask (PSM) sidewall effects is presented, along with adopted mitigation strategies. Issues relating to CMOS pattern fidelity and CD control in the double patterning process are also discussed. Since some signal processing CCD structures involve two-dimensional transfer paths, many required geometries present phase compliance and trim engineering challenges. Approaches for implementing noncompliant geometries, such as T shapes, are described, and the impact of various techniques on electrical performance is discussed.

Polymer matrix effects on acid generation

Published in:
SPIE Vol. 6923, Advances in Resist Materials and Processing Technology XXV, 24-29 February 2008, 692319.

Summary

We have measured the acid generation efficiency with EUV exposure of a PAG in different polymer matrices representing the main classes of resist polymers as well as some previously described fluoropolymers for lithographic applications. The polymer matrix was found to have a significant effect on the acid generation efficiency of the PAG studied. A linear relationship exists between the absorbance of the resist and the acid generation efficiency. A second, inverse relationship exists between Dill C and the aromatic content of the resist polymer. It was shown that polymer sensitization is important for acid generation with EUV exposure and that the Dill C parameter can be increased by up to five times with highly absorbing non-aromatic polymers, such as non-aromatic fluoropolymers, over an ESCAP polymer. The increase in the Dill C value will lead to an up to fivefold increase in resist sensitivity. It is our expectation that these insights into the nature of polymer matrix effects on acid generation could lead to increased sensitivity for EUV resists.

X-band receiver front-end chip in silicon germanium technology

Published in:
2008 IEEE Topical Meeting on Silicon Monolithic Integrated Circuits in RF Systems, 23-25 January 2008.

Summary

This paper reports a demonstration of X-band receiver RF front-end components and the integrated chipset implemented in 0.18 μm silicon germanium (SiGe) technology. The system architecture consists of a single down conversion from X-band at the input to S-band at the intermediate frequency (IF) output. The microwave monolithic integrated circuit (MMIC) includes an X-band low noise amplifier, lead-lag splitter, balanced amplifiers, double balanced mixer, absorptive filter, and an IF amplifier. The integrated chip achieved greater than 30 dB of gain and less than 6 dB of noise figure.

Comparison of Rapid Update Cycle (RUC) model crosswinds with LIDAR crosswind measurements at St. Louis Lambert International Airport

Published in:
13th Conf. on Aviation, Range and Aerospace Meteorology, ARAM, 20-24 January 2008.

Summary

Turbulence associated with wake vortices generated by arriving and departing aircraft poses a potential safety risk to other nearby aircraft, and as such this potential risk may apply to aircraft operating on Closely Spaced Parallel Runways (CSPRs). To take wake vortex behavior into account, current aircraft departing/landing standards require a safe distance behind the wake generating aircraft at which operations can be conducted. The Federal Aviation Administration (FAA) and National Aeronautics and Space Administration (NASA) have initiated an improved wake avoidance solution, referred to as Wake Turbulence Mitigation for Departures (WTMD). The process is designed to safely increase runway capacity via actively monitoring wind conditions that impact wake behavior (Hallock et al., 1998; Lang et al., 2005). An important component of WTMD is a Wind Forecast Algorithm (WFA) being developed by MIT Lincoln Laboratory (Cole & Winkler, 2004). The WFA predicts runway crosswinds from the surface up to a height of approximately 300 m (1000 ft) once per minute and thus forecasts when winds favorable for WTMD will persist long enough for safe procedures for a particular runway (Lang et al., 2007). The algorithm uses 1–4 hr wind forecasts from the Rapid Update Cycle (RUC) model operated by the National Oceanic and Atmospheric Administration/National Centers for Environmental Prediction (NOAA/NCEP) for upper atmospheric wind profiles. Detailed description of the RUC model can be found elsewhere (Benjamin et al., 1994; 2004a; 2004b). Briefly, the RUC model inputs are assimilations of high frequency observations from a suite of meteorological sensors, including Automated Surface Observing System (ASOS), rawinsonde profiles, satellite, airborne sensors from commercial aircraft, etc. The vertical layers of the atmosphere are resolved approximately isentropically. The model is run hourly, producing hourly forecasts out to 24 hours.
The coverage of the RUC grid includes the continental United States, southern Canada, northern Mexico, and adjacent coastal waters. Here we evaluate the performance of RUC in predicting crosswinds with reliability sufficient to support WTMD. For RUC validation, in situ wind profile data were obtained from a Light Imaging Detection and Ranging (LIDAR) deployed at St. Louis Lambert International Airport (STL). The focus of this study is to provide a general quantitative characterization of the difference between RUC predictions and LIDAR measurements of the runway crosswinds. Particular attention was given to cases with inaccurate RUC crosswind forecasts, and cases when significant horizontal and vertical shears occur during situations of convective weather or proximity to large scale weather features, e.g., air mass fronts. (In practice, WTMD procedures and existing weather sources in the Control Tower will manage, to an acceptable level of risk, the hazard exposure associated with the extreme wind shift examples presented here.) Also included was examination of performance degradation with longer RUC forecast horizons and coarser horizontal resolutions, which may be relevant with regard to actual operational forecast data availability, or future applications of the operational concept to include arrival operations. A detailed report for this study is also available (Huang et al., 2007).
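
For readers unfamiliar with the quantity being compared, a runway crosswind is conventionally the component of the wind perpendicular to the runway heading. The sketch below computes it and one possible summary of forecast-versus-measurement differences. It is an illustration under stated assumptions, not the study's method: the function names are hypothetical, wind direction follows the meteorological convention (direction the wind blows from), and root-mean-square error is an assumed example statistic that the summary does not name.

```python
import math

def crosswind_component(wind_speed, wind_dir_deg, runway_heading_deg):
    """Wind component perpendicular to the runway, in the units of wind_speed.

    Positive values mean the wind comes from the right of the runway heading.
    """
    angle = math.radians(wind_dir_deg - runway_heading_deg)
    return wind_speed * math.sin(angle)

def rmse(forecast, measured):
    """Root-mean-square difference between paired forecast and measured values."""
    if len(forecast) != len(measured) or not forecast:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(f - m) ** 2 for f, m in zip(forecast, measured)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical values: a 10-unit wind from 090 degrees on a runway heading 360.
example = crosswind_component(10.0, 90.0, 360.0)
```

A wind aligned with the runway yields a crosswind of zero, and a wind at right angles yields the full wind speed, which is the case shown in the example call.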