Publications

Pitch-scale modification using the modulated aspiration noise source

Published in:
INTERSPEECH, 17-21 September 2006.

Summary

Spectral harmonic/noise component analysis of spoken vowels shows evidence of noise modulations with peaks in the estimated noise source component synchronous with both the open phase of the periodic source and with time instants of glottal closure. Inspired by this observation of natural modulations and of fullband energy in the aspiration noise source, we develop an alternate approach to high-quality pitch-scale modification of continuous speech. Our strategy takes a dual processing approach, in which the harmonic and noise components of the speech signal are separately analyzed, modified, and re-synthesized. The periodic component is modified using standard modification techniques, and the noise component is handled by modifying characteristics of its source waveform. Since we have modeled an inherent coupling between the periodic and aspiration noise sources, the modification algorithm is designed to preserve the synchrony between temporal modulations of the two sources. The reconstructed modified signal is perceived in informal listening to be natural-sounding and typically reduces artifacts that occur in standard modification techniques.

Missing feature theory with soft spectral subtraction for speaker verification

Published in:
Interspeech 2006, ICSLP, 17-21 September 2006.

Summary

This paper considers the problem of training/testing mismatch in the context of speaker verification and, in particular, explores the application of missing feature theory in the case of additive white Gaussian noise corruption in testing. Missing feature theory allows for corrupted features to be removed from scoring, the initial step of which is the detection of these features. One method of detection, employing spectral subtraction, is studied in a controlled manner and it is shown that with missing feature compensation the resulting verification performance is improved as long as a minimum number of features remain. Finally, a blending of "soft" spectral subtraction for noise mitigation and missing feature compensation is presented. The resulting performance improves on the constituent techniques alone, reducing the equal error rate by about 15% over an SNR range of 5 - 25 dB.
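The detection step described in this summary can be illustrated with a minimal sketch. It is not the paper's implementation: the function name, the 0 dB threshold, and the sigmoid used for the soft weights are all assumptions chosen for illustration. Spectral subtraction gives a clean-power estimate, the implied local SNR flags features as reliable or missing, and a soft weight replaces the hard decision.

```python
import numpy as np

def detect_missing_features(noisy_power, noise_power, threshold_db=0.0, slope=1.0):
    """Illustrative sketch only (hypothetical names and parameters).

    noisy_power: observed power per spectral feature
    noise_power: estimated noise power per spectral feature
    Returns the spectral-subtraction estimate, a hard reliability mask,
    and soft reliability weights in (0, 1).
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    floor = 1e-10  # avoids log of zero and negative power estimates

    # Spectral subtraction: remove the noise estimate, then floor the result.
    clean_est = np.maximum(noisy_power - noise_power, floor)

    # Local SNR (dB) implied by the subtraction.
    snr_db = 10.0 * np.log10(clean_est / np.maximum(noise_power, floor))

    # Hard decision: features below the threshold are treated as missing.
    hard_mask = snr_db > threshold_db

    # Soft decision (assumed sigmoid form): a weight instead of a 0/1 mask.
    soft_weight = 1.0 / (1.0 + np.exp(-slope * (snr_db - threshold_db)))
    return clean_est, hard_mask, soft_weight
```

In a scoring stage, features with a false mask entry would be left out of the likelihood computation, or down-weighted when the soft weights are used instead.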

An overview of automatic speaker diarization systems

Published in:
IEEE Trans. Audio, Speech, and Language Processing, Vol. 14, No. 5, September 2006, pp. 1557-1565.

Summary

Audio diarization is the process of annotating an input audio channel with information that attributes (possibly overlapping) temporal regions of signal energy to their specific sources. These sources can include particular speakers, music, background noise sources, and other signal source/channel characteristics. Diarization can be used for helping speech recognition, facilitating the searching and indexing of audio archives, and increasing the richness of automatic transcriptions, making them more readable. In this paper, we provide an overview of the approaches currently used in a key area of audio diarization, namely speaker diarization, and discuss their relative merits and limitations. Performances using the different techniques are compared within the framework of the speaker diarization task in the DARPA EARS Rich Transcription evaluations. We also look at how the techniques are being introduced into real broadcast news systems and their portability to other domains and tasks such as meetings and speaker verification.

Coherent beam combining of large number of PM fibres in 2-D fibre array

Published in:
Electron. Lett., Vol. 42, No. 18, 31 August 2006, pp. 17-18.

Summary

Coherent combining of a record 48 PM fibres in a phased array configuration is reported. The resulting Strehl ratio degrades by

Using filter banks to improve interceptor performance against weaving targets

Published in:
AIAA Guidance, Navigation, and Control Conf., 21-24 August 2006.

Summary

It is well known that interceptor performance against a weaving or spiraling target can be improved by use of a special-purpose weave guidance law. However, the weave guidance law requires knowledge of the target weave frequency. When the target weave frequency is unknown, an extended Kalman filter is usually considered for the problem because it can be used to estimate the target weave frequency. However, the performance of the extended Kalman filter is sensitive to initialization errors. This paper offers an unusual linear Kalman filter bank approach, where each filter is tuned to a different target weave frequency, as a potential solution for estimating the target weave frequency. Rather than combining individual filter outputs in some probabilistic sense, a straightforward algorithm is presented for choosing the filter that is most closely tuned to the actual target weave frequency. This paper demonstrates that this filter bank approach is superior to that of the extended Kalman filter for the weaving target problem.
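The filter bank idea can be sketched as follows. This is a hedged illustration, not the paper's algorithm: the summary does not say how the best filter is chosen, so selecting the filter with the smallest summed squared measurement residual is an assumption, as are the function names, the two-state sinusoidal target model, and the noise settings.

```python
import numpy as np

def residual_cost(z, omega, dt, q=1e-8, r=0.01):
    """Run one linear Kalman filter tuned to weave frequency omega (rad/s, nonzero).

    State is [position, velocity] of a sinusoidal motion x'' = -omega^2 x.
    Returns the summed squared innovations (assumed selection metric).
    """
    c, s = np.cos(omega * dt), np.sin(omega * dt)
    F = np.array([[c, s / omega], [-omega * s, c]])  # exact discrete transition
    H = np.array([[1.0, 0.0]])                       # position is measured
    Q = q * np.eye(2)                                # small process noise
    x = np.zeros(2)
    P = 10.0 * np.eye(2)
    cost = 0.0
    for zk in z:
        # Predict with the model tuned to this filter's frequency.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the scalar measurement.
        innov = zk - (H @ x)[0]
        S = (H @ P @ H.T)[0, 0] + r
        K = (P @ H.T)[:, 0] / S
        x = x + K * innov
        P = (np.eye(2) - np.outer(K, H[0])) @ P
        cost += innov ** 2
    return cost

def pick_weave_frequency(z, candidates, dt):
    """Return the candidate frequency whose filter has the smallest residual cost."""
    costs = [residual_cost(z, w, dt) for w in candidates]
    return candidates[int(np.argmin(costs))]
```

Because each filter in the bank is linear, none of them needs a frequency state, which is where the extended Kalman filter's sensitivity to initialization arises.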

An end-to-end demonstration of a receiver array based free-space photon counting communications link

Published in:
SPIE Vol. 6304, Free-Space Laser Communications VI, 13-17 August 2006, pp. 63040H-1 - 63040H-13.

Summary

NASA anticipates a significant demand for long-haul communications service from deep-space to Earth in the near future. To address this need, a substantial effort has been invested in developing a free-space laser communications system that can be operated at data rates that are 10-1000 times higher than current RF systems. We have built an end-to-end free-space photon counting testbed to demonstrate many of the key technologies required for a deep space optical receiver. The testbed consists of two independent receivers, each using a Geiger-mode avalanche photodiode detector array. A hardware aggregator combines the photon arrivals from the two receivers and the aggregated photon stream is decoded in real time with a hardware turbo decoder. We have demonstrated signal acquisition, clock synchronization, and error-free communications at data rates up to 14 million bits per second while operating within 1 dB of the channel capacity with an efficiency of greater than 1 bit per incident photon.

Toward an interagency language roundtable based assessment of speech-to-speech translation capabilities

Published in:
AMTA 2006, 7th Biennial Conf. of the Association for Machine Translation in the Americas, 8-12 August 2006.

Summary

We present observations from three exercises designed to map the effective listening and speaking skills of an operator of a speech-to-speech translation system (S2S) to the Interagency Language Roundtable (ILR) scale. Such a mapping is nontrivial, but will be useful for government and military decision makers in managing expectations of S2S technology. We observed domain-dependent S2S capabilities in the ILR range of Level 0+ to Level 1, and interactive text-based machine translation in the Level 3 range.

Experience using active and passive mapping for network situational awareness

Published in:
5th IEEE Int. Symp. on Network Computing and Applications NCA06, 24-26 July 2006, pp. 19-26.

Summary

Passive network mapping has often been proposed as an approach to maintain up-to-date information on networks between active scans. This paper presents a comparison of active and passive mapping on an operational network. On this network, active and passive tools found largely disjoint sets of services and the passive system took weeks to discover the last 15% of active services. Active and passive mapping tools provided different, not complementary information. Deploying passive mapping on an enterprise network does not reduce the need for timely active scans due to non-overlapping coverage and potentially long discovery times.

Assessment of air traffic control productivity enhancements from the Corridor Integrated Weather System (CIWS)

Published in:
MIT Lincoln Laboratory Report ATC-325

Summary

The Air Traffic Control (ATC) productivity benefits attributed to the Corridor Integrated Weather System (CIWS) were assessed using real-time observations of CIWS product usage during three multi-day thunderstorm events in 2005 at eight U.S. Air Route Traffic Control Centers (ARTCCs). CIWS improved ATC productivity by: reducing the time required to develop, coordinate, and implement weather impact mitigation plans; increasing the number of safety and capacity-enhancing plans that were executed (e.g., more efficient, proactive rerouting and greater ability to keep routes open); and assisting with FAA staffing decisions. Time savings per consecutive weather day for Traffic Management Coordinators (TMCs) in an ARTCC typically were 20-95 minutes. The overall frequency of capacity-enhancing decisions increased by 177% relative to the CIWS benefits study conducted in 2003. The annual CIWS delay savings are in excess of 92,000 hours. Corresponding airline direct operations cost (DOC) savings exceeded $94M and passenger value of time (PVT) savings exceeded $201M. Annual jet fuel savings exceeded 11M gallons. The ability of the Cleveland ARTCC to develop and execute weather impact mitigation plans improved significantly (e.g., by 50-80%) when CIWS products were available to Area Supervisors as well as to the TMCs.

Advanced language recognition using cepstra and phonotactics: MITLL system performance on the NIST 2005 Language Recognition Evaluation

Summary

This paper presents a description of the MIT Lincoln Laboratory submissions to the 2005 NIST Language Recognition Evaluation (LRE05). As was true in 2003, the 2005 submissions were combinations of core cepstral and phonotactic recognizers whose outputs were fused to generate final scores. For the 2005 evaluation, Lincoln Laboratory had five submissions built upon fused combinations of six core systems. Major improvements included the generation of phone streams using lattices, SVM-based language models using lattice-derived phonotactics, and binary tree language models. In addition, a development corpus was assembled that was designed to test robustness to unseen languages and sources. Language recognition trends based on NIST evaluations conducted since 1996 show a steady improvement in language recognition performance.