Publications

'Perfect reconstruction' time-scaling filterbanks

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. III, 15-19 March 1999, pp. 945-948.

Summary

A filterbank-based method of time-scale modification is analyzed for elemental signals including clicks, sines, and AM-FM sines. It is shown that with the use of some basic properties of linear systems, as well as FM-to-AM filter transduction, "perfect reconstruction" time-scaling filterbanks can be constructed for these elemental signal classes under certain conditions on the filterbank. Conditions for perfect reconstruction time-scaling are shown analytically for the uniform filterbank case, while empirically for the nonuniform constant-Q (gammatone) case. Extension of perfect reconstruction to multi-component signals is shown to require both filterbank and signal-dependent conditions and indicates the need for a more complete theory of "perfect reconstruction" time-scaling filterbanks.

Evaluating intrusion detection systems without attacking your friends: The 1998 DARPA intrusion detection evaluation

Summary

Intrusion detection systems monitor the use of computers and the network over which they communicate, searching for unauthorized use, anomalous behavior, and attempts to deny users, machines or portions of the network access to services. Potential users of such systems need information that is rarely found in marketing literature, including how well a given system finds intruders and how much work is required to use and maintain that system in a fully functioning network with significant daily traffic. Researchers and developers can specify which prototypical attacks can be found by their systems, but without access to the normal traffic generated by day-to-day work, they can not describe how well their systems detect real attacks while passing background traffic and avoiding false alarms. This information is critical: every declared intrusion requires time to review, regardless of whether it is a correct detection for which a real intrusion occurred, or whether it is merely a false alarm. To meet the needs of researchers, developers and ultimately system administrators we have developed the first objective, repeatable, and realistic measurement of intrusion detection system performance. Network traffic on an Air Force base was measured, characterized and subsequently simulated on an isolated network on which a few computers were used to simulate thousands of different Unix systems and hundreds of different users during periods of normal network traffic. Simulated attackers mapped the network, issued denial of service attacks, illegally gained access to systems, and obtained super-user privileges. Attack types ranged from old, well-known attacks, to new, stealthy attacks. Seven weeks of training data and two weeks of testing data were generated, filling more than 30 CD-ROMs. Methods and results from the 1998 DARPA intrusion detection evaluation will be highlighted, and preliminary plans for the 1999 evaluation will be presented.

Machine-assisted language translation for U.S./RoK Combined Forces Command

Published in:
Army RD&A Mag., November-December 1999, pp. 38-41.

Summary

The U.S. military must operate worldwide in a variety of international environments where many different languages are used. There is a critical need for translation, and there is a shortage of translators who can interpret military terminology specifically. One coalition environment where the need is particularly strong is in the Republic of Korea (RoK) where, although U.S. and RoK military personnel have been working together for many years, the language barrier still significantly reduces the speed and effectiveness of coalition command and control. This article describes the Massachusetts Institute of Technology (MIT) Lincoln Laboratory's work on automated, two-way, English/Korean translation for enhanced coalition communications. Our ultimate goal is to enhance multilingual communications by producing accurate translations across a number of languages. Therefore, we have chosen an interlingua-based approach to machine translation that is readily adaptable to multiple languages. In this approach, a natural language understanding system transforms the input into an intermediate meaning representation called Semantic Frame, which serves as a basis for generating output in multiple languages. To produce useful and effective translation systems in the short term, we have focused on limited military task domains and have configured our system as a machine-assisted translation system. This allows the human translator to confirm or edit the machine translation.

Blind clustering of speech utterances based on speaker and language characteristics

Published in:
5th Int. Conf. Spoken Language Processing (ICSLP), 30 November - 4 December 1998.

Summary

Classical speaker and language recognition techniques can be applied to the classification of unknown utterances by computing the likelihoods of the utterances given a set of well-trained target models. This paper addresses the problem of grouping unknown utterances when no information is available regarding the speaker or language classes or even the total number of classes. Approaches to blind message clustering are presented based on conventional hierarchical clustering techniques and an integrated cluster generation and selection method called the d* algorithm. Results are presented using message sets derived from the Switchboard and Callfriend corpora. Potential applications include automatic indexing of recorded speech corpora by speaker/language tags and automatic or semiautomatic selection of speaker-specific speech utterances for speaker recognition adaptation.

Improving accent identification through knowledge of English syllable structure

Published in:
5th Int. Conf. on Spoken Language Processing, ICSLP, 30 November - 4 December 1998.

Summary

This paper studies the structure of foreign-accented read English speech. A system for accent identification is constructed by combining linguistic theory with statistical analysis. Results demonstrate that the linguistic theory is reflected in real speech data and its application improves accent identification. The work discussed here combines and applies previous research in language identification based on phonemic features [1] with the analysis of the structure and function of the English language [2]. Working with phonemically hand-labelled data in three accented speaker groups of Australian English (Vietnamese, Lebanese, and native speakers), we show that accents of foreign speakers can be predicted and manifest themselves differently as a function of their position within the syllable. When applying this knowledge, English vs. Vietnamese accent identification improves from 86% to 93% (English vs. Lebanese improves from 78% to 84%). The described algorithm is also applied to automatically aligned phonemes.

Sheep, goats, lambs and wolves: a statistical analysis of speaker performance in the NIST 1998 speaker recognition evaluation

Summary

Performance variability in speech and speaker recognition systems can be attributed to many factors. One major factor, which is often acknowledged but seldom analyzed, is inherent differences in the recognizability of different speakers. In speaker recognition systems such differences are characterized by the use of animal names for different types of speakers, including sheep, goats, lambs and wolves, depending on their behavior with respect to automatic recognition systems. In this paper we propose statistical tests for the existence of these animals and apply these tests to hunt for such animals using results from the 1998 NIST speaker recognition evaluation.

Vulnerabilities of reliable multicast protocols

Published in:
IEEE MILCOM '98, Vol. 3, 21 October 1998, pp. 934-938.

Summary

We examine vulnerabilities of several reliable multicast protocols. The various mechanisms employed by these protocols to provide reliability can present vulnerabilities. We show how some of these vulnerabilities can be exploited in denial-of-service attacks, and discuss potential mechanisms for withstanding such attacks.

AM-FM separation using shunting neural networks

Published in:
Proc. of the IEEE-SP Int. Symp. on Time-Frequency and Time-Scale Analysis, 6-9 October 1998, pp. 553-556.

Summary

We describe an approach to estimating the amplitude-modulated (AM) and frequency-modulated (FM) components of a signal. Any signal can be written as the product of an AM component and an FM component. There have been several approaches to solving the AM-FM estimation problem described in the literature. Popular methods include the use of time-frequency analysis, the Hilbert transform, and the Teager energy operator. We focus on an approach based on FM-to-AM transduction that is motivated by auditory physiology. We show that the transduction approach can be realized as a bank of bandpass filters followed by envelope detectors and shunting neural networks, and the resulting dynamical system is capable of robust AM-FM estimation in noisy environments and over a broad range of filter bandwidths and locations. Our model is consistent with recent psychophysical experiments that indicate AM and FM components of acoustic signals may be transformed into a common neural code in the brain stem via FM-to-AM transduction. Applications of our model include signal recognition and multi-component decomposition.
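
As a rough illustration of the AM-FM estimation problem, the sketch below applies the Teager energy operator, one of the popular methods named in the summary, to a sampled signal. It is not the FM-to-AM transduction approach of the paper. The function names `teager` and `estimate_am_fm` are hypothetical, and the estimates are exact only under the assumption that amplitude and frequency are constant over a few samples.

```python
import math

def teager(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def estimate_am_fm(x):
    """Return a list of (amplitude, frequency in rad/sample) estimates.

    For x[n] = A*cos(omega*n + phi) and y[n] = x[n] - x[n-1]:
        psi[x] = A^2 * sin(omega)^2
        psi[y] / psi[x] = 2 * (1 - cos(omega))
    Both identities assume A and omega are constant (illustrative assumption).
    """
    y = [x[n] - x[n - 1] for n in range(1, len(x))]  # backward difference
    psi_x = teager(x)  # psi_x[i] is centered on x[i + 1]
    psi_y = teager(y)  # psi_y[i] is centered on x[i + 2]
    estimates = []
    for i, py in enumerate(psi_y):
        px = psi_x[i + 1]  # align both energies on the same sample
        if px <= 0.0:
            continue  # energy estimate unusable at this sample
        c = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))  # clamp for acos
        omega = math.acos(c)
        s = math.sin(omega)
        if s == 0.0:
            continue  # frequency at 0 or pi: amplitude is not recoverable
        estimates.append((math.sqrt(px) / s, omega))
    return estimates

# Constant-amplitude, constant-frequency test tone.
tone = [0.8 * math.cos(0.3 * n + 0.1) for n in range(50)]
am_fm = estimate_am_fm(tone)
```

For the test tone, every pair in `am_fm` should be close to an amplitude of 0.8 and a frequency of 0.3 rad/sample. With noisy or rapidly modulated signals the operator degrades, which is part of the motivation for more robust estimators.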

Magnitude-only estimation of handset nonlinearity with application to speaker recognition

Published in:
Proc. of the 1998 IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. II, Speech Processing II; Neural Networks for Signal Processing, 12-15 May 1998, pp. 745-748.

Summary

A method is described for estimating telephone handset nonlinearity by matching the spectral magnitude of the distorted signal to the output of a nonlinear channel model, driven by an undistorted reference. The "magnitude-only" representation allows the model to directly match unwanted speech formants that arise over nonlinear channels and that are a potential source of degradation in speaker and speech recognition algorithms. As such, the method is particularly suited to algorithms that use only spectral magnitude information. The distortion model consists of a memoryless polynomial nonlinearity sandwiched between two finite-length linear filters. Minimization of a mean-squared spectral magnitude error, with respect to model parameters, relies on iterative estimation via a gradient descent technique, using a Jacobian in the iterative correction term with gradients calculated by finite-element approximation. Initial work has demonstrated the algorithm's usefulness in speaker recognition over telephone channels by reducing mismatch between high- and low-quality handset conditions.
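
The distortion model described in the summary (a memoryless polynomial between two finite-length linear filters) can be sketched as a forward pass. This shows only the model structure; the estimation of its parameters by gradient descent on a spectral-magnitude error is not shown. The function names and the coefficient values in the example are hypothetical.

```python
def fir(x, h):
    """Causal FIR filter: y[n] = sum_k h[k] * x[n - k], same length as x."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]

def polynomial(v, coeffs):
    """Memoryless polynomial c0 + c1*v + c2*v^2 + ... via Horner's rule."""
    out = 0.0
    for c in reversed(coeffs):
        out = out * v + c
    return out

def channel_model(x, pre_filter, coeffs, post_filter):
    """Linear filter -> memoryless polynomial -> linear filter."""
    v = fir(x, pre_filter)
    w = [polynomial(s, coeffs) for s in v]
    return fir(w, post_filter)

# Hypothetical parameters chosen only to exercise the structure.
distorted = channel_model(
    [1.0, 2.0, -1.0],        # undistorted reference samples
    pre_filter=[0.5, 0.5],   # two-tap averaging filter
    coeffs=[0.0, 1.0, 0.5],  # v + 0.5 * v^2
    post_filter=[1.0, -1.0], # first difference
)
```

With these values the pre-filter output is 0.5, 1.5, 0.5, the polynomial maps it to 0.625, 2.625, 0.625, and the post-filter yields 0.625, 2.0, -2.0.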

Audio signal processing based on sinusoidal analysis/synthesis

Published in:
Chapter 9 in Applications of Digital Signal Processing to Audio and Acoustics, 1998, pp. 343-416.

Summary

Based on a sinusoidal model, an analysis/synthesis technique is developed that characterizes audio signals, such as speech and music, in terms of the amplitudes, frequencies, and phases of the component sine waves. These parameters are estimated by applying a peak-picking algorithm to the short-time Fourier transform of the input waveform. Rapid changes in the highly resolved spectral components are tracked by using a frequency-matching algorithm and the concept of "birth" and "death" of the underlying sine waves. For a given frequency track, a cubic phase function is applied to the sine-wave generator, whose output is amplitude-modulated and added to sines for other frequency tracks. The resulting synthesized signal preserves the general waveform shape and is nearly perceptually indistinguishable from the original, thus providing the basis for a variety of applications including signal modification, sound splicing, morphing and extrapolation, and estimation of sound characteristics such as vibrato. Although this sine-wave analysis/synthesis is applicable to arbitrary signals, tailoring the system to a specific sound class can improve performance. A source/filter phase model is introduced within the sine-wave representation to improve signal modification, as in time-scale and pitch change and dynamic range compression, by attaining phase coherence where sine-wave phase relations are preserved or controlled. A similar method of achieving phase coherence is also applied in revisiting the classical phase vocoder to improve modification of certain signal classes. A second refinement of the sine-wave analysis/synthesis invokes an additive deterministic/stochastic representation of sounds consisting of simultaneous harmonic and aharmonic contributions. A method of frequency tracking is given for the separation of these components, and is used in a number of applications. The sine-wave model is also extended to two additively combined signals for the separation of simultaneous talkers or music duets. Finally, the use of sine-wave analysis/synthesis in providing insight for FM synthesis is described, and remaining challenges, such as an improved sine-wave representation of rapid attacks and other transient events, are presented.
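
The peak-picking step mentioned in the summary can be sketched for a single analysis frame as follows. This is a minimal illustration, assuming a periodic Hann window, a direct DFT, and a relative magnitude threshold; none of these choices are taken from the chapter, and the name `pick_peaks` is hypothetical. Frequency matching across frames, track birth and death, and cubic phase interpolation are not shown.

```python
import cmath
import math

def pick_peaks(frame, rel_threshold=0.01):
    """Return (bin, amplitude, phase) for local maxima of one windowed frame."""
    n = len(frame)
    # Periodic Hann window (an assumption; any analysis window could be used).
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]
    gain = sum(window)
    # Direct DFT of the windowed frame for bins 0 .. n/2.
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for i in range(n):
            acc += window[i] * frame[i] * cmath.exp(-2j * math.pi * k * i / n)
        spectrum.append(acc)
    mags = [abs(v) for v in spectrum]
    floor = rel_threshold * max(mags)
    peaks = []
    for k in range(1, len(mags) - 1):
        if mags[k] > floor and mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]:
            # Scale by 2/sum(window) to recover the sine-wave amplitude.
            peaks.append((k, 2.0 * mags[k] / gain, cmath.phase(spectrum[k])))
    return peaks

# Two sine waves placed exactly on DFT bins 10 and 40 of a 128-sample frame.
N = 128
frame = [
    math.cos(2.0 * math.pi * 10 * i / N + 0.5)
    + 0.5 * math.cos(2.0 * math.pi * 40 * i / N - 1.0)
    for i in range(N)
]
peaks = pick_peaks(frame)
```

For this frame the function should report peaks at bins 10 and 40, with amplitudes near 1.0 and 0.5 and phases near 0.5 and -1.0 rad. For components that fall between bins, the estimates would be biased unless the peak location is refined, for example by interpolation.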