Publications
Modeling of the glottal flow derivative waveform with application to speaker identification
Summary
An automatic technique for estimating and modeling the glottal flow derivative source waveform from speech, and applying the model parameters to speaker identification, is presented. The estimate of the glottal flow derivative is decomposed into coarse structure, representing the general flow shape, and fine structure, comprising aspiration and other perturbations...
Understanding-based translingual information retrieval
Summary
This paper describes our preliminary research on an understanding-based translingual information retrieval system for which the input to the system is a query sentence in English, and the output of the system is a set of documents either in English or in Korean. The understanding module produces a meaning representation...
Security implications of adaptive multimedia distribution
Summary
We discuss the security implications of different techniques used in adaptive audio and video distribution. Several sources of variability in the network make it necessary for applications to adapt. Ideally, each receiver should receive media quality commensurate with the capacity of the path leading to it from each sender. Several...
Automatic speaker clustering from multi-speaker utterances
Summary
Blind clustering of multi-person utterances by speaker is complicated by the fact that each utterance has at least two talkers. In the case of a two-person conversation, one can simply split each conversation into its respective speaker halves, but this introduces error which ultimately hurts clustering. We propose a clustering...
Corpora for the evaluation of speaker recognition systems
Summary
Using standard speech corpora for development and evaluation has proven to be very valuable in promoting progress in speech and speaker recognition research. In this paper, we present an overview of current publicly available corpora intended for speaker recognition research and evaluation. We outline the corpora's salient features with respect...
Implications of glottal source for speaker and dialect identification
Summary
In this paper we explore the importance of speaker-specific information carried in the glottal source. We time-align utterances of two speakers speaking the same sentence from the TIMIT database of American English. We then extract the glottal flow derivative from each speaker and interchange them. Through time alignment...
'Perfect reconstruction' time-scaling filterbanks
Summary
A filterbank-based method of time-scale modification is analyzed for elemental signals including clicks, sines, and AM-FM sines. It is shown that with the use of some basic properties of linear systems, as well as FM-to-AM filter transduction, "perfect reconstruction" time-scaling filterbanks can be constructed for these elemental signal classes under...
Evaluating intrusion detection systems without attacking your friends: The 1998 DARPA intrusion detection evaluation
Summary
Intrusion detection systems monitor the use of computers and the network over which they communicate, searching for unauthorized use, anomalous behavior, and attempts to deny users, machines or portions of the network access to services. Potential users of such systems need information that is rarely found in marketing literature, including...
Machine-assisted language translation for U.S./RoK Combined Forces Command
Summary
The U.S. military must operate worldwide in a variety of international environments where many different languages are used. There is a critical need for translation, and there is a shortage of translators who can interpret military terminology specifically. One coalition environment where the need is particularly strong is in the...
Blind clustering of speech utterances based on speaker and language characteristics
Summary
Classical speaker and language recognition techniques can be applied to the classification of unknown utterances by computing the likelihoods of the utterances given a set of well-trained target models. This paper addresses the problem of grouping unknown utterances when no information is available regarding the speaker or language classes...