As ever greater numbers of telephone transactions are being conducted solely between a caller and an automated answering system, the need increases for software which can automatically identify and authenticate these callers without the need for an onerous speaker enrollment process. In this paper we introduce and investigate a novel speaker detection and tracking (SDT) technique, which dynamically merges the traditional enrollment and recognition phases of the static speaker recognition task. In this speaker recognition application, no prior speaker models exist and the goal is to detect and model new speakers as they call into the system while also recognizing utterances from the previously modeled callers. New speakers are added to the enrolled set of speakers and speech from speakers in the currently enrolled set is used to update models. We describe a system based on a GMM speaker identification (SID) system and develop a new measure to evaluate the performance of the system on the SDT task. Results for both static, open-set detection and the SDT task are presented using a portion of the Switchboard corpus of telephone speech communications. Static open-set detection produces an equal error rate of about 5%. As expected, performance for SDT is quite varied, depending greatly on the speaker set and ordering of the test sequence. These initial results, however, are quite promising and point to potential areas in which to improve the system performance.
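The equal error rate quoted above is the operating point at which the miss rate on true speakers equals the false-alarm rate on impostors. The following is a minimal sketch of that calculation, not the authors' evaluation code; the function name `equal_error_rate` and the toy scores are hypothetical, and it assumes that a higher score means a more likely match.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Return (eer, threshold) at the point where miss and false-alarm rates are closest.

    Assumption: a trial is accepted when its score is >= threshold.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # Miss: a true-speaker trial scored below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: an impostor trial scored at or above the threshold.
        false_alarm = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(miss - false_alarm)
        if best is None or gap < best[0]:
            best = (gap, (miss + false_alarm) / 2.0, t)
    return best[1], best[2]


# Toy scores, invented for illustration only.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
```

With finite score lists the two error rates rarely cross exactly, so the sketch reports the average of the two rates at the closest threshold.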

Speaker detection and tracking for telephone transactions

Speech enhancement based on auditory spectral change

May 13, 2002

Conference Paper

Author:

Thomas F. Quatieri

…

Robert B. Dunn

Published in:

Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. I, Speech Processing Neural Networks for Signal Processing, 13-17 May 2002, pp. I-257 - I-260.

Topic:

speech enhancement

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

In this paper, an adaptive approach to the enhancement of speech signals is developed based on auditory spectral change. The algorithm is motivated by sensitivity of aural biologic systems to signal dynamics, by evidence that noise is aurally masked by rapid changes in a signal, and by analogies to these two aural phenomena in biologic visual processing. Emphasis is on preserving nonstationarity, i.e., speech transient and time-varying components, such as plosive bursts, formant transitions, and vowel onsets, while suppressing additive noise. The essence of the enhancement technique is a Wiener filter that uses a desired signal spectrum whose estimation adapts to stationarity of the measured signal. The degree of stationarity is derived from a signal change measurement, based on an auditory spectrum that accentuates change in spectral bands. The adaptive filter is applied in an unconventional overlap-add analysis/synthesis framework, using a very short 4-ms analysis window and a 1-ms frame interval. In informal listening, the reconstructions are judged to be "crisp" corresponding to good temporal resolution of transient and rapidly-moving speech events.
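As a rough illustration of the two ingredients named in this summary, the sketch below converts the 4-ms window and 1-ms frame interval into sample counts and computes a textbook per-band Wiener gain, H = S / (S + N). It is not the paper's algorithm: the stationarity-adaptive spectrum estimate is not modeled, the 8 kHz sampling rate is an assumption, and the names `frame_params` and `wiener_gain` are hypothetical.

```python
def frame_params(sample_rate_hz, window_ms=4.0, hop_ms=1.0):
    """Convert the analysis window and frame interval from milliseconds to samples."""
    window = int(round(sample_rate_hz * window_ms / 1000.0))
    hop = int(round(sample_rate_hz * hop_ms / 1000.0))
    return window, hop


def wiener_gain(signal_power, noise_power):
    """Per-band Wiener gain H = S / (S + N) from power estimates.

    Assumption: signal_power is an estimate of the desired (clean) spectrum;
    how that estimate adapts to stationarity is outside this sketch.
    """
    gains = []
    for s, n in zip(signal_power, noise_power):
        total = s + n
        gains.append(s / total if total > 0.0 else 0.0)
    return gains


# At an assumed 8 kHz sampling rate, 4 ms is 32 samples and 1 ms is 8 samples.
window, hop = frame_params(8000)
gains = wiener_gain([3.0, 0.0, 1.0], [1.0, 1.0, 1.0])
```

Bands where the estimated signal dominates pass nearly unchanged, while bands dominated by noise are attenuated toward zero.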

Automated generation and analysis of attack graphs

May 12, 2002

Conference Paper

Author:

O. Sheyner

…

Published in:

Proc. of the 2002 IEEE Symp. on Security and Privacy, 12-15 May 2002, pp. 254-265.

Topic:

attack graphs

R&D area:

Cyber Security and Information Sciences

R&D group:

Cyber Operations and Analysis Technology

Summary

An integral part of modeling the global view of network security is constructing attack graphs. In practice, attack graphs are produced manually by Red Teams. Construction by hand, however, is tedious, error-prone, and impractical for attack graphs larger than a hundred nodes. In this paper we present an automated technique for generating and analyzing attack graphs. We base our technique on symbolic model checking algorithms, letting us construct attack graphs automatically and efficiently. We also describe two analyses to help decide which attacks would be most cost-effective to guard against. We implemented our techniques in a tool suite and tested it on a small network example, which includes models of a firewall and an intrusion detection system.

Speech-to-speech translation: technology and applications study

May 10, 2002

Technical Report

Author:

Clifford J. Weinstein

Published in:

MIT Lincoln Laboratory Report TR-1080

Topic:

machine translation

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

This report describes a study effort on the state of the art and lessons learned in automated, two-way, speech-to-speech translation and its potential application to military problems. The study includes and comments upon an extensive set of references on prior and current work in speech translation. The study includes recommendations on future military applications and on R&D needed to successfully achieve those applications. Key findings of the study include: (1) R&D speech translation systems have been demonstrated, but only in limited domains, and their performance is inadequate for operational use; (2) as far as we have been able to determine, there are currently no operational two-way speech translation systems; (3) intensive, sustained R&D will be needed to develop usable two-way speech translation systems. Major recommendations include: (1) a substantial R&D program in speech translation is needed, especially including full end-to-end system prototyping and evaluation; (2) close cooperation among researchers and users speaking multiple languages will be needed for the development of useful application systems; (3) to get military users involved and interacting in a mode which enables them to provide useful inputs and feedback on system requirements and performance, it will be necessary to provide them at the start with a fairly robust, open-domain system which works to the degree that some two-way speech translation is operational.

PVL: An Object Oriented Software Library for Parallel Signal Processing (Abstract)

January 1, 2002

Conference Paper

Author:

Edward M. Rutledge

…

Jeremy Kepner

Published in:

CLUSTER '01, 2001 IEEE Int. Conf. on Cluster Computing, 8-11 October 2001, p. 74.

Topic:

signal processing

R&D area:

R&D group:

Embedded and Open Systems

Summary

Real-time signal processing consumes the majority of the world's computing power. Increasingly, programmable parallel microprocessors are used to address a wide variety of signal processing applications (e.g., scientific, video, wireless, medical, communication, encoding, radar, sonar, and imaging). In programmable systems the major challenge is no longer hardware but software. Specifically, the key technical hurdle lies in mapping (i.e., placement and routing) of an algorithm onto a parallel computer in a general manner that preserves software portability. We have developed the Parallel Vector Library (PVL) to allow signal processing algorithms to be written using high-level, MATLAB-like constructs that are independent of the underlying parallel mapping. Programs written using PVL can be ported to a wide range of parallel computers without sacrificing performance. Furthermore, the mapping concepts in PVL provide the infrastructure for enabling new capabilities such as fault tolerance, dynamic scheduling, and self-optimization. This presentation discusses PVL with particular emphasis on quantitative comparisons with standard parallel signal programming practices.

Gender-dependent phonetic refraction for speaker recognition

January 1, 2002

Conference Paper

Author:

Walter D. Andrews

…

Published in:

Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, 13-17 May 2002, Vol. 1, pp. 149-152.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

This paper describes improvements to an innovative high-performance speaker recognition system. Recent experiments showed that, with sufficient training data, phone strings from multiple languages are exceptional features for speaker recognition. The prototype phonetic speaker recognition system used phone sequences from six languages to produce an equal error rate of 11.5% on Switchboard-I audio files. The improved system described in this paper reduces the equal error rate to less than 4%. This is accomplished by incorporating gender-dependent phone models, pre-processing the speech files to remove cross-talk, and developing more sophisticated fusion techniques for the multi-language likelihood scores.

Language identification using Gaussian mixture model tokenization

January 1, 2002

Conference Paper

Author:

Pedro A. Torres-Carrasquillo

…

Published in:

Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. I, 13-17 May 2002, pp. I-757 - I-760.

Topic:

language recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Phone tokenization followed by n-gram language modeling has consistently provided good results for the task of language identification. In this paper, this technique is generalized by using Gaussian mixture models as the basis for tokenizing. Performance results are presented for a system employing a GMM tokenizer in conjunction with multiple language processing and score combination techniques. On the 1996 CallFriend LID evaluation set, a 12-way closed set error rate of 17% was obtained.

Interlingua-based English-Korean two-way speech translation of doctor-patient dialogues with CCLINC

January 1, 2002

Journal Article

Author:

Young-Suk Lee

…

Published in:

Machine Trans. Vol. 17, No. 3, 2002, pp. 213-243.

Topic:

machine translation

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Development of a robust two-way real-time speech translation system exposes researchers and system developers to various challenges of machine translation (MT) and spoken language dialogues. The need for communicating in at least two different languages poses problems not present for a monolingual spoken language dialogue system, where no MT engine is embedded within the process flow. Integration of various component modules for real-time operation poses challenges not present for text translation. In this paper, we present the CCLINC (Common Coalition Language System at Lincoln Laboratory) English-Korean two-way speech translation system prototype trained on doctor-patient dialogues, which integrates various techniques to tackle the challenges of automatic real-time speech translation. Key features of the system include (i) language-independent meaning representation which preserves the hierarchical predicate-argument structure of an input utterance, providing a powerful mechanism for discourse understanding of utterances originating from different languages, word-sense disambiguation and generation of various word orders of many languages, (ii) adoption of the DARPA Communicator architecture, a plug-and-play distributed system architecture which facilitates integration of component modules and system operation in real time, and (iii) automatic acquisition of grammar rules and lexicons for easy porting of the system to different languages and domains. We describe these features in detail and present experimental results.

The effect of personality type on the usage of a multimedia engineering education system

January 1, 2002

Journal Article

Author:

Albert I. Reuther

…

D. G. Meyer

Published in:

32nd Annual ASEE/IEEE Frontiers in Education Conf., 6-9 November 2002, pp. T3A-7 - T3A-12.

Topic:

computing

R&D area:

R&D group:

Embedded and Open Systems

Summary

Multimedia education has quickly entered our classrooms and offices, providing tutorials and lessons on many different topics. It is easy to assume that most people interact with these multimedia systems in similar ways, but is that assumption valid? What factors determine whether students will embrace computer-based multimedia-augmented learning? One factor may be the student's personality type. This paper explores the reasons why some students may enjoy learning using computer-based educational delivery systems while others may have absolutely no enthusiasm for this type of learning and how that enthusiasm may relate to the students' personality types.

Detecting clusters of galaxies in the Sloan Digital Sky Survey. I. Monte Carlo comparison of cluster detection algorithms

January 1, 2002

Journal Article

Author:

Rita Seung Jung Kim

…

Published in:

Astron. J., Vol. 123, No. 1, January 2002, pp. 20-36.

Topic:

space

R&D area:

R&D group:

Embedded and Open Systems

Summary

We present a comparison of three cluster-finding algorithms from imaging data using Monte Carlo simulations of clusters embedded in a 25 deg² region of Sloan Digital Sky Survey (SDSS) imaging data: the matched filter (MF), the adaptive matched filter (AMF), and a color-magnitude filtered Voronoi tessellation technique (VTT). Among the two matched filters, we find that the MF is more efficient in detecting faint clusters, whereas the AMF evaluates the redshifts and richnesses more accurately, therefore suggesting a hybrid method (HMF) that combines the two. The HMF outperforms the VTT when using a background that is uniform, but it is more sensitive to the presence of a nonuniform galaxy background than is the VTT; this is due to the assumption of a uniform background in the HMF model. We thus find that for the detection thresholds we determine to be appropriate for the SDSS data, the performance of both algorithms is similar; we present the selection function for each method evaluated with these thresholds as a function of redshift and richness. For simulated clusters generated with a Schechter luminosity function (M*_r = -21.5 and α = -1.1), both algorithms are complete for Abell richness >~ clusters up to z ~ 0.4 for a sample magnitude limited to r = 21. While the cluster parameter evaluation shows a mild correlation with the local background density, the detection efficiency is not significantly affected by the background fluctuations, unlike previous shallower surveys.
