Publications

Interlingua-based broad-coverage Korean-to-English translation in CCLINC

Published in:
Proc. First Int. Conf. on Human Language Technology, 18-21 March 2001.

Summary

At MIT Lincoln Laboratory, we have been developing a Korean-to-English machine translation system, CCLINC (Common Coalition Language System at Lincoln Laboratory). The CCLINC Korean-to-English translation system consists of two core modules, language understanding and language generation, mediated by a language-neutral meaning representation called a semantic frame. The key features of the system include: (i) Robust, efficient parsing of Korean (a verb-final language with overt case markers, relatively free word order, and frequent omission of arguments). (ii) High-quality translation via word sense disambiguation and accurate word order generation of the target language. (iii) Rapid system development and porting to new domains via knowledge-based automated acquisition of grammars. Having been trained on Korean newspaper articles on "missiles" and "chemical biological warfare," the system produces translation output sufficient for content understanding of the original document.

The use of dynamic segment scoring for language-independent question answering

Published in:
Proc. 1st Int. Conf. on Human Language Technology Research, HLT, 18-21 March 2001.

Summary

This paper presents a novel language-independent question/answering (Q/A) system based on natural language processing techniques, shallow query understanding, dynamic sliding window techniques, and statistical proximity distribution matching techniques. The performance of the proposed system using the latest Text REtrieval Conference (TREC-8) data was comparable to results reported by the top TREC-8 contenders.

High Speed Interconnects and Parallel Software Libraries: Enabling Technologies for NVO

Published in:
Proc. of the Astronomical Society of the Pacific Conf. Series, Vol. 225, 2001, Virtual Observations of the Future, 13-16 June 2000, pp. 297-301.

Summary

The National Virtual Observatory (NVO) will directly or indirectly touch upon all steps in the process of transforming raw observational data into "meaningful" results. These steps include: (1) Acquisition and storage of raw data. (2) Data reduction (i.e., translating raw data into source detections). (3) Acquisition and storage of detected sources. (4) Multi-sensor/multi-temporal data mining of the products of steps (1), (2) and (3). (Not complete.)

Exploiting VSIPL and OpenMP for Parallel Image Processing

Published in:
ADASS 2000, Astronomical Data Analysis Software and Systems X, 12-14 November 2000, pp. 209-212.

Summary

VSIPL and OpenMP are two open standards for portable high performance computing. VSIPL delivers optimized single-processor performance, while OpenMP provides a low-overhead mechanism for executing thread-based parallelism on shared memory systems. Image processing is one of the main areas where VSIPL and OpenMP can have a large impact. Currently, a large fraction of image processing applications are written in the Interactive Data Language (IDL) environment. The aim of this work is to demonstrate that the performance benefits of these new standards can be brought to the image processing community in a high-level manner that is transparent to users. To this end, this talk presents a fast, FFT-based algorithm for performing image convolutions. This algorithm has been implemented within the IDL environment using VSIPL (for optimized single-processor performance) with added OpenMP directives (for parallelism). This work demonstrates that good parallel speedups are attainable using standards and can be integrated seamlessly into existing user environments.
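As a rough illustration of the FFT-based convolution idea mentioned in this summary, the sketch below computes a circular 2-D convolution by multiplying the spectra of the image and the kernel. It is not the authors' VSIPL/OpenMP implementation: the function names, the pure-Python radix-2 FFT, and the restriction to power-of-two dimensions with an image-sized kernel are all assumptions made for this example.

```python
import cmath


def fft(x, inverse=False):
    """Unnormalized recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def fft2(a, inverse=False):
    """2-D transform: 1-D FFT of every row, then of every column."""
    rows = [fft(r, inverse) for r in a]
    cols = [fft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def circular_convolve2d(image, kernel):
    """Circular convolution of two equally sized real 2-D arrays."""
    fi = fft2(image)
    fk = fft2(kernel)
    # Pointwise product in the frequency domain equals convolution in space.
    prod = [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(fi, fk)]
    n = len(image) * len(image[0])  # scale factor for the unnormalized inverse
    return [[(v / n).real for v in row] for row in fft2(prod, inverse=True)]
```

Each row transform (and each column transform) in `fft2` is independent of the others, which is the kind of loop that a thread-based directive could in principle divide among processors.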

The Lincoln speaker recognition system: NIST EVAL2000

Published in:
6th Int. Conf. on Spoken Language, ICSLP, 16-20 October 2000.

Summary

This paper presents an overview of the Lincoln Laboratory systems fielded for the 2000 NIST speaker recognition evaluation (SRE00). In addition to the standard one-speaker detection tasks, this year's evaluation, as in 1999, included multi-speaker spokes dealing with detection, tracking and segmentation. The design approach for the Lincoln system in SRE00 was to develop a set of core one-speaker detection and multi-speaker clustering tools that could be applied to all the tasks. This paper will describe these core systems, how they are applied to the SRE00 tasks and the results they produce. Additionally, a new channel normalization technique known as handset-dependent test-score norm (HTnorm) is introduced.

Analysis and results of the 1999 DARPA off-line intrusion detection evaluation

Published in:
Proc. Recent Advances in Intrusion Detection, RAID, 2-4 October 2000, pp. 162-182.

Summary

Eight sites participated in the second DARPA off-line intrusion detection evaluation in 1999. Three weeks of training and two weeks of test data were generated on a test bed that emulates a small government site. More than 200 instances of 58 attack types were launched against victim UNIX and Windows NT hosts. False alarm rates were low (less than 10 per day). Best detection was provided by network-based systems for old probe and old denial-of-service (DOS) attacks and by host-based systems for Solaris user-to-root (U2R) attacks. Best overall performance would have been provided by a combined system that used both host- and network-based intrusion detection. Detection accuracy was poor for previously unseen, new, stealthy, and Windows NT attacks. Ten of the 58 attack types were completely missed by all systems. Systems missed attacks because protocols and TCP services were not analyzed at all or to the depth required, because signatures for old attacks did not generalize to new attacks, and because auditing was not available on all hosts.

The 1999 DARPA Off-Line Intrusion Detection Evaluation

Published in:
Comput. Networks, Vol. 34, No. 4, October 2000, pp. 579-595.

Summary

Eight sites participated in the second Defense Advanced Research Projects Agency (DARPA) off-line intrusion detection evaluation in 1999. A test bed generated live background traffic similar to that on a government site containing hundreds of users on thousands of hosts. More than 200 instances of 58 attack types were launched against victim UNIX and Windows NT hosts in three weeks of training data and two weeks of test data. False-alarm rates were low (less than 10 per day). The best detection was provided by network-based systems for old probe and old denial-of-service (DOS) attacks and by host-based systems for Solaris user-to-root (U2R) attacks. The best overall performance would have been provided by a combined system that used both host- and network-based intrusion detection. Detection accuracy was poor for previously unseen, new, stealthy and Windows NT attacks. Ten of the 58 attack types were completely missed by all systems. Systems missed attacks because signatures for old attacks did not generalize to new attacks, auditing was not available on all hosts, and protocols and TCP services were not analyzed at all or to the depth required. Promising capabilities were demonstrated by host-based systems, anomaly detection systems and a system that performs forensic analysis on file system data.

Estimation of handset nonlinearity with application to speaker recognition

Published in:
IEEE Trans. Speech Audio Process., Vol. 8, No. 5, September 2000, pp. 567-584.

Summary

A method is described for estimating telephone handset nonlinearity by matching the spectral magnitude of the distorted signal to the output of a nonlinear channel model, driven by an undistorted reference. This "magnitude-only" representation allows the model to directly match unwanted speech formants that arise over nonlinear channels and that are a potential source of degradation in speaker and speech recognition algorithms. As such, the method is particularly suited to algorithms that use only spectral magnitude information. The distortion model consists of a memoryless nonlinearity sandwiched between two finite-length linear filters. Nonlinearities considered include arbitrary finite-order polynomials and parametric sigmoidal functionals derived from a carbon-button handset model. Minimization of a mean-squared spectral magnitude distance with respect to model parameters relies on iterative estimation via a gradient descent technique. Initial work has demonstrated the importance of addressing handset nonlinearity, in addition to linear distortion, in speaker recognition over telephone channels. A nonlinear handset "mapping," applied to training or testing data to reduce mismatch between different types of handset microphone outputs, improves speaker verification performance relative to linear compensation only. Finally, a method is proposed to merge the mapper strategy with a method of likelihood score normalization (hnorm) for further mismatch reduction and speaker verification performance improvement.
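The structure of the distortion model described in this summary (a linear filter, then a memoryless nonlinearity, then a second linear filter) can be sketched as a forward pass only. The code below is a minimal illustration under stated assumptions, not the paper's method: the function names are hypothetical, the nonlinearity is taken to be a polynomial, and the gradient-descent fitting of parameters to spectral magnitudes is not shown.

```python
def fir(signal, taps):
    """Causal finite-length linear filter: y[n] = sum_k taps[k] * x[n-k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def polynomial(x, coeffs):
    """Memoryless nonlinearity; coeffs[i] multiplies x**i."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def handset_model(reference, pre_taps, poly_coeffs, post_taps):
    """Filter, distort sample by sample, then filter again."""
    pre = fir(reference, pre_taps)
    distorted = [polynomial(v, poly_coeffs) for v in pre]
    return fir(distorted, post_taps)
```

With `pre_taps=[1.0]`, `poly_coeffs=[0.0, 1.0]`, and `post_taps=[1.0]`, the model reduces to the identity, which is a convenient sanity check before any nonlinear terms are introduced.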

Speaker recognition using G.729 speech codec parameters

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. II, 5-9 June 2000, pp. 1089-1092.

Summary

Experiments in Gaussian-mixture-model speaker recognition from mel-filter bank energies (MFBs) of the G.729 codec all-pole spectral envelope showed significant performance loss relative to the standard mel-cepstral coefficients of G.729 synthesized (coded) speech. In this paper, we investigate two approaches to recover speaker recognition performance from G.729 parameters, rather than deriving cepstra from MFBs of an all-pole spectrum. Specifically, the G.729 LSFs are converted to "direct" cepstral coefficients for which there exists a one-to-one correspondence with the LSFs. The G.729 residual is also considered; in particular, appending G.729 pitch as a single parameter to the direct cepstral coefficients gives further performance gain. The second nonparametric approach uses the original MFB paradigm, but adds harmonic striations to the G.729 all-pole spectral envelope. Although we obtain considerable performance gains with these methods, we have yet to match the performance of G.729 synthesized speech, motivating the need for representing additional fine structure of the G.729 residual.

The NIST Speaker Recognition Evaluation - overview, methodology, systems, results, perspective

Published in:
Speech Commun., Vol. 31, Nos. 2-3, June 2000, pp. 225-254.

Summary

This paper, based on three presentations made in 1998 at the RLA2C Workshop in Avignon, discusses the evaluation of speaker recognition systems from several perspectives. A general discussion of the speaker recognition task and the challenges and issues involved in its evaluation is offered. The NIST evaluations in this area and specifically the 1998 evaluation, its objectives, protocols and test data, are described. The algorithms used by the systems that were developed for this evaluation are summarized, compared and contrasted. Overall performance results of this evaluation are presented by means of detection error trade-off (DET) curves. These show the performance trade-off of missed detections and false alarms for each system and the effects on performance of training condition, test segment duration, the speakers' sex and the match or mismatch of training and test handsets. Several factors that were found to have an impact on performance, including pitch frequency, handset type and noise, are discussed and DET curves showing their effects are presented. The paper concludes with some perspective on the history of this technology and where it may be going.
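The trade-off that a DET curve displays can be illustrated by sweeping a decision threshold over detection scores and recording the false-alarm and miss rates at each setting. The sketch below is an assumption-laden illustration rather than the NIST scoring software: the function name `det_points` and the convention that a score at or above the threshold counts as a detection are choices made for this example, and the normal-deviate axis scaling normally used to plot DET curves is omitted.

```python
def det_points(target_scores, nontarget_scores):
    """Return (threshold, p_false_alarm, p_miss) for each candidate threshold."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    points = []
    for t in thresholds:
        # A target trial scoring below the threshold is a missed detection.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # A non-target trial scoring at or above the threshold is a false alarm.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        points.append((t, p_fa, p_miss))
    return points
```

Raising the threshold moves along the curve toward fewer false alarms and more misses; comparing systems or conditions amounts to comparing the resulting sets of points.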