Publications

USSS-MITLL 2010 human assisted speaker recognition

January 2, 2011

Conference Paper

Author:

Reva Schwartz

…

Published in:

Proc. IEEE ICASSP, 26 May 2011, pp. 5904-7.

Topic:

biometrics

R&D area:

Cyber Security and Information Sciences

R&D group:

Summary

The United States Secret Service (USSS) teamed with MIT Lincoln Laboratory (MIT/LL) in the US National Institute of Standards and Technology's 2010 Speaker Recognition Evaluation of Human Assisted Speaker Recognition (HASR). We describe our qualitative and automatic speaker comparison processes and our fusion of these processes, which are adapted from USSS casework. The USSS-MIT/LL 2010 HASR results are presented. We also present post-evaluation results. The results are encouraging within the resolving power of the evaluation, which was limited to enable reasonable levels of human effort. Future ideas and efforts are discussed, including new features and capitalizing on naive listeners.

Using United States government language proficiency standards for MT evaluation

January 1, 2011

Book Chapter

Author:

Douglas A. Jones

…

Published in:

Chapter 5.3.3 in Handbook of Natural Language Processing and Machine Translation, 2011, pp. 775-82.

Topic:

human language technology

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

The purpose of this section is to discuss a method of measuring the degree to which the essential meaning of the original text is communicated in the MT output. We view this test as a measurement of the fundamental goal of MT: to convey information accurately from one language to another. We conducted a series of experiments in which educated native readers of English responded to test questions about translated versions of texts originally written in Arabic and Chinese. We compared the results for subjects using machine translations of the texts with the results for subjects using professional reference translations. These comparisons serve as a baseline for determining the level of foreign language reading comprehension that can be achieved by a native English reader relying on machine translation technology. This also allows us to explore the relationship between the current, broadly accepted automatic measures of performance for machine translation and a test derived from the Defense Language Proficiency Test, which is used throughout the Defense Department for measuring foreign language proficiency. Our goal is to put MT system performance evaluation into terms that are meaningful to US government consumers of MT output.

Topic identification

January 1, 2011

Book Chapter

Author:

Timothy J. Hazen

Published in:

Chapter 12, Spoken Language Understanding: Systems for Extracting Semantic Information from Speech, Gokhan Tur and Renato De Mori, eds., 2011, pp. 319-356.

Topic:

topic identification

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

In this chapter we discuss the problem of identifying the underlying topics being discussed in spoken audio recordings. We focus primarily on the issues related to supervised topic classification or detection tasks using labeled training data, but we also discuss approaches for other related tasks including novel topic detection and unsupervised topic clustering. The chapter provides an overview of the common tasks and data sets, evaluation metrics, and algorithms most commonly used in this area of study.

Direct and latent modeling techniques for computing spoken document similarity

December 12, 2010

Conference Paper

Author:

Timothy J. Hazen

Published in:

SLT 2010, IEEE Workshop on Spoken Language Technology, 12-15 December 2010.

Topic:

topic identification

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Document similarity measures are required for a variety of data organization and retrieval tasks including document clustering, document link detection, and query-by-example document retrieval. In this paper we examine existing and novel document similarity measures for use with spoken document collections processed with automatic speech recognition (ASR) technology. We compare direct vector space approaches using the cosine similarity measure applied to feature vectors constructed with various forms of term frequency inverse document frequency (TF-IDF) normalization against latent topic modeling approaches based on latent Dirichlet allocation (LDA). In document link detection experiments on the Fisher Corpus, we find that an approach that applies bagging to models derived from LDA substantially outperforms the direct vector space approach.
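The direct vector space approach mentioned in this summary can be illustrated with a minimal sketch. It assumes plain whitespace-tokenized text and one simple TF-IDF variant (relative term frequency times log inverse document frequency); the paper compares several normalization forms that are not reproduced here, and the function names `tfidf_vectors` and `cosine_similarity` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized documents."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy stand-ins for ASR transcripts (assumed data, not from the Fisher Corpus).
docs = [
    "the flight was delayed at the airport".split(),
    "my flight left the airport late".split(),
    "we cooked pasta for dinner".split(),
]
vecs = tfidf_vectors(docs)
sim_travel = cosine_similarity(vecs[0], vecs[1])  # documents sharing terms
sim_mixed = cosine_similarity(vecs[0], vecs[2])   # documents with no shared terms
```

In this toy setup the two travel documents share weighted terms and score above zero, while the unrelated pair scores exactly zero. A latent topic model such as LDA would instead compare documents through inferred topic distributions, which is not shown here.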

Subgraph detection using eigenvector L1 norms

December 6, 2010

Conference Paper

Author:

Benjamin A. Miller

…

Published in:

23rd Int. Conf. on Neural Info. Process. Syst., NIPS, 6-9 December 2010, pp. 1633-41.

Topic:

signal processing

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

When working with network datasets, the theoretical framework of detection theory for Euclidean vector spaces no longer applies. Nevertheless, it is desirable to determine the detectability of small, anomalous graphs embedded into background networks with known statistical properties. Casting the problem of subgraph detection in a signal processing context, this article provides a framework and empirical results that elucidate a "detection theory" for graph-valued data. Its focus is the detection of anomalies in unweighted, undirected graphs through L1 properties of the eigenvectors of the graph's so-called modularity matrix. This metric is observed to have relatively low variance for certain categories of randomly-generated graphs, and to reveal the presence of an anomalous subgraph with reasonable reliability when the anomaly is not well-correlated with stronger portions of the background graph. An analysis of subgraphs in real network datasets confirms the efficacy of this approach.
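As a rough illustration of the quantity named in this summary, the sketch below computes the L1 norm of a unit-length eigenvector of a modularity matrix. It assumes the commonly used definition B = A - k k^T / (2m), with A the adjacency matrix, k the degree vector, and m the number of edges; the summary itself does not spell out the definition. Shifted power iteration is used only to keep the example dependency-free, and the function names and the toy graph are hypothetical.

```python
import math


def modularity_matrix(n, edges):
    """Return (B, degrees) with B = A - k k^T / (2m) for an undirected graph."""
    A = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1.0
    k = [sum(row) for row in A]  # vertex degrees
    two_m = sum(k)               # twice the number of edges
    B = [[A[i][j] - k[i] * k[j] / two_m for j in range(n)] for i in range(n)]
    return B, k


def principal_eigenvector(B, shift, iters=2000):
    """Power iteration on B + shift*I; returns a vector with unit L2 norm."""
    n = len(B)
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(n)) + shift * v[i] for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def l1_norm(v):
    return sum(abs(x) for x in v)


# Toy graph (assumed data): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
B, degrees = modularity_matrix(n, edges)
# Each row of B has absolute sum at most twice the vertex degree, so this
# shift makes every eigenvalue of B + shift*I non-negative.
u = principal_eigenvector(B, shift=2.0 * max(degrees))
u_l1 = l1_norm(u)
```

For a vector with unit L2 norm, the L1 norm lies between 1 (all weight on one vertex) and the square root of the number of vertices (weight spread evenly), so an unusually small value suggests an eigenvector concentrated on a few vertices.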

The MIT-LL/AFRL IWSLT-2010 MT system

December 2, 2010

Conference Paper

Author:

Wade Shen

…

Published in:

Proc. Int. Workshop on Spoken Language Translation, IWSLT, 2 December 2010.

Topic:

machine translation

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

This paper describes the MIT-LL/AFRL statistical MT system and the improvements that were developed during the IWSLT 2010 evaluation campaign. As part of these efforts, we experimented with a number of extensions to the standard phrase-based model that improve performance on the Arabic and Turkish to English translation tasks. We also participated in the new French to English BTEC and English to French TALK tasks. We discuss the architecture of the MIT-LL/AFRL MT system, improvements over our 2008 system, and experiments we ran during the IWSLT-2010 evaluation. Specifically, we focus on 1) cross-domain translation using MAP adaptation, 2) Turkish morphological processing and translation, 3) improved Arabic morphology for MT preprocessing, and 4) system combination methods for machine translation.

Physical layer considerations for wideband cognitive radio

October 31, 2010

Conference Paper

Author:

Joel I. Goodman

…

Published in:

MILCOM 2010, IEEE Military Communications Conference, 31 October-3 November 2010, pp. 2113-2118.

Topic:

signal processing

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Next generation cognitive radios will benefit from the capability of transmitting and receiving communications waveforms across many disjoint frequency channels spanning hundreds of megahertz of bandwidth. The information theoretic advantages of multi-channel operation for cognitive radio (CR), however, come at the expense of stringent linearity requirements on the analog transmit and receive hardware. This paper presents the quantitative advantages of multi-channel operation for next generation CR, and the advanced digital compensation algorithms to mitigate transmit and receive nonlinearities that enable broadband multi-channel operation. Laboratory measurements of the improvement in the performance of a multi-channel CR communications system operating below 2 GHz in over 500 MHz of instantaneous bandwidth are presented.

Graph-embedding for speaker recognition

September 30, 2010

Conference Paper

Author:

Zahi N. Karam

…

William M. Campbell

Published in:

INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, 26-30 September 2010, pp. 2742-2745.

Topic:

social network

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Popular methods for speaker classification perform speaker comparison in a high-dimensional space; however, recent work has shown that most of the speaker variability is captured by a low-dimensional subspace of that space. In this paper we examine whether additional structure, in terms of nonlinear manifolds, exists within the high-dimensional space. We will use graph embedding as a proxy for the manifold and show the use of the embedding in data visualization and exploration. ISOMAP will be used to explore the existence and dimension of the space. We also examine whether the manifold assumption can help in two classification tasks: data-mining and standard NIST speaker recognition evaluations (SRE). Our results show that the data lives on a manifold and that exploiting this structure can yield significant improvements on the data-mining task. The improvement in preliminary experiments on all trials of the NIST SRE Eval-06 core task is smaller but still significant.

Transcript-dependent speaker recognition using mixer 1 and 2

September 26, 2010

Conference Paper

Author:

Frederick S. Richardson

…

Joseph P. Campbell Jr

Published in:

INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, 26-30 September 2010, pp. 2102-2105.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Transcript-dependent speaker-recognition experiments are performed with the Mixer 1 and 2 read-transcription corpus using the Lincoln Laboratory speaker recognition system. Our analysis shows how widely speaker-recognition performance can vary on transcript-dependent data compared to conversational data of the same durations, given enrollment data from the same spontaneous conversational speech. We also describe the techniques used to handle the unaudited data in order to create 171 male and 198 female text-dependent experiments from the Mixer 1 and 2 read-transcription corpus.

Multi-pitch estimation by a joint 2-D representation of pitch and pitch dynamics

September 26, 2010

Conference Paper

Author:

Tianyu Tom Wang

…

Thomas F. Quatieri

Published in:

INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, 26-30 September 2010, pp. 645-648.

Topic:

speech enhancement

R&D area:

Cyber Security and Information Sciences

R&D group:

Summary

Multi-pitch estimation of co-channel speech is especially challenging when the underlying pitch tracks are close in pitch value (e.g., when pitch tracks cross). Building on our previous work, we demonstrate the utility of a two-dimensional (2-D) analysis method of speech for this problem by exploiting its joint representation of pitch and pitch-derivative information from distinct speakers. Specifically, we propose a novel multi-pitch estimation method consisting of 1) a data-driven classifier for pitch candidate selection, 2) local pitch and pitch-derivative estimation by k-means clustering, and 3) a Kalman filtering mechanism for pitch tracking and assignment. We evaluate our method on a database of all-voiced speech mixtures and illustrate its capability to estimate pitch tracks in cases where pitch tracks are separate and when they are close in pitch value (e.g., at crossings).
