Publications

Analyzing and interpreting automatically learned rules across dialects

Published in:
INTERSPEECH 2012: 13th Annual Conf. of the Int. Speech Communication Assoc., 9-13 September 2012.

Summary

In this paper, we demonstrate how informative dialect recognition systems such as the acoustic pronunciation model (APM) help speech scientists locate and analyze phonetic rules efficiently. In particular, we analyze dialect-specific characteristics automatically learned by the APM across two American English dialects. We show that unsupervised rule retrieval performs similarly to supervised retrieval, indicating that the APM is useful in practical applications, where word transcripts are often unavailable. We also demonstrate that the top-ranking rules learned by the APM generally correspond to those in the linguistic literature, and can even pinpoint potential research directions for refining existing knowledge. Thus, the APM system can help phoneticians analyze rules efficiently by characterizing large amounts of data to postulate rule candidates, freeing them to conduct more targeted investigations. Potential applications of informative dialect recognition systems include forensic phonetics and the diagnosis of spoken language disorders.
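
The claim that top-ranking rules generally correspond to the linguistic literature suggests a simple retrieval-style check. The sketch below, with invented rule names that have no connection to the paper's actual rule inventory, shows one way such agreement could be measured as precision over the top-ranked rules.

```python
# Invented rule names, purely to illustrate checking automatically retrieved
# rules against rules documented in the dialectology literature.
retrieved_rules = ["t -> dx / V _ V", "r -> 0 / _ #", "ae -> eh / _ n", "l -> w / _ #"]
literature_rules = {"t -> dx / V _ V", "r -> 0 / _ #", "ay -> aa / _ C"}

def precision_at_k(retrieved, reference, k):
    """Fraction of the top-k retrieved rules that match the reference set."""
    return sum(rule in reference for rule in retrieved[:k]) / k

for k in (2, 4):
    print(f"precision@{k} = {precision_at_k(retrieved_rules, literature_rules, k):.2f}")
```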

Assessing the speaker recognition performance of naive listeners using Mechanical Turk

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 22-27 May 2011, pp. 5916-5919.

Summary

In this paper, we attempt to quantify the ability of naive listeners to perform speaker recognition in the context of the NIST evaluation task. We describe our protocol: a series of listening experiments on Amazon's Mechanical Turk, using a large number of naive listeners (432), designed to measure how well the average human listener can perform speaker recognition. Our goal was to compare the performance of the average human listener to both forensic experts and state-of-the-art automatic systems. We show that naive listeners vary substantially in their performance, but that aggregating listener responses can achieve performance similar to that of expert forensic examiners.
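
The aggregation of listener responses is not detailed in this summary; the sketch below illustrates one plausible scheme, a simple majority vote over per-trial same-speaker judgments, using invented trial data.

```python
from collections import Counter

# Invented example: each trial gets a list of binary judgments from naive
# listeners (1 = "same speaker", 0 = "different speakers").
trials = {
    "trial_01": [1, 1, 0, 1, 1, 1, 0, 1],
    "trial_02": [0, 0, 1, 0, 0, 0, 0, 1],
}

def majority_vote(judgments):
    """Aggregate individual listener decisions into a single decision."""
    votes = Counter(judgments)
    return 1 if votes[1] > votes[0] else 0

for trial_id, judgments in trials.items():
    decision = majority_vote(judgments)
    agreement = judgments.count(decision) / len(judgments)
    print(f"{trial_id}: aggregated = {'same' if decision else 'different'} "
          f"(agreement {agreement:.0%})")
```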

Informative dialect recognition using context-dependent pronunciation modeling

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 22-27 May 2011, pp. 4396-4399.

Summary

We propose an informative dialect recognition system that learns phonetic transformation rules and uses them to identify dialects. A hidden Markov model is used to align reference phones with dialect-specific pronunciations to characterize when and how often substitutions, insertions, and deletions occur. Decision tree clustering is used to find context-dependent phonetic rules. We ran recognition tasks on four Arabic dialects. Not only do the proposed systems perform well on their own, but when fused with baselines they improve performance by 21-36% relative. In addition, our proposed decision-tree system beats the baseline monophone system at recovering phonetic rules by 21% relative. The pronunciation rules learned by our proposed system quantify the occurrence frequency of known rules and suggest rule candidates for further linguistic study.
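
The paper aligns reference phones to dialect-specific pronunciations with a hidden Markov model; as a rough stand-in, the sketch below uses a plain edit-distance alignment to show how substitution, insertion, and deletion statistics can be tallied from such an alignment. The phone sequences are invented.

```python
def align_counts(ref, hyp):
    """Levenshtein-style alignment between a reference phone sequence and an
    observed pronunciation, returning counts of substitutions, insertions,
    and deletions (a simplified stand-in for the paper's HMM alignment)."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = (cost, (subs, ins, dels)) for aligning ref[:i] with hyp[:j]
    dp = [[(0, (0, 0, 0))] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = (i, (0, 0, i))              # delete all of ref[:i]
    for j in range(1, m + 1):
        dp[0][j] = (j, (0, j, 0))              # insert all of hyp[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            cands = [
                (dp[i - 1][j - 1][0] + sub,
                 (dp[i - 1][j - 1][1][0] + sub, dp[i - 1][j - 1][1][1], dp[i - 1][j - 1][1][2])),
                (dp[i][j - 1][0] + 1,
                 (dp[i][j - 1][1][0], dp[i][j - 1][1][1] + 1, dp[i][j - 1][1][2])),
                (dp[i - 1][j][0] + 1,
                 (dp[i - 1][j][1][0], dp[i - 1][j][1][1], dp[i - 1][j][1][2] + 1)),
            ]
            dp[i][j] = min(cands)
    subs, ins, dels = dp[n][m][1]
    return {"substitutions": subs, "insertions": ins, "deletions": dels}

# Invented example: canonical phones vs. a dialect-specific realization.
print(align_counts(["w", "ao", "t", "er"], ["w", "aa", "dx", "er"]))
```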

USSS-MITLL 2010 human assisted speaker recognition

Summary

The United States Secret Service (USSS) teamed with MIT Lincoln Laboratory (MIT/LL) in the US National Institute of Standards and Technology's 2010 Speaker Recognition Evaluation of Human-Assisted Speaker Recognition (HASR). We describe our qualitative and automatic speaker comparison processes, as well as our fusion of the two, which are adapted from USSS casework. The USSS-MIT/LL 2010 HASR results are presented, along with post-evaluation results. The results are encouraging within the resolving power of the evaluation, which was limited to keep the required human effort reasonable. Future ideas and efforts are discussed, including new features and ways to capitalize on naive listeners.
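
How the qualitative and automatic processes were fused is not specified in this summary; the snippet below sketches only the generic idea of score-level fusion with a weighted sum, using invented scores and weights.

```python
def fuse(examiner_score, automatic_score, w_examiner=0.5, w_automatic=0.5):
    """Weighted score-level fusion of a human comparison judgment and an
    automatic system score (weights and scores here are purely illustrative)."""
    return w_examiner * examiner_score + w_automatic * automatic_score

# Decide "same speaker" when the fused score clears a calibration threshold.
fused = fuse(examiner_score=1.2, automatic_score=-0.4)
print("same speaker" if fused > 0.0 else "different speakers", f"(score {fused:+.2f})")
```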

Using United States government language proficiency standards for MT evaluation

Published in:
Chapter 5.3.3 in Handbook of Natural Language Processing and Machine Translation, 2011, pp. 775-782.

Summary

The purpose of this section is to discuss a method of measuring the degree to which the essential meaning of the original text is communicated in the MT output. We view this test as a measurement of the fundamental goal of MT: to convey information accurately from one language to another. We conducted a series of experiments in which educated native readers of English answered test questions about translated versions of texts originally written in Arabic and Chinese. We compared the results for subjects using machine translations of the texts with those for subjects using professional reference translations. These comparisons serve as a baseline for determining the level of foreign-language reading comprehension that a native English reader can achieve by relying on machine translation technology. They also allow us to explore the relationship between the current, broadly accepted automatic measures of machine translation performance and a test derived from the Defense Language Proficiency Test, which is used throughout the Defense Department to measure foreign language proficiency. Our goal is to put MT system performance evaluation into terms that are meaningful to US government consumers of MT output.
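
The numbers below are invented and serve only to show the arithmetic of the comparison described above: comprehension accuracy under each condition, and MT comprehension expressed relative to the reference-translation baseline.

```python
# Invented counts of correctly answered test questions under each condition.
results = {
    "reference_translation": {"correct": 172, "questions": 200},
    "machine_translation":   {"correct": 138, "questions": 200},
}

accuracy = {cond: r["correct"] / r["questions"] for cond, r in results.items()}
for cond, acc in accuracy.items():
    print(f"{cond}: {acc:.1%} of questions answered correctly")

relative = accuracy["machine_translation"] / accuracy["reference_translation"]
print(f"MT readers reach {relative:.0%} of the comprehension achieved with reference translations")
```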

The MIT-LL/AFRL IWSLT-2010 MT system

Published in:
Proc. Int. Workshop on Spoken Language Translation, IWSLT, 2 December 2010.

Summary

This paper describes the MIT-LL/AFRL statistical MT system and the improvements that were developed during the IWSLT 2010 evaluation campaign. As part of these efforts, we experimented with a number of extensions to the standard phrase-based model that improve performance on the Arabic-to-English and Turkish-to-English translation tasks. We also participated in the new French-to-English BTEC and English-to-French TALK tasks. We discuss the architecture of the MIT-LL/AFRL MT system, improvements over our 2008 system, and experiments we ran during the IWSLT-2010 evaluation. Specifically, we focus on 1) cross-domain translation using MAP adaptation, 2) Turkish morphological processing and translation, 3) improved Arabic morphology for MT preprocessing, and 4) system combination methods for machine translation.
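
MAP adaptation for cross-domain translation is only named here; the sketch below follows a common count-based formulation for adapting phrase translation probabilities toward in-domain data, which may differ from the paper's exact formulation. The counts and phrases are invented.

```python
# MAP-style adaptation of phrase translation probabilities: combine in-domain
# counts with an out-of-domain model via a prior weight tau.
def map_adapt(c_in_pair, c_in_src, p_out, tau=10.0):
    """p_map(e|f) = (c_in(f,e) + tau * p_out(e|f)) / (c_in(f) + tau)."""
    return (c_in_pair + tau * p_out) / (c_in_src + tau)

# In-domain counts for one source phrase f and two candidate translations.
c_in_src = 8
candidates = {
    "translation_a": {"c_in": 6, "p_out": 0.2},
    "translation_b": {"c_in": 2, "p_out": 0.7},
}

for e, stats in candidates.items():
    p = map_adapt(stats["c_in"], c_in_src, stats["p_out"])
    print(f"p_map({e} | f) = {p:.3f}")
```

With a large tau the adapted model stays close to the out-of-domain estimate; with a small tau the sparse in-domain counts dominate.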

A linguistically-informative approach to dialect recognition using dialect-discriminating context-dependent phonetic models

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 15 March 2010, pp. 5014-5017.

Summary

We propose supervised and unsupervised learning algorithms to extract dialect-discriminating phonetic rules and use these rules to adapt biphones to identify dialects. Despite many challenges (e.g., sub-dialect issues and no word transcriptions), we discovered dialect-discriminating biphones compatible with the linguistic literature, while outperforming a baseline monophone system by 7.5% (relative). Our proposed dialect-discriminating biphone system achieves performance similar to a baseline all-biphone system despite using 25% fewer biphone models. In addition, our system complements PRLM (Phone Recognition followed by Language Modeling), verified by relative gains of 15-29% when fused with PRLM. Our work is an encouraging first step towards a linguistically-informative dialect recognition system, with potential applications in forensic phonetics, accent training, and language learning.
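
The summary does not state how dialect-discriminating biphones are chosen; the sketch below illustrates one plausible selection criterion, ranking biphones by the absolute log-likelihood ratio of their relative frequencies in two dialects and keeping only the most discriminating ones. The biphone labels and counts are invented.

```python
import math

# Invented per-biphone occurrence counts in two dialects.
biphone_counts = {
    "aa+r":  {"dialect_A": 300, "dialect_B": 120},
    "t+ah":  {"dialect_A": 210, "dialect_B": 205},
    "ih+ng": {"dialect_A": 95,  "dialect_B": 260},
    "s+t":   {"dialect_A": 400, "dialect_B": 390},
}

total_a = sum(c["dialect_A"] for c in biphone_counts.values())
total_b = sum(c["dialect_B"] for c in biphone_counts.values())

def llr(counts, smoothing=0.5):
    """Log-likelihood ratio of a biphone's relative frequency across dialects."""
    p_a = (counts["dialect_A"] + smoothing) / total_a
    p_b = (counts["dialect_B"] + smoothing) / total_b
    return math.log(p_a / p_b)

# Keep the most dialect-discriminating biphones (here, the top 75%), mirroring
# the idea of modeling fewer biphones without losing recognition accuracy.
keep = int(0.75 * len(biphone_counts))
selected = sorted(biphone_counts, key=lambda b: abs(llr(biphone_counts[b])), reverse=True)[:keep]
print("selected biphones:", selected)
```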

Query-by-example spoken term detection using phonetic posteriorgram templates

Published in:
Proc. IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU, 13-17 December 2009, pp. 421-426.

Summary

This paper examines a query-by-example approach to spoken term detection in audio files. The approach is designed for low-resource situations in which limited or no in-domain training material is available and accurate word-based speech recognition capability is unavailable. Instead of using word or phone strings as search terms, the user presents the system with audio snippets of desired search terms to act as the queries. Query and test materials are represented using phonetic posteriorgrams obtained from a phonetic recognition system. Query matches in the test data are located using a modified dynamic time warping search between query templates and test utterances. Experiments using this approach are presented using data from the Fisher corpus.
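
As a rough sketch of the matching step, the code below runs a basic dynamic time warping search between a query posteriorgram and a test posteriorgram, using the negative log inner product as the local distance. It anchors the query at both ends of the test segment, whereas the paper's modified search locates the query anywhere within a test utterance; the tiny posteriorgrams are invented.

```python
import numpy as np

def posteriorgram_dtw(query, test):
    """DTW between a query posteriorgram (frames x phone classes) and a test
    posteriorgram, with -log inner product as the local frame distance."""
    eps = 1e-6
    Q, T = query.shape[0], test.shape[0]
    local = -np.log(np.clip(query @ test.T, eps, None))   # (Q, T) local distances
    acc = np.full((Q, T), np.inf)
    acc[0, 0] = local[0, 0]
    for i in range(Q):
        for j in range(T):
            if i == 0 and j == 0:
                continue
            best_prev = min(
                acc[i - 1, j] if i > 0 else np.inf,
                acc[i, j - 1] if j > 0 else np.inf,
                acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
            )
            acc[i, j] = local[i, j] + best_prev
    # Normalize by a path-length proxy so scores are comparable across lengths.
    return acc[-1, -1] / (Q + T)

# Tiny invented posteriorgrams over 3 phone classes (rows sum to 1).
query = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]])
test  = np.array([[0.7, 0.2, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7]])
print(f"DTW distance: {posteriorgram_dtw(query, test):.3f}")
```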

The MIT-LL/AFRL IWSLT-2008 MT System

Published in:
Proc. Int. Workshop on Spoken Language Translation, IWSLT, 1-2 December 2009.

Summary

This paper describes the MIT-LL/AFRL statistical MT system and the improvements that were developed during the IWSLT 2008 evaluation campaign. As part of these efforts, we experimented with a number of extensions to the standard phrase-based model that improve performance for both text and speech-based translation on Chinese and Arabic translation tasks. We discuss the architecture of the MIT-LL/AFRL MT system, improvements over our 2007 system, and experiments we ran during the IWSLT-2008 evaluation. Specifically, we focus on 1) novel segmentation models for phrase-based MT, 2) improved lattice and confusion network decoding of speech input, 3) improved Arabic morphology for MT preprocessing, and 4) system combination methods for machine translation.
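
Confusion network decoding is mentioned only in passing; the toy snippet below shows the basic consensus idea of keeping the most probable word (or skip arc) in each slot. The network and probabilities are invented, and a real MT front end would pass the alternatives on to the translation decoder rather than committing to a single path.

```python
# A toy confusion network: each slot holds alternative words (or a skip arc)
# with posterior probabilities from the speech recognizer.
confusion_network = [
    {"i": 0.9, "eye": 0.1},
    {"want": 0.6, "won't": 0.3, "*DELETE*": 0.1},
    {"tea": 0.7, "t": 0.3},
]

best_path = []
for slot in confusion_network:
    word = max(slot, key=slot.get)
    if word != "*DELETE*":        # skip arc: emit nothing for this slot
        best_path.append(word)

print(" ".join(best_path))
```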

A comparison of query-by-example methods for spoken term detection

Published in:
INTERSPEECH 2009, 6-10 September 2009.

Summary

In this paper we examine an alternative interface for phonetic search, namely query-by-example, that avoids the OOV issues associated with both standard word-based and phonetic search methods. We develop three methods that compare query lattices derived from example audio against a standard n-gram-based phonetic index, and we analyze the factors affecting the performance of these systems. We show that the best systems under this paradigm achieve 77% precision when retrieving utterances from conversational telephone speech and returning 10 results from a single query (better than a similar dictionary-based approach), suggesting significant utility for applications requiring high precision. We also show that these systems can be further improved using relevance feedback: by incorporating four additional queries, the precision of the best system can be improved by 13.7% relative. Our systems perform well despite high phone recognition error rates (>40%) and use no pronunciation or letter-to-sound resources.
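
The paper matches query lattices against an n-gram phonetic index; the sketch below shows only the simplest version of that idea, scoring indexed utterances by phone trigram overlap with a 1-best phone decoding of the query. The phone strings are invented.

```python
from collections import Counter

def phone_ngrams(phones, n=3):
    """Phone n-grams used as index and query terms."""
    return Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))

# Invented phone strings standing in for 1-best phonetic recognition output.
index = {
    "utt_001": ["s", "ih", "t", "ax", "d", "eh", "l"],
    "utt_002": ["b", "aa", "s", "t", "ax", "n"],
}
query = ["s", "t", "ax", "n"]   # an example audio query, already decoded to phones

q_grams = phone_ngrams(query)
scores = {utt: sum((phone_ngrams(phones) & q_grams).values())   # shared n-gram mass
          for utt, phones in index.items()}

# Return utterances ranked by n-gram overlap with the query.
for utt, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(utt, score)
```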