Publications

How Deep Neural Networks Can Improve Emotion Recognition on Video Data (547.86 KB)

Date:
September 25, 2016
Published in:
Proceedings of 2016 IEEE International Conference on Image Processing (ICIP)
Type:
Conference Paper

Summary

There have been many impressive results obtained using deep learning for emotion recognition tasks in the last few years. In this work, we present a system that performs emotion recognition on video data using both convolutional neural networks (CNNs) and recurrent neural networks (RNNs).

The Offshore Precipitation Capability (1.48 MB)

Date:
September 16, 2016
Published in:
Project Report ATC-430, MIT Lincoln Laboratory
Type:
Project Report

Summary

The Offshore Precipitation Capability (OPC) uses machine learning and image processing methods to estimate radar-like precipitation intensity and echo top heights beyond the range of weather radar.

High-throughput Ingest of Data Provenance Records into Accumulo (349.93 KB)

Date:
September 13, 2016
Published in:
Proceedings of IEEE High Performance Extreme Computing Conference (HPEC '16)
Type:
Conference Paper

Summary

Whole-system data provenance provides deep insight into the processing of data on a system, including detecting data integrity attacks. The downside to systems that collect whole-system data provenance is the sheer volume of data that is generated under many heavy workloads. In this paper, we investigate the use of D4M and Accumulo to support high-throughput data ingest of whole-system provenance data.

I-Vector Speaker and Language Recognition System on Android

Date:
September 13, 2016
Published in:
Proceedings of IEEE High Performance Extreme Computing Conference (HPEC '16)
Type:
Conference Paper

Summary

I-Vector-based speaker and language identification provides state-of-the-art performance. However, it is computationally complex, which often poses challenges on resource-limited devices such as phones or tablets. We present the implementation of an I-Vector speaker and language recognition system on the Android platform in the form of a fully functional application that allows speaker enrollment and language/speaker scoring within mobile contexts.

Relation of Automatically Extracted Formant Trajectories with Intelligibility Loss and Speaking Rate Decline in Amyotrophic Lateral Sclerosis (906.23 KB)

Date:
September 8, 2016
Published in:
Proceedings of Interspeech 2016, San Francisco, Calif.
Type:
Conference Paper

Summary

Effective monitoring of bulbar disease progression in persons with amyotrophic lateral sclerosis (ALS) requires rapid, objective, automatic assessment of speech loss. The purpose of this work was to identify acoustic features that aid in predicting intelligibility loss and speaking rate decline in individuals with ALS.

Relating estimated cyclic spectral peak frequency to measured epilarynx length using Magnetic Resonance Imaging (272.05 KB)

Date:
September 8, 2016
Published in:
Proceedings of Interspeech 2016, San Francisco, Calif.
Type:
Conference Paper

Summary

The epilarynx plays an important role in speech production, carrying information about the individual speaker and manner of articulation. Recent spectral processing techniques isolate a unique resonance with characteristics of the epilarynx previously shown via simulation, specifically cyclicity. Using Magnetic Resonance Imaging (MRI), the present work relates this estimated cyclic peak frequency to measured epilarynx length.

Language Recognition via Sparse Coding (354.13 KB)

Date:
September 8, 2016
Published in:
Proceedings of Interspeech 2016, San Francisco, Calif.
Type:
Conference Paper

Summary

Spoken language recognition requires a series of signal processing steps and learning algorithms to model distinguishing characteristics of different languages. In this paper, we present a sparse discriminative feature learning framework for language recognition. We use sparse coding, an unsupervised method, to compute efficient representations for spectral features from a speech utterance while learning basis vectors for language models.

Speaker Recognition Using Real vs Synthetic Parallel Data for DNN Channel Compensation (891.97 KB)

Date:
September 8, 2016
Published in:
Proceedings of Interspeech 2016, San Francisco, Calif.
Type:
Conference Paper

Summary

Recently, there has been a great deal of interest in using deep neural networks (DNNs) for channel compensation under reverberant or noisy channel conditions, such as those found in microphone data. This paper compares the use of real and synthetic data for training denoising DNNs for multi-microphone speaker recognition.

The AFRL-MITLL WMT16 News-Translation Task Systems (375.46 KB)

Date:
August 16, 2016
Published in:
Proceedings of the 11th Workshop on Machine Translation (WMT’16)
Type:
Conference Paper

Summary

This paper describes the AFRL-MITLL statistical machine translation systems and the improvements that were developed during the WMT16 evaluation campaign. New techniques applied this year include Neural Machine Translation, a unique selection process for language modelling data, additional out-of-vocabulary transliteration techniques, and morphology generation.

Corpora for the Evaluation of Robust Speaker Recognition Systems (177.37 KB)

Date:
August 10, 2016
Published in:
Proceedings of Interspeech 2016, San Francisco, Calif.
Type:
Conference Paper

Summary

The goal of this paper is to describe significant corpora available to support speaker recognition research and evaluation, along with details about the corpora collection and design. We describe the attributes of high-quality speaker recognition corpora. Considerations of the application, domain, and performance metrics are also discussed.