Publications
Speaker linking and applications using non-parametric hashing methods
Summary
Summary
Large unstructured audio data sets have become ubiquitous and present a challenge for organization and search. One logical approach for structuring data is to find common speakers and link occurrences across different recordings. Prior approaches to this problem have focused on basic methodology for the linking task. In this paper...
Speaker recognition using real vs synthetic parallel data for DNN channel compensation
Summary
Summary
Recent work has shown large performance gains using denoising DNNs for speech processing tasks under challenging acoustic conditions. However, training these DNNs requires large amounts of parallel multichannel speech data which can be impractical or expensive to collect. The effective use of synthetic parallel data as an alternative has been...
Language recognition via sparse coding
Summary
Summary
Spoken language recognition requires a series of signal processing steps and learning algorithms to model distinguishing characteristics of different languages. In this paper, we present a sparse discriminative feature learning framework for language recognition. We use sparse coding, an unsupervised method, to compute efficient representations for spectral features from a...
The AFRL-MITLL WMT16 news-translation task systems
Summary
Summary
This paper describes the AFRL-MITLL statistical machine translation systems and the improvements that were developed during the WMT16 evaluation campaign. New techniques applied this year include Neural Machine Translation, a unique selection process for language modelling data, additional out-of-vocabulary transliteration techniques, and morphology generation.
Matching community structure across online social networks
Summary
Summary
The discovery of community structure in networks is a problem of considerable interest in recent years. In online social networks, often times, users are simultaneously involved in multiple social media sites, some of which share common social relationships. It is of great interest to uncover a shared community structure across...
Cross-domain entity resolution in social media
Summary
Summary
The challenge of associating entities across multiple domains is a key problem in social media understanding. Successful cross-domain entity resolution provides integration of information from multiple sites to create a complete picture of user and community activities, characteristics, and trends. In this work, we examine the problem of entity resolution...
Channel compensation for speaker recognition using MAP adapted PLDA and denoising DNNs
Summary
Summary
Over several decades, speaker recognition performance has steadily improved for applications using telephone speech. A big part of this improvement has been the availability of large quantities of speaker-labeled data from telephone recordings. For new data applications, such as audio from room microphones, we would like to effectively use existing...
The MITLL NIST LRE 2015 Language Recognition System
Summary
Summary
In this paper we describe the most recent MIT Lincoln Laboratory language recognition system developed for the NIST 2015 Language Recognition Evaluation (LRE). The submission features a fusion of five core classifiers, with most systems developed in the context of an i-vector framework. The 2015 evaluation presented new paradigms. First...
A vocal modulation model with application to predicting depression severity
Summary
Summary
Speech provides a potential simple and noninvasive "on-body" means to identify and monitor neurological diseases. Here we develop a model for a class of vocal biomarkers exploiting modulations in speech, focusing on Major Depressive Disorder (MDD) as an application area. Two model components contribute to the envelope of the speech...
Operational assessment of keyword search on oral history
Summary
Summary
This project assesses the resources necessary to make oral history searchable by means of automatic speech recognition (ASR). There are many inherent challenges in applying ASR to conversational speech: smaller training set sizes and varying demographics, among others. We assess the impact of dataset size, word error rate and term-weighted...