Publications
The MITLL/AFRL IWSLT-2014 MT System
Summary
Summary
This report summarizes the MITLL-AFRL MT and ASR systems and the experiments run using them during the 2014 IWSLT evaluation campaign. Our MT system is much improved over last year, owing to integration of techniques such as PRO and DREM optimization, factored language models, neural network joint model rescoring, multiple...
Comparing a high and low-level deep neural network implementation for automatic speech recognition
Summary
Summary
The use of deep neural networks (DNNs) has improved performance in several fields including computer vision, natural language processing, and automatic speech recognition (ASR). The increased use of DNNs in recent years has been largely due to performance afforded by GPUs, as the computational cost of training large networks on...
Finding good enough: a task-based evaluation of query biased summarization for cross language information retrieval
Summary
Summary
In this paper we present our task-based evaluation of query biased summarization for cross-language information retrieval (CLIR) using relevance prediction. We describe our 13 summarization methods each from one of four summarization strategies. We show how well our methods perform using Farsi text from the CLEF 2008 shared-task, which we...
Using deep belief networks for vector-based speaker recognition
Summary
Summary
Deep belief networks (DBNs) have become a successful approach for acoustic modeling in speech recognition. DBNs exhibit strong approximation properties, improved performance, and are parameter efficient. In this work, we propose methods for applying DBNs to speaker recognition. In contrast to prior work, our approach to DBNs for speaker recognition...
Talking Head Detection by Likelihood-Ratio Test(220.2 KB)
Summary
Summary
Detecting accurately when a person whose face is visible in an audio-visual medium is the audible speaker is an enabling technology with a number of useful applications. The likelihood-ratio test formulation and feature signal processing employed here allow the use of high-dimensional feature sets in the audio and visual domain...
Sparse matrix partitioning for parallel eigenanalysis of large static and dynamic graphs
Summary
Summary
Numerous applications focus on the analysis of entities and the connections between them, and such data are naturally represented as graphs. In particular, the detection of a small subset of vertices with anomalous coordinated connectivity is of broad interest, for problems such as detecting strange traffic in a computer network...
Content+context=classification: examining the roles of social interactions and linguist content in Twitter user classification
Summary
Summary
Twitter users demonstrate many characteristics via their online presence. Connections, community memberships, and communication patterns reveal both idiosyncratic and general properties of users. In addition, the content of tweets can be critical for distinguishing the role and importance of a user. In this work, we explore Twitter user classification using...
VizLinc: integrating information extraction, search, graph analysis, and geo-location for the visual exploration of large data sets
Summary
Summary
In this demo paper we introduce VizLinc; an open-source software suite that integrates automatic information extraction, search, graph analysis, and geo-location for interactive visualization and exploration of large data sets. VizLinc helps users in: 1) understanding the type of information the data set under study might contain, 2) finding patterns...
Exploiting morphological, grammatical, and semantic correlates for improved text difficulty assessment
Summary
Summary
We present a low-resource, language-independent system for text difficulty assessment. We replicate and improve upon a baseline by Shen et al. (2013) on the Interagency Language Roundtable (ILR) scale. Our work demonstrates that the addition of morphological, information theoretic, and language modeling features to a traditional readability baseline greatly benefits...
Audio-visual identity grounding for enabling cross media search
Summary
Summary
Automatically searching for media clips in large heterogeneous datasets is an inherently difficult challenge, and nearly impossibly so when searching across distinct media types (e.g. finding audio clips that match an image). In this paper we introduce the exploitation of identity grounding for enabling this cross media search and exploration...