Publications

Refine Results

(Filters Applied) Clear All

Discrimination between singing and speech in real-world audio

Published in:
SLT 2014, IEEE Spoken Language Technology Workshop, 7-10 December 2014.

Summary

The performance of a spoken language system suffers when non-speech is incorrectly classified as speech. Singing is particularly difficult to discriminate from speech, since both are natural language. However, singing conveys a melody, whereas speech does not; in particular, a singer's fundamental frequency should not deviate significantly from an underlying sequence of notes, while a speaker's fundamental frequency is freer to deviate about a mean value. The present work presents a novel approach to discrimination between singing and speech that exploits the distribution of such deviations. The melody in singing is typically non known a priori, so the distribution cannot be measured directly. Instead, an approximation to its Fourier transform is proposed that allows the unknown melody to be treated as multiplicative noise. This feature vector is shown to be highly discriminative between speech and singing segments when coupled with a simple maximum likelihood classifier, outperforming prior work on real-world data.
READ LESS

Summary

The performance of a spoken language system suffers when non-speech is incorrectly classified as speech. Singing is particularly difficult to discriminate from speech, since both are natural language. However, singing conveys a melody, whereas speech does not; in particular, a singer's fundamental frequency should not deviate significantly from an underlying...

READ MORE

The MITLL/AFRL IWSLT-2014 MT System

Summary

This report summarizes the MITLL-AFRL MT and ASR systems and the experiments run using them during the 2014 IWSLT evaluation campaign. Our MT system is much improved over last year, owing to integration of techniques such as PRO and DREM optimization, factored language models, neural network joint model rescoring, multiple phrase tables, and development set creation. We focused our efforts this year on the tasks of translating from Arabic, Russian, Chinese, and Farsi into English, as well as translating from English to French. ASR performance also improved, partly due to increased efforts with deep neural networks for hybrid and tandem systems. Work focused on both the English and Italian ASR tasks.
READ LESS

Summary

This report summarizes the MITLL-AFRL MT and ASR systems and the experiments run using them during the 2014 IWSLT evaluation campaign. Our MT system is much improved over last year, owing to integration of techniques such as PRO and DREM optimization, factored language models, neural network joint model rescoring, multiple...

READ MORE

Comparing a high and low-level deep neural network implementation for automatic speech recognition

Published in:
1st Workshop for High Performance Technical Computing in Dynamic Languages, HPTCDL 2014, 17 November 2014.

Summary

The use of deep neural networks (DNNs) has improved performance in several fields including computer vision, natural language processing, and automatic speech recognition (ASR). The increased use of DNNs in recent years has been largely due to performance afforded by GPUs, as the computational cost of training large networks on a CPU is prohibitive. Many training algorithms are well-suited to the GPU; however, writing hand-optimized GPGPU code is a significant undertaking. More recently, high-level libraries have attempted to simplify GPGPU development by automatically performing tasks such as optimization and code generation. This work utilizes Theano, a high-level Python library, to implement a DNN for the purpose of phone recognition in ASR. Performance is compared against a low-level, hand-optimized C++/CUDA DNN implementation from Kaldi, a popular ASR toolkit. Results show that the DNN implementation in Theano has CPU and GPU runtimes on par with that of Kaldi, while requiring approximately 95% less lines of code.
READ LESS

Summary

The use of deep neural networks (DNNs) has improved performance in several fields including computer vision, natural language processing, and automatic speech recognition (ASR). The increased use of DNNs in recent years has been largely due to performance afforded by GPUs, as the computational cost of training large networks on...

READ MORE

Visualization evaluation for cyber security: trends and future directions(1.22 MB)

Published in:
Proceedings of the Eleventh Workshop on Visualization for Cyber Security

Summary

The Visualization for Cyber Security research community (VizSec) addresses longstanding challenges in cyber security by adapting and evaluating information visualization techniques with application to the cyber security domain. In this paper, we survey and categorize the evaluation metrics, components, and techniques that have been utilized in the past decade of VizSec research literature.
READ LESS

Summary

The Visualization for Cyber Security research community (VizSec) addresses longstanding challenges in cyber security by adapting and evaluating information visualization techniques with application to the cyber security domain. In this paper, we survey and categorize the evaluation metrics, components, and techniques that have been utilized in the past decade of...

READ MORE

On the challenges of effective movement

Published in:
ACM Workshop on Moving Target Defense (MTD 2014), 3 November 2014.

Summary

Moving Target (MT) defenses have been proposed as a gamechanging approach to rebalance the security landscape in favor of the defender. MT techniques make systems less deterministic, less static, and less homogeneous in order to increase the level of effort required to achieve a successful compromise. However, a number of challenges in achieving effective movement lead to weaknesses in MT techniques that can often be used by the attackers to bypass or otherwise nullify the impact of that movement. In this paper, we propose that these challenges can be grouped into three main types: coverage, unpredictability, and timeliness. We provide a description of these challenges and study how they impact prominent MT techniques. We also discuss a number of other considerations faced when designing and deploying MT defenses.
READ LESS

Summary

Moving Target (MT) defenses have been proposed as a gamechanging approach to rebalance the security landscape in favor of the defender. MT techniques make systems less deterministic, less static, and less homogeneous in order to increase the level of effort required to achieve a successful compromise. However, a number of...

READ MORE

Information leaks without memory disclosures: remote side channel attacks on diversified code

Published in:
CCS 2014: Proc. of the ACM Conf. on Computer and Communications Security, 3-7 November 2014.

Summary

Code diversification has been proposed as a technique to mitigate code reuse attacks, which have recently become the predominant way for attackers to exploit memory corruption vulnerabilities. As code reuse attacks require detailed knowledge of where code is in memory, diversification techniques attempt to mitigate these attacks by randomizing what instructions are executed and where code is located in memory. As an attacker cannot read the diversified code, it is assumed he cannot reliably exploit the code. In this paper, we show that the fundamental assumption behind code diversity can be broken, as executing the code reveals information about the code. Thus, we can leak information without needing to read the code. We demonstrate how an attacker can utilize a memory corruption vulnerability to create side channels that leak information in novel ways, removing the need for a memory disclosure vulnerability. We introduce seven new classes of attacks that involve fault analysis and timing side channels, where each allows a remote attacker to learn how code has been diversified.
READ LESS

Summary

Code diversification has been proposed as a technique to mitigate code reuse attacks, which have recently become the predominant way for attackers to exploit memory corruption vulnerabilities. As code reuse attacks require detailed knowledge of where code is in memory, diversification techniques attempt to mitigate these attacks by randomizing what...

READ MORE

Spectral anomaly detection in very large graphs: Models, noise, and computational complexity(92.92 KB)

Published in:
Proceedings of Seminar 14461: High-performance Graph Algorithms and Applications in Computational Science, Wadern, Germany

Summary

Anomaly detection in massive networks has numerous theoretical and computational challenges, especially as the behavior to be detected becomes small in comparison to the larger network. This presentation focuses on recent results in three key technical areas, specifically geared toward spectral methods for detection.
READ LESS

Summary

Anomaly detection in massive networks has numerous theoretical and computational challenges, especially as the behavior to be detected becomes small in comparison to the larger network. This presentation focuses on recent results in three key technical areas, specifically geared toward spectral methods for detection.

READ MORE

Optical phased-array ladar

Published in:
Appl. Opt., Vol. 53, No. 31, 1 November 2014, pp. 7551-5.

Summary

We demonstrate a ladar with 0.5 m class range resolution obtained by integrating a continuous-wave optical phased-array transmitter with a Geiger-mode avalanche photodiode receiver array. In contrast with conventional ladar systems, an array of continuous-wave sources is used to effectively pulse illuminate a target by electro-optically steering far-field fringes. From the reference frame of a point in the far field, a steered fringe appears as a pulse. Range information is thus obtained by measuring the arrival time of a pulse return from a target to a receiver pixel. This ladar system offers a number of benefits, including broad spectral coverage, high efficiency, small size, power scalability, and versatility.
READ LESS

Summary

We demonstrate a ladar with 0.5 m class range resolution obtained by integrating a continuous-wave optical phased-array transmitter with a Geiger-mode avalanche photodiode receiver array. In contrast with conventional ladar systems, an array of continuous-wave sources is used to effectively pulse illuminate a target by electro-optically steering far-field fringes. From...

READ MORE

Finding good enough: a task-based evaluation of query biased summarization for cross language information retrieval

Published in:
EMNLP 2014, Proc. of Conf. on Empirical Methods in Natural Language Processing, 25-29 October, 2014, pp. 657-69.

Summary

In this paper we present our task-based evaluation of query biased summarization for cross-language information retrieval (CLIR) using relevance prediction. We describe our 13 summarization methods each from one of four summarization strategies. We show how well our methods perform using Farsi text from the CLEF 2008 shared-task, which we translated to English automatically. We report precision/recall/F1, accuracy and time-on-task. We found that different summarization methods perform optimally for different evaluation metrics, but overall query biased word clouds are the best summarization strategy. In our analysis, we demonstrate that using the ROUGE metric on our sentence-based summaries cannot make the same kinds of distinctions as our evaluation framework does. Finally, we present our recommendations for creating much-needed evaluation standards and databases.
READ LESS

Summary

In this paper we present our task-based evaluation of query biased summarization for cross-language information retrieval (CLIR) using relevance prediction. We describe our 13 summarization methods each from one of four summarization strategies. We show how well our methods perform using Farsi text from the CLEF 2008 shared-task, which we...

READ MORE

Bayesian discovery of threat networks

Published in:
IEEE Trans. Signal Process., Vol. 62, No. 20, 15 October 2014, pp. 5324-38.

Summary

A novel unified Bayesian framework for network detection is developed, under which a detection algorithm is derived based on random walks on graphs. The algorithm detects threat networks using partial observations of their activity, and is proved to be optimum in the Neyman-Pearson sense. The algorithm is defined by a graph, at least one observation, and a diffusion model for threat. A link to well-known spectral detection methods is provided, and the equivalence of the random walk and harmonic solutions to the Bayesian formulation is proven. A general diffusion model is introduced that utilizes spatio-temporal relationships between vertices, and is used for a specific space-time formulation that leads to significant performance improvements on coordinated covert networks. This performance is demonstrated using a new hybrid mixed-membership blockmodel introduced to simulate random covert networks with realistic properties.
READ LESS

Summary

A novel unified Bayesian framework for network detection is developed, under which a detection algorithm is derived based on random walks on graphs. The algorithm detects threat networks using partial observations of their activity, and is proved to be optimum in the Neyman-Pearson sense. The algorithm is defined by a...

READ MORE