Publications

I-vector speaker and language recognition system on Android

September 13, 2016

Conference Paper

Author:

Christian Vazquez-Machado

…

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

language recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

I-Vector based speaker and language identification provides state-of-the-art performance. However, this comes as a more computationally complex solution, which can often lead to challenges on resource-limited devices, such as phones or tablets. We present the implementation of an I-Vector speaker and language recognition system on the Android platform in the form of a fully functional application that allows speaker enrollment and language/speaker scoring within mobile contexts. We include a detailed account of the challenges in porting the system and its dependencies, which were necessary to optimize matrix operations in the I-Vector implementation. The system was benchmarked on a Google Nexus 6, showing a speed increase of 61.68% in scoring and 82.63% in enrollment operations with the implemented optimizations. The application was tested in mobile settings on a Nexus 7 tablet with forty participants, showing an accuracy of roughly 84%. The optimized platform showed the capacity to perform near real-time recognition within a mobile setting and showcases the viability of I-Vector systems in resource-limited environments.

Enhancing HPC security with a user-based firewall

September 13, 2016

Conference Paper

Author:

Andrew J. Prout

…

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

High Performance Computing (HPC) systems traditionally allow their users unrestricted use of their internal network. While this network is normally controlled enough to guarantee privacy without the need for encryption, it does not provide a method to authenticate peer connections. Protocols built upon this internal network, such as those used in MPI, Lustre, Hadoop, or Accumulo, must provide their own authentication at the application layer. Many methods have been employed to perform this authentication, such as operating system privileged ports, Kerberos, munge, TLS, and PKI certificates. However, support for all of these methods requires the HPC application developer to include support and the user to configure and enable these services. The user-based firewall capability we have prototyped enables a set of rules governing connections across the HPC internal network to be put into place using Linux netfilter. By using an operating system-level capability, the system is not reliant on any developer or user actions to enable security. The rules we have chosen and implemented are crafted to not impact the vast majority of users and be completely invisible to them. Additionally, we have measured the performance impact of this system under various workloads.

Benchmarking the Graphulo processing framework

September 13, 2016

Conference Paper

Author:

Vijay N. Gadepally

…

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

Graph algorithms have wide applicability to a variety of domains and are often used on massive datasets. Recent standardization efforts such as the GraphBLAS are designed to specify a set of key computational kernels that hardware and software developers can adhere to. Graphulo is a processing framework that enables GraphBLAS kernels in the Apache Accumulo database. In our previous work, we have demonstrated a core Graphulo operation called TableMult that performs large-scale multiplication of database tables. In this article, we present results of scaling the Graphulo engine to larger problems and its scalability when using a greater number of resources. Specifically, we present the results of two experiments that demonstrate Graphulo scaling performance as linear with the number of available resources. The first experiment demonstrates cluster processing rates through Graphulo's TableMult operator on two large graphs, scaled between 2^17 and 2^19 vertices. The second experiment uses TableMult to extract a random set of rows from a large graph (2^19 nodes) to simulate a cued graph analytic. These benchmarking results are of relevance to Graphulo users who wish to apply Graphulo to their graph problems.
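To make the idea of multiplying database tables concrete, here is a minimal Python sketch of a sparse matrix product over tables stored as (row, column) to value entries. It is an illustration only: the function name `table_mult` and the dict-based table layout are assumptions made for this example, not Graphulo's API, and nothing here runs inside Accumulo.

```python
from collections import defaultdict


def table_mult(a, b):
    """Multiply two sparse tables stored as {(row, col): value} dicts.

    Hypothetical helper for illustration; not part of Graphulo.
    """
    # Index B by row so each entry of A only meets the matching entries of B.
    b_by_row = defaultdict(list)
    for (k, j), value in b.items():
        b_by_row[k].append((j, value))

    product = defaultdict(int)
    for (i, k), a_value in a.items():
        for j, b_value in b_by_row.get(k, []):
            # Accumulate partial products that land in the same output cell.
            product[(i, j)] += a_value * b_value
    return dict(product)


# Adjacency table of a small directed graph: v1 -> v2, v1 -> v3, v2 -> v3.
adjacency = {("v1", "v2"): 1, ("v1", "v3"): 1, ("v2", "v3"): 1}

# Squaring the adjacency table counts two-hop paths between vertices.
two_hop = table_mult(adjacency, adjacency)
print(two_hop)
```

Only non-zero entries are stored and visited, which is what makes this style of multiplication practical for large, sparse graphs.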

Designing a new high performance computing education strategy for professional scientists and engineers

September 13, 2016

Conference Paper

Author:

Julia Mullen

…

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

For decades the High Performance Computing (HPC) community has used web content, workshops and embedded HPC scientists to enable practitioners to harness the power of parallel and distributed computing. The most successful approaches, face-to-face tutorials and embedded professionals, don't scale. To create scalable, flexible, educational experiences for practitioners in all phases of a career, from student to professional, we turn to Massive Open Online Courses (MOOCs). We detail the conversion of personalized tutorials to a self-paced online course. In this demonstration, we highlight a course that mimics in-person tutorials by providing personalized paths through content that interleaves theory and practice, to help researchers learn key parallel computing concepts while developing familiarity with their HPC target system.

From NoSQL Accumulo to NewSQL Graphulo: design and utility of graph algorithms inside a BigTable database

September 13, 2016

Conference Paper

Author:

Dylan D. Hutchison

…

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

algorithms

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

Google BigTable's scale-out design for distributed key-value storage inspired a generation of NoSQL databases. Recently the NewSQL paradigm emerged in response to analytic workloads that demand distributed computation local to data storage. Many such analytics take the form of graph algorithms, a trend that motivated the GraphBLAS initiative to standardize a set of matrix math kernels for building graph algorithms. In this article we show how it is possible to implement the GraphBLAS kernels in a BigTable database by presenting the design of Graphulo, a library for executing graph algorithms inside the Apache Accumulo database. We detail the Graphulo implementation of two graph algorithms and conduct experiments comparing their performance to two main-memory matrix math systems. Our results offer insight into the conditions that determine when executing a graph algorithm is faster inside a database versus an external system; in short, memory requirements and relative I/O are critical factors.

Sparse-coded net model and applications

September 13, 2016

Conference Paper

Author:

Youngjune L. Gwon

…

Published in:

2016 IEEE Int. Workshop on Machine Learning for Signal Processing, 13-16 September 2016.

Topic:

artificial intelligence

R&D area:

Cyber Security and Information Sciences

Summary

As an unsupervised learning method, sparse coding can discover high-level representations for an input in a large variety of learning problems. Under semi-supervised settings, sparse coding is used to extract features for a supervised task such as classification. While sparse representations learned from unlabeled data independently of the supervised task perform well, we argue that sparse coding should also be built as a holistic learning unit optimizing on the supervised task objectives more explicitly. In this paper, we propose sparse-coded net, a feedforward model that integrates sparse coding and task-driven output layers, and describe training methods in detail. After pretraining a sparse-coded net via semi-supervised learning, we optimize its task-specific performance in a novel backpropagation algorithm that can traverse nonlinear feature pooling operators to update the dictionary. Thus, sparse-coded net can be applied to supervised dictionary learning. We evaluate sparse-coded net with classification problems in sound, image, and text data. The results confirm a significant improvement over semi-supervised learning as well as superior classification performance against deep stacked autoencoder neural network and GMM-SVM pipelines in small to medium-scale settings.

Julia implementation of the Dynamic Distributed Dimensional Data Model

September 13, 2016

Conference Paper

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

Julia is a new language for writing data analysis programs that are easy to implement and run at high performance. Similarly, the Dynamic Distributed Dimensional Data Model (D4M) aims to clarify data analysis operations while retaining strong performance. D4M accomplishes these goals through a composable, unified data model on associative arrays. In this work, we present an implementation of D4M in Julia and describe how it enables and facilitates data analysis. Several experiments showcase scalable performance in our new Julia version as compared to the original Matlab implementation.

Benchmarking SciDB data import on HPC systems

September 13, 2016

Conference Paper

Author:

Siddharth S. Samsi

…

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

high performance computing

Summary

SciDB is a scalable, computational database management system that uses an array model for data storage. The array data model of SciDB makes it ideally suited for storing and managing large amounts of imaging data. SciDB is designed to support advanced in-database analytics, thus reducing the need for extracting data for analysis. It is designed to be massively parallel and can run on commodity hardware in a high performance computing (HPC) environment. In this paper, we present the performance of SciDB using simulated image data. The Dynamic Distributed Dimensional Data Model (D4M) software is used to implement the benchmark on a cluster running the MIT SuperCloud software stack. A peak performance of 2.2M database inserts per second was achieved on a single node of this system. We also show that SciDB and the D4M toolbox provide more efficient ways to access random sub-volumes of massive datasets compared to the traditional approaches of reading volumetric data from individual files. This work describes the D4M and SciDB tools we developed and presents the initial performance results. This performance was achieved by using parallel inserts and in-database merging of arrays, as well as supercomputing techniques such as distributed arrays and single-program-multiple-data programming.

Relation of automatically extracted formant trajectories with intelligibility loss and speaking rate decline in amyotrophic lateral sclerosis

September 8, 2016

Conference Paper

Author:

Rachelle Horwitz-Martin

…

Published in:

INTERSPEECH 2016: 16th Annual Conf. of the Int. Speech Communication Assoc., 8-12 September 2016.

Topic:

biometrics

R&D area:

Cyber Security and Information Sciences

Summary

Effective monitoring of bulbar disease progression in persons with amyotrophic lateral sclerosis (ALS) requires rapid, objective, automatic assessment of speech loss. The purpose of this work was to identify acoustic features that aid in predicting intelligibility loss and speaking rate decline in individuals with ALS. Features were derived from statistics of the first (F1) and second (F2) formant frequency trajectories and their first and second derivatives. Motivated by a possible link between components of formant dynamics and specific articulator movements, these features were also computed for low-pass and high-pass filtered formant trajectories. When compared to clinician-rated intelligibility and speaking rate assessments, F2 features, particularly mean F2 speed and a novel feature, mean F2 acceleration, were most strongly correlated with intelligibility and speaking rate, respectively (Spearman correlations > 0.70, p < 0.0001). These features also yielded the best predictions in regression experiments (r > 0.60, p < 0.0001). Comparable results were achieved using low-pass filtered F2 trajectory features, with higher correlations and lower prediction errors achieved for speaking rate over intelligibility. These findings suggest information can be exploited in specific frequency components of formant trajectories, with implications for automatic monitoring of ALS.
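As a rough illustration of features built from a formant trajectory and its first and second derivatives, the Python sketch below computes a mean speed and a mean acceleration by finite differences. The abstract does not define these features precisely, so taking the mean absolute value of each derivative is an assumption, as are the function name `mean_abs_derivatives`, the 10 ms frame step, and the made-up F2 values.

```python
def mean_abs_derivatives(trajectory_hz, frame_step_s):
    """Return (mean |first derivative|, mean |second derivative|) of a trajectory.

    Assumed feature definitions for illustration; not taken from the paper.
    """
    if len(trajectory_hz) < 3:
        raise ValueError("need at least three frames for a second derivative")

    # First derivative by forward differences, in Hz per second.
    velocity = [
        (later - earlier) / frame_step_s
        for earlier, later in zip(trajectory_hz, trajectory_hz[1:])
    ]
    # Second derivative by differencing the velocity, in Hz per second squared.
    acceleration = [
        (later - earlier) / frame_step_s
        for earlier, later in zip(velocity, velocity[1:])
    ]

    mean_speed = sum(abs(v) for v in velocity) / len(velocity)
    mean_accel = sum(abs(a) for a in acceleration) / len(acceleration)
    return mean_speed, mean_accel


# Made-up F2 values (Hz), one per 10 ms frame.
f2_hz = [1000.0, 1100.0, 1300.0, 1600.0]
speed, accel = mean_abs_derivatives(f2_hz, frame_step_s=0.01)
print(speed, accel)
```

The same function could be applied to a low-pass or high-pass filtered copy of the trajectory to obtain the filtered variants of each feature.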

Speaker linking and applications using non-parametric hashing methods

September 8, 2016

Conference Paper

Author:

Douglas E. Sturim

…

William M. Campbell

Published in:

INTERSPEECH 2016: 16th Annual Conf. of the Int. Speech Communication Assoc., 8-12 September 2016.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Large unstructured audio data sets have become ubiquitous and present a challenge for organization and search. One logical approach for structuring data is to find common speakers and link occurrences across different recordings. Prior approaches to this problem have focused on basic methodology for the linking task. In this paper, we introduce a novel trainable nonparametric hashing method for indexing large speaker recording data sets. This approach leads to tunable computational complexity methods for speaker linking. We focus on a scalable clustering method based on hashing canopy-clustering. We apply this method to a large corpus of speaker recordings, demonstrate performance tradeoffs, and compare to other hashing methods.
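For readers unfamiliar with canopy clustering, the following Python sketch shows the plain, textbook form of the technique using a loose and a tight distance threshold. It does not implement the trainable hashing method the paper introduces; the function names, the cosine distance, the thresholds, and the two-dimensional vectors standing in for speaker representations are all assumptions chosen for illustration. Vectors are assumed to be non-zero.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norms


def canopy_cluster(vectors, loose, tight):
    """Group vector indices into possibly overlapping canopies."""
    if tight > loose:
        raise ValueError("tight threshold must not exceed loose threshold")

    remaining = list(range(len(vectors)))
    canopies = []
    while remaining:
        center = remaining[0]
        dists = {i: cosine_distance(vectors[center], vectors[i]) for i in remaining}
        # Every remaining point within the loose threshold joins this canopy.
        canopies.append([i for i in remaining if i == center or dists[i] <= loose])
        # Points within the tight threshold leave the candidate set, so they
        # neither seed nor join later canopies.
        remaining = [i for i in remaining if i != center and dists[i] > tight]
    return canopies


# Made-up 2-D vectors standing in for per-recording speaker representations.
recordings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (0.7, 0.7)]
canopies = canopy_cluster(recordings, loose=0.3, tight=0.05)
print(canopies)
```

Raising the loose threshold produces larger, more overlapping canopies, while raising the tight threshold removes more points per pass and so reduces the number of distance computations; this is one simple way to trade accuracy against cost.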
