Publications
Detecting pathogen exposure during the non-symptomatic incubation period using physiological data: proof of concept in non-human primates
Summary
Summary
Background and Objectives: Early warning of bacterial and viral infection, prior to the development of overt clinical symptoms, allows not only for improved patient care and outcomes but also enables faster implementation of public health measures (patient isolation and contact tracing). Our primary objectives in this effort are 3-fold. First...
GraphChallenge.org triangle counting performance [e-print]
Summary
Summary
The rise of graph analytic systems has created a need for new ways to measure and compare the capabilities of graph processing systems. The MIT/Amazon/IEEE Graph Challenge has been developed to provide a well-defined community venue for stimulating research and highlighting innovations in graph analysis software, hardware, algorithms, and systems...
GraphChallenge.org sparse deep neural network performance [e-print]
Summary
Summary
The MIT/IEEE/Amazon GraphChallenge.org encourages community approaches to developing new solutions for analyzing graphs and sparse data. Sparse AI analytics present unique scalability difficulties. The Sparse Deep Neural Network (DNN) Challenge draws upon prior challenges from machine learning, high performance computing, and visual analytics to create a challenge that is reflective...
75,000,000,000 streaming inserts/second using hierarchical hypersparse GraphBLAS matrices
Summary
Summary
The SuiteSparse GraphBLAS C-library implements high performance hypersparse matrices with bindings to a variety of languages (Python, Julia, and Matlab/Octave). GraphBLAS provides a lightweight in-memory database implementation of hypersparse matrices that are ideal for analyzing many types of network data, while providing rigorous mathematical guarantees, such as linearity. Streaming updates...
Cultivating professional technical skills and understanding through hands-on online learning experiences
Summary
Summary
Life-long learning is necessary for all professions because the technologies, tools and skills required for success over the course of a career expand and change. Professionals in science, technology, engineering and mathematics (STEM) fields face particular challenges as new multi-disciplinary methods, e.g. Machine Learning and Artificial Intelligence, mature to replace...
Large scale parallelization using file-based communications
Summary
Summary
In this paper, we present a novel and new file-based communication architecture using the local filesystem for large scale parallelization. This new approach eliminates the issues with filesystem overload and resource contention when using the central filesystem for large parallel jobs. The new approach incurs additional overhead due to inter-node...
Survey and benchmarking of machine learning accelerators
Summary
Summary
Advances in multicore processors and accelerators have opened the flood gates to greater exploration and application of machine learning techniques to a variety of applications. These advances, along with breakdowns of several trends including Moore's Law, have prompted an explosion of processors and accelerators that promise even greater computational and...
Streaming 1.9 billion hyperspace network updates per second with D4M
Summary
Summary
The Dynamic Distributed Dimensional Data Model (D4M) library implements associative arrays in a variety of languages (Python, Julia, and Matlab/Octave) and provides a lightweight in-memory database implementation of hypersparse arrays that are ideal for analyzing many types of network data. D4M relies on associative arrays which combine properties of spreadsheets...
AI enabling technologies: a survey
Summary
Summary
Artificial Intelligence (AI) has the opportunity to revolutionize the way the United States Department of Defense (DoD) and Intelligence Community (IC) address the challenges of evolving threats, data deluge, and rapid courses of action. Developing an end-to-end artificial intelligence system involves parallel development of different pieces that must work together...
A billion updates per second using 30,000 hierarchical in-memory D4M databases
Summary
Summary
Analyzing large scale networks requires high performance streaming updates of graph representations of these data. Associative arrays are mathematical objects combining properties of spreadsheets, databases, matrices, and graphs, and are well-suited for representing and analyzing streaming network data. The Dynamic Distributed Dimensional Data Model (D4M) library implements associative arrays in...