Publications

Weather radar network benefit model for nontornadic thunderstorm wind casualty cost reduction

Published in:
Wea. Climate Soc., Vol. 12, No. 4, October 2020, pp. 789-804.

Summary

An econometric geospatial benefit model for nontornadic thunderstorm wind casualty reduction is developed for meteorological radar network planning. Regression analyses on 22 years (1998–2019) of storm event and warning data show, likely for the first time, a clear dependence of nontornadic severe thunderstorm warning performance on radar coverage. Furthermore, nontornadic thunderstorm wind casualty rates are observed to be negatively correlated with better warning performance. In combination, these statistical relationships form the basis of a cost model that can be differenced between radar network configurations to generate geospatial benefit density maps. This model, applied to the current contiguous U.S. weather radar network, yields a benefit estimate of $207 million (M) yr^-1 relative to no radar coverage at all. The remaining benefit pool with respect to enhanced radar coverage and scan update rate is about $36M yr^-1. Aggregating these nontornadic thunderstorm wind results with estimates from earlier tornado and flash flood cost reduction models yields a total benefit of $1.12 billion yr^-1 for the present-day radars and a remaining radar-based benefit pool of $778M yr^-1.

A multi-task LSTM framework for improved early sepsis prediction

Summary

Early detection of sepsis, a high-mortality clinical condition, is important for improving patient outcomes. The performance of conventional deep learning methods degrades quickly as predictions are made several hours prior to the clinical definition. We adopt recurrent neural networks (RNNs) to improve early prediction of the onset of sepsis using time series of physiological measurements. Furthermore, physiological data are often missing and imputation is necessary. Absence of data might arise from decisions made by clinical professionals, and so itself carries information. Incorporating the missing data patterns into the learning process can further guide how much trust to place in imputed values. A new multi-task LSTM model is proposed that takes informative missingness into account during training and effectively attributes trust to temporal measurements. Experimental results demonstrate that our method outperforms conventional CNN and LSTM models on the PhysioNet-2019 CiC early sepsis prediction challenge in terms of area under the receiver-operating curve and the precision-recall curve, and further improves the calibration of prediction scores.

Towards a distributed framework for multi-agent reinforcement learning research

Summary

Some of the most important publications in deep reinforcement learning over the last few years have been fueled by access to massive amounts of computation through large-scale distributed systems. The success of these approaches in achieving human-expert-level performance on several complex video-game environments has motivated further exploration into the limits of these approaches as computation increases. In this paper, we present a distributed RL training framework designed for supercomputing infrastructures such as the MIT SuperCloud. We review a collection of challenging learning environments, such as Google Research Football, StarCraft II, and Multi-Agent Mujoco, which are at the frontier of reinforcement learning research. We provide results on these environments that illustrate the current state of the field on these problems. Finally, we quantify and discuss the computational requirements for performing RL research by enumerating all experiments performed on these environments.

Leveraging linear algebra to count and enumerate simple subgraphs

Published in:
2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Summary

Even though subgraph counting and subgraph matching are well-known NP-hard problems, they are foundational building blocks for many scientific and commercial applications. To analyze graphs that contain millions to billions of edges, distributed systems can provide computational scalability through search parallelization. One recent approach for exposing graph algorithm parallelization is through a linear algebra formulation and the use of the matrix multiply operation, which is conceptually equivalent to a massively parallel graph traversal. This approach has several benefits, including 1) a mathematically rigorous foundation, and 2) the ability to leverage specialized linear algebra accelerators and high-performance libraries. In this paper, we explore and define a linear algebra methodology for performing exact subgraph counting and matching for 4-vertex subgraphs excluding the clique. Matches on these simple subgraphs can be joined as components for a larger subgraph. With thorough analysis, we demonstrate that the linear algebra formulation leverages path aggregation, which allows it to be 2x to 5x more efficient in traversing the search space and compressing the results as compared to tree-based subgraph matching techniques.
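As an illustration of how a matrix multiply aggregates paths, the following minimal sketch counts two simple patterns from the square of an adjacency matrix. It is not the paper's method: the helper names (`matmul`, `adjacency`, `count_two_paths`, `count_four_cycles`) are hypothetical, the matrices are dense Python lists rather than a sparse linear algebra library, and only an undirected simple graph is assumed.

```python
def matmul(a, b):
    """Dense integer matrix product; stands in for an optimized library call."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected simple graph."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1
    return a


def count_two_paths(a):
    """Count 3-vertex paths u-w-v (u != v).

    (A @ A)[u][v] is the number of common neighbors of u and v, so one
    multiply aggregates every 2-step traversal at once.
    """
    p = matmul(a, a)
    n = len(a)
    return sum(p[i][j] for i in range(n) for j in range(i + 1, n))


def count_four_cycles(a):
    """Count 4-cycles from the aggregated 2-path counts.

    Choosing 2 of the common neighbors of an opposite pair {u, v} closes a
    4-cycle; each cycle has two opposite pairs, hence the final halving.
    """
    p = matmul(a, a)
    n = len(a)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            c = p[i][j]
            total += c * (c - 1) // 2
    return total // 2


square = adjacency(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
print(count_two_paths(square), count_four_cycles(square))
```

The 4-cycle count never enumerates individual cycles; it works only from the per-pair path totals, which is one simple instance of the path aggregation idea mentioned in the summary.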

Hardware foundation for secure computing

Published in:
2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Summary

Software security solutions are often considered to be more adaptable than their hardware counterparts. However, software has to work within the limitations of the system hardware platform, of which the selection is often dictated by functionality rather than security. Performance issues of security solutions without proper hardware support are easy to understand. The real challenge, however, is in the dilemma of "what should be done?" vs. "what could be done?" Security software could become ineffective if its "liberal" assumptions, e.g., the availability of a substantial trusted computing base (TCB) on the hardware platform, are violated. To address this dilemma, we have been developing and prototyping a security-by-design hardware foundation platform that enhances mainstream microprocessors with proper hardware security primitives to support and enhance software security solutions. This paper describes our progress in the use of a customized security co-processor to provide security services.

Enhanced parallel simulation for ACAS X development

Published in:
2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Summary

ACAS X is the next-generation airborne collision avoidance system intended to meet the demands of the rapidly evolving U.S. National Airspace System (NAS). The collision avoidance safety and operational suitability of the system are optimized and continuously evaluated by simulating billions of characteristic aircraft encounters in a fast-time Monte Carlo environment. There is therefore an inherent computational cost associated with each ACAS X design iteration, and parallelization of the simulations is necessary to keep up with rapid design cycles. This work describes an effort to profile and enhance the parallel computing infrastructure deployed on the computing resources offered by the Lincoln Laboratory Supercomputing Center. The approach to large-scale parallelization of our fast-time airspace encounter simulation tool is presented along with corresponding parallel profile data collected on different kinds of compute nodes. A simple stochastic model for distributed simulation is also presented to inform optimal work batching for improved simulation efficiency. The paper concludes with a discussion of how this high-performance parallel simulation method enables the rapid safety-critical design of ACAS X in a fast-paced iterative design process.

GraphChallenge.org triangle counting performance [e-print]

Summary

The rise of graph analytic systems has created a need for new ways to measure and compare the capabilities of graph processing systems. The MIT/Amazon/IEEE Graph Challenge has been developed to provide a well-defined community venue for stimulating research and highlighting innovations in graph analysis software, hardware, algorithms, and systems. GraphChallenge.org provides a wide range of preparsed graph data sets, graph generators, mathematically defined graph algorithms, example serial implementations in a variety of languages, and specific metrics for measuring performance. The triangle counting component of GraphChallenge.org tests the performance of graph processing systems to count all the triangles in a graph and exercises key graph operations found in many graph algorithms. In 2017, 2018, and 2019 many triangle counting submissions were received from a wide range of authors and organizations. This paper presents a performance analysis of the best performers of these submissions. These submissions show that their state-of-the-art triangle counting execution time, Ttri, is a strong function of the number of edges in the graph, Ne, which improved significantly from 2017 (Ttri \approx (Ne/10^8)^{4/3}) to 2018 (Ttri \approx Ne/10^9) and remained comparable from 2018 to 2019. Graph Challenge provides a clear picture of current graph analysis systems and underscores the need for new innovations to achieve high performance on very large graphs.
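To make the two scaling fits concrete, the sketch below evaluates them side by side and pairs them with a small serial triangle counter. It reads the 2017 exponent as 4/3 and treats both expressions as approximate fits, not exact timings. The function names are hypothetical, and the counter is a plain neighbor-set intersection, not any specific challenge submission.

```python
def fit_2017(num_edges):
    """2017 fit from the summary: Ttri ~ (Ne / 10^8)^(4/3)."""
    return (num_edges / 1e8) ** (4.0 / 3.0)


def fit_2018(num_edges):
    """2018 fit from the summary: Ttri ~ Ne / 10^9."""
    return num_edges / 1e9


def count_triangles(edges):
    """Serial reference count on an undirected simple graph.

    For every edge (u, v) the common neighbors of u and v each close one
    triangle; every triangle is found once per edge, so divide by 3.
    """
    unique = {(min(u, v), max(u, v)) for u, v in edges if u != v}
    nbrs = {}
    for u, v in unique:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return sum(len(nbrs[u] & nbrs[v]) for u, v in unique) // 3


for ne in (1e6, 1e8, 1e10):
    # The ratio grows with Ne because the 2017 fit is superlinear in Ne.
    print(f"Ne={ne:.0e}  2017/2018 ratio={fit_2017(ne) / fit_2018(ne):.1f}")
```

Because the 2017 fit is superlinear and the 2018 fit is linear in Ne, the gap between the two widens as graphs get larger.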

GraphChallenge.org sparse deep neural network performance [e-print]

Summary

The MIT/IEEE/Amazon GraphChallenge.org encourages community approaches to developing new solutions for analyzing graphs and sparse data. Sparse AI analytics present unique scalability difficulties. The Sparse Deep Neural Network (DNN) Challenge draws upon prior challenges from machine learning, high performance computing, and visual analytics to create a challenge that is reflective of emerging sparse AI systems. The sparse DNN challenge is based on a mathematically well-defined DNN inference computation and can be implemented in any programming environment. In 2019 several sparse DNN challenge submissions were received from a wide range of authors and organizations. This paper presents a performance analysis of the best performers of these submissions. These submissions show that their state-of-the-art sparse DNN execution time, TDNN, is a strong function of the number of DNN operations performed, Nop. The sparse DNN challenge provides a clear picture of current sparse DNN systems and underscores the need for new innovations to achieve high performance on very large sparse DNNs.
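The summary does not spell out the inference computation, so the following is only a sketch under stated assumptions: each layer applies Y <- ReLU(Y W + b) with a sparse weight matrix W and a single non-positive scalar bias b, and the operation count Nop is taken to be the number of multiply-adds. The dictionary-based sparse layout and the names `sparse_layer` and `sparse_infer` are hypothetical.

```python
def sparse_layer(y_rows, weights, bias):
    """One assumed layer: ReLU(Y W + bias) on dict-of-dicts sparse data.

    y_rows:  list of {column: value} dicts, one per input sample.
    weights: {row: {column: value}} sparse weight matrix.
    bias:    scalar <= 0, so entries with no contribution stay zero.
    Returns (new_rows, multiply_add_count).
    """
    if bias > 0:
        raise ValueError("sketch assumes a non-positive bias")
    ops = 0
    out = []
    for row in y_rows:
        acc = {}
        for k, y_val in row.items():
            for j, w_val in weights.get(k, {}).items():
                acc[j] = acc.get(j, 0.0) + y_val * w_val
                ops += 1
        # ReLU: keep only strictly positive activations, preserving sparsity.
        out.append({j: v + bias for j, v in acc.items() if v + bias > 0.0})
    return out, ops


def sparse_infer(y_rows, layers, bias):
    """Apply the layers in order and total the multiply-adds."""
    total_ops = 0
    for weights in layers:
        y_rows, ops = sparse_layer(y_rows, weights, bias)
        total_ops += ops
    return y_rows, total_ops


y0 = [{0: 1.0, 1: 2.0}]
w0 = {0: {0: 1.0}, 1: {0: 1.0, 1: -1.0}}
print(sparse_infer(y0, [w0], bias=-0.5))
```

Counting multiply-adds alongside the result gives a quantity that an execution-time measurement could be plotted against, in the spirit of the TDNN versus Nop analysis.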

A hardware root-of-trust design for low-power SoC edge devices

Published in:
2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Summary

In this work, we introduce a hardware root-of-trust architecture for low-power edge devices. An accelerator-based SoC design that includes the hardware root-of-trust architecture is developed. An example application for the device is presented. We examine attacks based on physical access given the significant threat they pose to unattended edge systems. The hardware root-of-trust provides security features to ensure the integrity of the SoC execution environment when deployed in uncontrolled, unattended locations. E-fused boot memory ensures that the boot code and other security-critical software are not compromised after deployment. Digitally signed programmable instruction memory prevents execution of code from untrusted sources. A programmable finite state machine is used to enforce access policies to device resources even if the application software on the device is compromised. Access policies isolate the execution states of application and security-critical software. The hardware root-of-trust architecture saves energy with a lower hardware overhead than a separate secure enclave while eliminating software attack surfaces for access control policies.

Fast training of deep neural networks robust to adversarial perturbations

Published in:
2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Summary

Deep neural networks are capable of training fast and generalizing well within many domains. Despite their promising performance, deep networks have shown sensitivities to perturbations of their inputs (e.g., adversarial examples) and their learned feature representations are often difficult to interpret, raising concerns about their true capability and trustworthiness. Recent work in adversarial training, a form of robust optimization in which the model is optimized against adversarial examples, demonstrates the ability to reduce sensitivity to perturbations and yield feature representations that are more interpretable. Adversarial training, however, comes with an increased computational cost over that of standard (i.e., nonrobust) training, rendering it impractical for use in large-scale problems. Recent work suggests that a fast approximation to adversarial training shows promise for reducing training time and maintaining robustness in the presence of perturbations bounded by the infinity norm. In this work, we demonstrate that this approach extends to the Euclidean norm and preserves the human-aligned feature representations that are common for robust models. Additionally, we show that using a distributed training scheme can further reduce the time to train robust deep networks. Fast adversarial training is a promising approach that will provide increased security and explainability in machine learning applications for which robust optimization was previously thought to be impractical.
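The summary does not say which fast approximation is used. As a hedged illustration only, the sketch below assumes a single gradient step and shows how that step differs between an infinity-norm budget and a Euclidean-norm budget. The function names `linf_step` and `l2_step` are hypothetical, and the loss gradient is assumed to be supplied as a plain list of floats.

```python
import math


def linf_step(grad, eps):
    """Single-step perturbation with infinity norm at most eps.

    Every coordinate moves by eps in the direction of the gradient's sign.
    """
    def sign(x):
        return (x > 0) - (x < 0)
    return [eps * sign(g) for g in grad]


def l2_step(grad, eps):
    """Single-step perturbation with Euclidean norm at most eps.

    The gradient is rescaled to length eps, so large-gradient coordinates
    receive most of the budget.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    return [eps * g / norm for g in grad]


grad = [3.0, -4.0, 0.0]
print(linf_step(grad, 0.5))
print(l2_step(grad, 1.0))
```

In an adversarial training loop, a perturbation like this would be added to each input before the usual parameter update; the single step is what keeps the cost close to that of standard training.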