Publications

Tagged As

high performance computing

Processing of crowdsourced observations of aircraft in a high performance computing environment

September 22, 2020

Conference Paper

Author:

Andrew J. Weinert

…

Published in:

2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Topic:

aircraft

R&D area:

Air Traffic Control

R&D group:

Transportation Safety and Resilience

Summary

As unmanned aircraft systems (UASs) continue to integrate into the U.S. National Airspace System (NAS), there is a need to quantify the risk of airborne collisions between unmanned and manned aircraft to support regulation and standards development. Both regulators and standards developing organizations have made extensive use of Monte Carlo collision risk analysis simulations using probabilistic models of aircraft flight. We previously determined that the observations of manned aircraft by the OpenSky Network, a community network of ground-based sensors, are appropriate for developing models of the low-altitude environment. This work gives an overview of the high performance computing workflow designed and deployed on the Lincoln Laboratory Supercomputing Center to process 3.9 billion observations of aircraft. We then trained the aircraft models using more than 250,000 flight hours at 5,000 feet above ground level or below. A key feature of the workflow is that all the aircraft observations and supporting datasets are available as open source technologies or have been released to the public domain.

A hardware root-of-trust design for low-power SoC edge devices

September 22, 2020

Conference Paper

Author:

Alan J. Ehret

…

Published in:

2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Topic:

embedded computing

R&D area:

R&D group:

Summary

In this work, we introduce a hardware root-of-trust architecture for low-power edge devices. An accelerator-based SoC design that includes the hardware root-of-trust architecture is developed. An example application for the device is presented. We examine attacks based on physical access given the significant threat they pose to unattended edge systems. The hardware root-of-trust provides security features to ensure the integrity of the SoC execution environment when deployed in uncontrolled, unattended locations. E-fused boot memory ensures the boot code and other security-critical software are not compromised after deployment. Digitally signed programmable instruction memory prevents execution of code from untrusted sources. A programmable finite state machine is used to enforce access policies to device resources even if the application software on the device is compromised. Access policies isolate the execution states of application and security-critical software. The hardware root-of-trust architecture saves energy with a lower hardware overhead than a separate secure enclave while eliminating software attack surfaces for access control policies.

Towards a distributed framework for multi-agent reinforcement learning research

September 22, 2020

Conference Paper

Author:

Yutai Zhou

…

Published in:

2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Topic:

high performance computing

R&D area:

R&D group:

Summary

Some of the most important publications in deep reinforcement learning over the last few years have been fueled by access to massive amounts of computation through large scale distributed systems. The success of these approaches in achieving human-expert level performance on several complex video-game environments has motivated further exploration into the limits of these approaches as computation increases. In this paper, we present a distributed RL training framework designed for supercomputing infrastructures such as the MIT SuperCloud. We review a collection of challenging learning environments, such as Google Research Football, StarCraft II, and Multi-Agent Mujoco, which are at the frontier of reinforcement learning research. We provide results on these environments that illustrate the current state of the field on these problems. Finally, we also quantify and discuss the computational requirements needed for performing RL research by enumerating all experiments performed on these environments.

Hardware foundation for secure computing

September 22, 2020

Conference Paper

Author:

Donato Kava

…

Published in:

2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Secure Resilient Systems and Technology

Summary

Software security solutions are often considered to be more adaptable than their hardware counterparts. However, software has to work within the limitations of the system hardware platform, of which the selection is often dictated by functionality rather than security. Performance issues of security solutions without proper hardware support are easy to understand. The real challenge, however, is in the dilemma of "what should be done?" vs. "what could be done?" Security software could become ineffective if its "liberal" assumptions, e.g., the availability of a substantial trusted computing base (TCB) on the hardware platform, are violated. To address this dilemma, we have been developing and prototyping a security-by-design hardware foundation platform that enhances mainstream microprocessors with proper hardware security primitives to support and enhance software security solutions. This paper describes our progress in the use of a customized security co-processor to provide security services.

Leveraging linear algebra to count and enumerate simple subgraphs

September 22, 2020

Conference Paper

Author:

Vitaliy Gleyzer

…

Published in:

2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Topic:

high performance computing

R&D area:

Homeland Protection

R&D group:

Summary

Even though subgraph counting and subgraph matching are well-known NP-Hard problems, they are foundational building blocks for many scientific and commercial applications. In order to analyze graphs that contain millions to billions of edges, distributed systems can provide computational scalability through search parallelization. One recent approach for exposing graph algorithm parallelization is through a linear algebra formulation and the use of the matrix multiply operation, which conceptually is equivalent to a massively parallel graph traversal. This approach has several benefits, including 1) a mathematically rigorous foundation, and 2) the ability to leverage specialized linear algebra accelerators and high-performance libraries. In this paper, we explore and define a linear algebra methodology for performing exact subgraph counting and matching for 4-vertex subgraphs excluding the clique. Matches on these simple subgraphs can be joined as components for a larger subgraph. With thorough analysis, we demonstrate that the linear algebra formulation leverages path aggregation, which allows it to be 2x to 5x more efficient in traversing the search space and compressing the results as compared to tree-based subgraph matching techniques.
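As a rough illustration of the idea that a matrix multiply aggregates paths, the sketch below counts triangles and 4-cycles from the square of an adjacency matrix. It is a minimal pure-Python example, not the paper's methodology: the function names (`matmul`, `adjacency`, `count_triangles`, `count_4cycles`) are hypothetical, the dense lists-of-lists representation is chosen only for readability, and a real system would presumably use sparse matrices.

```python
from itertools import combinations


def matmul(a, b):
    """Dense product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected simple graph."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1
    return a


def count_triangles(a):
    """(A*A)[i][j] is the number of 2-paths from i to j. Summed over edges
    (i, j), every triangle is seen once per edge, i.e. three times."""
    paths2 = matmul(a, a)
    n = len(a)
    closed = sum(paths2[i][j] for i, j in combinations(range(n), 2) if a[i][j])
    return closed // 3


def count_4cycles(a):
    """Any two distinct 2-paths between the same pair of vertices form a
    4-cycle. Each 4-cycle has two diagonal pairs, so it is counted twice."""
    paths2 = matmul(a, a)
    n = len(a)
    pairs = sum(paths2[i][j] * (paths2[i][j] - 1) // 2
                for i, j in combinations(range(n), 2))
    return pairs // 2


# Example inputs: the complete graph on 4 vertices and a plain square.
k4 = adjacency(4, list(combinations(range(4), 2)))
square = adjacency(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
```

The single product `A*A` is reused for both counts, which is the sense in which the aggregated 2-path counts stand in for an explicit enumeration of every path.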

Enhanced parallel simulation for ACAS X development

September 22, 2020

Conference Paper

Author:

Adam E. Gjersvik

Published in:

2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Topic:

high performance computing

R&D area:

Air Traffic Control

R&D group:

Transportation Safety and Resilience

Summary

ACAS X is the next generation airborne collision avoidance system intended to meet the demands of the rapidly evolving U.S. National Airspace System (NAS). The collision avoidance safety and operational suitability of the system are optimized and continuously evaluated by simulating billions of characteristic aircraft encounters in a fast-time Monte Carlo environment. There is therefore an inherent computational cost associated with each ACAS X design iteration and parallelization of the simulations is necessary to keep up with rapid design cycles. This work describes an effort to profile and enhance the parallel computing infrastructure deployed on the computing resources offered by the Lincoln Laboratory Supercomputing Center. The approach to large-scale parallelization of our fast-time airspace encounter simulation tool is presented along with corresponding parallel profile data collected on different kinds of compute nodes. A simple stochastic model for distributed simulation is also presented to inform optimal work batching for improved simulation efficiency. The paper concludes with a discussion on how this high-performance parallel simulation method enables the rapid safety-critical design of ACAS X in a fast-paced iterative design process.
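The summary does not describe the stochastic model itself, so the following is only a toy, deterministic sketch of the batching trade-off it alludes to. It assumes a fixed per-batch overhead and a greedy "next batch goes to the first free worker" scheduler; the function name `makespan` and all parameter names and values are hypothetical and are not taken from the paper.

```python
import heapq


def makespan(durations, workers, batch_size, overhead):
    """Wall-clock time to finish all tasks when consecutive tasks are grouped
    into batches and each batch pays a fixed startup overhead (assumed)."""
    batches = [durations[i:i + batch_size]
               for i in range(0, len(durations), batch_size)]
    free_at = [0.0] * workers          # time at which each worker is idle
    heapq.heapify(free_at)
    for batch in batches:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + overhead + sum(batch))
    return max(free_at)


# Eight unit-length tasks on two workers with 0.5 units of per-batch overhead.
tasks = [1.0] * 8
small = makespan(tasks, workers=2, batch_size=1, overhead=0.5)
medium = makespan(tasks, workers=2, batch_size=4, overhead=0.5)
large = makespan(tasks, workers=2, batch_size=8, overhead=0.5)
```

In this toy setting very small batches pay the overhead many times, while one very large batch leaves a worker idle, so an intermediate batch size finishes first.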

GraphChallenge.org triangle counting performance [e-print]

September 22, 2020

Conference Paper

Author:

Siddharth S. Samsi

…

Published in:

2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Topic:

graph processing

R&D area:

R&D group:

Summary

The rise of graph analytic systems has created a need for new ways to measure and compare the capabilities of graph processing systems. The MIT/Amazon/IEEE Graph Challenge has been developed to provide a well-defined community venue for stimulating research and highlighting innovations in graph analysis software, hardware, algorithms, and systems. GraphChallenge.org provides a wide range of preparsed graph data sets, graph generators, mathematically defined graph algorithms, example serial implementations in a variety of languages, and specific metrics for measuring performance. The triangle counting component of GraphChallenge.org tests the performance of graph processing systems to count all the triangles in a graph and exercises key graph operations found in many graph algorithms. In 2017, 2018, and 2019 many triangle counting submissions were received from a wide range of authors and organizations. This paper presents a performance analysis of the best performers of these submissions. These submissions show that their state-of-the-art triangle counting execution time, $T_{tri}$, is a strong function of the number of edges in the graph, $N_e$, which improved significantly from 2017 ($T_{tri} \approx (N_e/10^8)^{4/3}$) to 2018 ($T_{tri} \approx N_e/10^9$) and remained comparable from 2018 to 2019. Graph Challenge provides a clear picture of current graph analysis systems and underscores the need for new innovations to achieve high performance on very large graphs.
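To make the two scaling fits concrete, the sketch below evaluates them for a given edge count. It reads the 2017 exponent as 4/3, and because this summary does not state a time unit, the returned values are in whatever unit the paper uses. The function names are hypothetical.

```python
def t_tri_2017(n_edges):
    """2017 fit quoted in the summary: (N_e / 10^8) ** (4/3)."""
    return (n_edges / 1e8) ** (4 / 3)


def t_tri_2018(n_edges):
    """2018 fit quoted in the summary: N_e / 10^9."""
    return n_edges / 1e9


def improvement(n_edges):
    """Ratio of the 2017 fit to the 2018 fit at the same edge count."""
    return t_tri_2017(n_edges) / t_tri_2018(n_edges)


# Evaluate the ratio for a graph with one billion edges.
ratio_at_1e9 = improvement(1e9)
```

Because the 2017 fit grows faster than linearly in the edge count while the 2018 fit is linear, the ratio between them widens as graphs get larger.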

Hypersparse neural network analysis of large-scale internet traffic

September 24, 2019

Conference Paper

Author:

Jeremy Kepner

…

Published in:

2019 IEEE High Performance Extreme Computing Conf., HPEC, 24-26 September 2019.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

The Internet is transforming our society, necessitating a quantitative understanding of Internet traffic. Our team collects and curates the largest publicly available Internet traffic data containing 50 billion packets. Utilizing a novel hypersparse neural network analysis of "video" streams of this traffic using 10,000 processors in the MIT SuperCloud reveals a new phenomenon: the importance of otherwise unseen leaf nodes and isolated links in Internet traffic. Our neural network approach further shows that a two-parameter modified Zipf-Mandelbrot distribution accurately describes a wide variety of source/destination statistics on moving sample windows ranging from 100,000 to 100,000,000 packets over collections that span years and continents. The inferred model parameters distinguish different network streams and the model leaf parameter strongly correlates with the fraction of the traffic in different underlying network topologies. The hypersparse neural network pipeline is highly adaptable and different network statistics and training models can be incorporated with simple changes to the image filter functions.
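The summary names a two-parameter modified Zipf-Mandelbrot distribution without giving its formula. The sketch below assumes the commonly used form in which the probability of a value d is proportional to 1/(d + delta)^alpha, normalized over d = 1..d_max. The parameter names `alpha` and `delta`, the function name, and the example values are assumptions for illustration, not fitted values from the paper.

```python
def zipf_mandelbrot(d_max, alpha, delta):
    """Normalized probabilities for d = 1..d_max under the assumed form
    p(d) proportional to 1 / (d + delta) ** alpha."""
    weights = [1.0 / (d + delta) ** alpha for d in range(1, d_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative parameters only.
probs = zipf_mandelbrot(d_max=1000, alpha=2.0, delta=0.5)
share_of_smallest = probs[0]  # probability mass at d = 1
```

In this form, `alpha` sets how steeply the tail falls off and `delta` shifts how much mass sits at the smallest values of d.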

Large scale parallelization using file-based communications

September 24, 2019

Conference Paper

Author:

Chansup Byun

…

Published in:

2019 IEEE High Performance Extreme Computing Conf., HPEC, 24-26 September 2019.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

In this paper, we present a novel file-based communication architecture using the local filesystem for large scale parallelization. This new approach eliminates the issues with filesystem overload and resource contention when using the central filesystem for large parallel jobs. The new approach incurs additional overhead due to inter-node message file transfers when the sending and receiving processes are not on the same node. However, even with this additional overhead cost, its benefits are far greater for the overall cluster operation in addition to the performance enhancement in message communications for large scale parallel jobs. For example, when running a 2048-process parallel job, it achieved about 34 times better performance with MPI_Bcast() when using the local filesystem. Furthermore, since the security for transferring message files is handled entirely by using the secure copy protocol (scp) and the file system permissions, no additional security measures or ports are required other than those that are typically required on an HPC system.
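The general pattern of passing messages through files can be sketched in a few lines. This is a minimal single-node illustration under stated assumptions, not the authors' implementation: the function names, the file naming scheme, and the use of `pickle` are all hypothetical, and the inter-node transfer step that the summary attributes to scp is not shown.

```python
import os
import pickle
import tempfile
import time


def _message_path(msg_dir, src, dest, tag):
    """Hypothetical naming scheme: one file per (sender, receiver, tag)."""
    return os.path.join(msg_dir, f"msg_{src}_to_{dest}_{tag}.pkl")


def send_message(msg_dir, src, dest, tag, obj):
    """Write to a temporary file, then rename it into place so the receiver
    never observes a partially written message."""
    fd, tmp_path = tempfile.mkstemp(dir=msg_dir, suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(obj, handle)
    os.replace(tmp_path, _message_path(msg_dir, src, dest, tag))


def recv_message(msg_dir, src, dest, tag, timeout=5.0, poll=0.01):
    """Poll for the message file, load it, and delete it once consumed."""
    path = _message_path(msg_dir, src, dest, tag)
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() > deadline:
            raise TimeoutError(f"no message at {path}")
        time.sleep(poll)
    with open(path, "rb") as handle:
        obj = pickle.load(handle)
    os.remove(path)
    return obj


# Round trip through a throwaway directory standing in for node-local storage.
with tempfile.TemporaryDirectory() as scratch:
    send_message(scratch, src=0, dest=1, tag="greeting", obj={"step": 1})
    received = recv_message(scratch, src=0, dest=1, tag="greeting")
```

The write-then-rename step matters because a rename within one filesystem is atomic on POSIX systems, so a polling receiver sees either no file or a complete one.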

Streaming 1.9 billion hypersparse network updates per second with D4M

September 24, 2019

Conference Paper

Author:

Jeremy Kepner

…

Published in:

2019 IEEE High Performance Extreme Computing Conf., HPEC, 24-26 September 2019.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

The Dynamic Distributed Dimensional Data Model (D4M) library implements associative arrays in a variety of languages (Python, Julia, and Matlab/Octave) and provides a lightweight in-memory database implementation of hypersparse arrays that are ideal for analyzing many types of network data. D4M relies on associative arrays which combine properties of spreadsheets, databases, matrices, graphs, and networks, while providing rigorous mathematical guarantees, such as linearity. Streaming updates of D4M associative arrays put enormous pressure on the memory hierarchy. This work describes the design and performance optimization of an implementation of hierarchical associative arrays that reduces memory pressure and dramatically increases the update rate into an associative array. The parameters of hierarchical associative arrays rely on controlling the number of entries in each level in the hierarchy before an update is cascaded. The parameters are easily tunable to achieve optimal performance for a variety of applications. Hierarchical arrays achieve over 40,000 updates per second in a single instance. Scaling to 34,000 instances of hierarchical D4M associative arrays on 1,100 server nodes on the MIT SuperCloud achieved a sustained update rate of 1,900,000,000 updates per second. This capability allows the MIT SuperCloud to analyze extremely large streaming network data sets.
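The cascading behavior described above can be sketched with plain dictionaries. This is a minimal illustration and does not use the D4M API: the class name `HierarchicalAssoc`, the `cuts` parameter, and the choice of addition as the merge operation are assumptions made for the example.

```python
from collections import Counter


class HierarchicalAssoc:
    """Toy hierarchy of sparse (row, col) -> value maps. Level i is merged
    into level i + 1 once it holds more than cuts[i] entries."""

    def __init__(self, cuts):
        self.cuts = list(cuts)
        # One more level than cuts: the last level is never cascaded.
        self.levels = [Counter() for _ in range(len(self.cuts) + 1)]

    def update(self, row, col, value=1):
        """Apply the update to the smallest level, then cascade if needed."""
        self.levels[0][(row, col)] += value
        self._cascade()

    def _cascade(self):
        for i, cut in enumerate(self.cuts):
            if len(self.levels[i]) <= cut:
                break
            self.levels[i + 1].update(self.levels[i])  # adds values per key
            self.levels[i].clear()

    def total(self):
        """Sum all levels into one map, giving the full array contents."""
        merged = Counter()
        for level in self.levels:
            merged.update(level)
        return merged


# Stream a few (source, destination) updates through a two-cut hierarchy.
h = HierarchicalAssoc(cuts=[2, 4])
for src, dst in [("a", "b"), ("a", "c"), ("a", "b"), ("d", "e")]:
    h.update(src, dst)
```

Most updates touch only the small first level, and the larger levels are modified only when a cut is exceeded, which is the mechanism the summary credits with reducing memory pressure.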
