Publications

Refine Results

(Filters Applied) Clear All

R&D Areas

R&D Groups

Year

Items per page

Tagged As

high performance computing Clear filter

Measuring the impact of Spectre and Meltdown

September 25, 2018

Conference Paper

Author:

Andrew J. Prout

…

Published in:

IEEE High Performance Extreme Computing Conf., HPEC, 25-27 September 2018.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

The Spectre and Meltdown flaws in modern microprocessors represent a new class of attacks that have been difficult to mitigate. The mitigations that have been proposed have known performance impacts. The reported magnitude of these impacts varies depending on the industry sector and expected workload characteristics. In this paper, we measure the performance impact on several workloads relevant to HPC systems. We show that the impact can be significant on both synthetic and realistic workloads. We also show that the performance penalties are difficult to avoid even in dedicated systems where security is a lesser concern.

READ LESS

Summary

Measuring the impact of Spectre and Meltdown

Lessons learned from a decade of providing interactive, on-demand high performance computing to scientists and engineers

June 28, 2018

Conference Paper

Author:

Julia Mullen

…

Published in:

ISC High Performance Int. Workshops, Lecture Notes in Computer Science (LNSC 11203), 28 June 2018, pp. 655-68.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

For decades, the use of HPC systems was limited to those in the physical sciences who had mastered their domain in conjunction with a deep understanding of HPC architectures and algorithms. During these same decades, consumer computing device advances produced tablets and smartphones that allow millions of children to interactively develop and share code projects across the globe. As the HPC community faces the challenges associated with guiding researchers from disciplines using high productivity interactive tools to effective use of HPC systems, it seems appropriate to revisit the assumptions surrounding the necessary skills required for access to large computational systems. For over a decade, MIT Lincoln Laboratory has been supporting interactive, on demand high performance computing by seamlessly integrating familiar high productivity tools to provide users with an increased number of design turns, rapid prototyping capability, and faster time to insight. In this paper, we discuss the lessons learned while supporting interactive, on-demand high performance computing from the perspectives of the users and the team supporting the users and the system. Building on these lessons, we present an overview of current needs and the technical solutions we are building to lower the barrier to entry for new users from the humanities, social, and biological sciences.

READ LESS

Summary

Lessons learned from a decade of providing interactive, on-demand high performance computing to scientists and engineers

Performance measurements of supercomputing and cloud storage solutions

September 12, 2017

Conference Paper

Author:

Michael S. Jones

…

Published in:

HPEC 2017: IEEE Conf. on High Performance Extreme Computing, 12-14 September 2017.

Topic:

high performance computing

R&D area:

R&D group:

Summary

Increasing amounts of data from varied sources, particularly in the fields of machine learning and graph analytics, are causing storage requirements to grow rapidly. A variety of technologies exist for storing and sharing these data, ranging from parallel file systems used by supercomputers to distributed block storage systems found in clouds. Relatively few comparative measurements exist to inform decisions about which storage systems are best suited for particular tasks. This work provides these measurements for two of the most popular storage technologies: Lustre and Amazon S3. Lustre is an opensource, high performance, parallel file system used by many of the largest supercomputers in the world. Amazon's Simple Storage Service, or S3, is part of the Amazon Web Services offering, and offers a scalable, distributed option to store and retrieve data from anywhere on the Internet. Parallel processing is essential for achieving high performance on modern storage systems. The performance tests used span the gamut of parallel I/O scenarios, ranging from single-client, single-node Amazon S3 and Lustre performance to a large-scale, multi-client test designed to demonstrate the capabilities of a modern storage appliance under heavy load. These results show that, when parallel I/O is used correctly (i.e., many simultaneous read or write processes), full network bandwidth performance is achievable and ranged from 10 gigabits/s over a 10 GigE S3 connection to 0.35 terabits/s using Lustre on a 1200 port 10 GigE switch. These results demonstrate that S3 is well-suited to sharing vast quantities of data over the Internet, while Lustre is well-suited to processing large quantities of data locally.

READ LESS

Summary

Performance measurements of supercomputing and cloud storage solutions

A cloud-based brain connectivity analysis tool

September 12, 2017

Conference Paper

Author:

Laura J. Brattain

…

Published in:

HPEC 2017: IEEE Conf. on High Performance Extreme Computing, 12-14 September 2017.

Topic:

high performance computing

R&D area:

R&D group:

Summary

With advances in high throughput brain imaging at the cellular and sub-cellular level, there is growing demand for platforms that can support high performance, large-scale brain data processing and analysis. In this paper, we present a novel pipeline that combines Accumulo, D4M, geohashing, and parallel programming to manage large-scale neuron connectivity graphs in a cloud environment. Our brain connectivity graph is represented using vertices (fiber start/end nodes), edges (fiber tracks), and the 3D coordinates of the fiber tracks. For optimal performance, we take the hybrid approach of storing vertices and edges in Accumulo and saving the fiber track 3D coordinates in flat files. Accumulo database operations offer low latency on sparse queries while flat files offer high throughput for storing, querying, and analyzing bulk data. We evaluated our pipeline by using 250 gigabytes of mouse neuron connectivity data. Benchmarking experiments on retrieving vertices and edges from Accumulo demonstrate that we can achieve 1-2 orders of magnitude speedup in retrieval time when compared to the same operation from traditional flat files. The implementation of graph analytics such as Breadth First Search using Accumulo and D4M offers consistent good performance regardless of data size and density, thus is scalable to very large dataset. Indexing of neuron subvolumes is simple and logical with geohashing-based binary tree encoding. This hybrid data management backend is used to drive an interactive web-based 3D graphical user interface, where users can examine the 3D connectivity map in a Google Map-like viewer. Our pipeline is scalable and extensible to other data modalities.

READ LESS

Summary

A cloud-based brain connectivity analysis tool

Learning by doing, High Performance Computing education in the MOOC era

January 17, 2017

Journal Article

Author:

Julia Mullen

…

Published in:

J. Parallel Distrib. Comput., Vol. 105, July 2017, pp. 105-15.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

The High Performance Computing (HPC) community has spent decades developing tools that teach practitioners to harness the power of parallel and distributed computing. To create scalable and flexible educational experiences for practitioners in all phases of a career, we turn to Massively Open Online Courses (MOOCs). We detail the design of a unique self-paced online course that incorporates a focus on parallel solutions, personalization, and hands-on practice to familiarize student-users with their target system. Course material is presented through the lens of common HPC use cases and the strategies for parallelizing them. Using personalized paths, we teach researchers how to recognize the alignment between scientific applications and traditional HPC use cases, so they can focus on learning the parallelization strategies key to their workplace success. At the conclusion of their learning path, students should be capable of achieving performance gains on their HPC system.

READ LESS

Summary

Learning by doing, High Performance Computing education in the MOOC era

Julia implementation of the Dynamic Distributed Dimensional Data Model

September 13, 2016

Conference Paper

Author:

null

…

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

Julia is a new language for writing data analysis programs that are easy to implement and run at high performance. Similarly, the Dynamic Distributed Dimensional Data Model (D4M) aims to clarify data analysis operations while retaining strong performance. D4M accomplishes these goals through a composable, unified data model on associative arrays. In this work, we present an implementation of D4M in Julia and describe how it enables and facilitates data analysis. Several experiments showcase scalable performance in our new Julia version as compared to the original Matlab implementation.

READ LESS

Summary

Julia implementation of the Dynamic Distributed Dimensional Data Model

Designing a new high performance computing education strategy for professional scientists and engineers

September 13, 2016

Conference Paper

Author:

Julia Mullen

…

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

For decades the High Performance Computing (HPC) community has used web content, workshops and embedded HPC scientists to enable practitioners to harness the power of parallel and distributed computing. The most successful approaches, face-to-face tutorials and embedded professionals, don't scale. To create scalable, flexible, educational experiences for practitioners in all phases of a career, from student to professional, we turn to Massively Open Online Courses (MOOCs). We detail the conversion of personalized tutorials to a selfpaced online course. In this demonstration, we highlight a course that mimics in-person tutorials by providing personalized paths through content that interleaves theory and practice, to help researchers learn key parallel computing concepts while developing familiarity with their HPC target system.

READ LESS

Summary

Designing a new high performance computing education strategy for professional scientists and engineers

In-storage embedded accelerator for sparse pattern processing

September 13, 2016

Conference Paper

Author:

Sang-Woo Jun

…

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

high performance computing

R&D area:

R&D group:

Summary

We present a novel architecture for sparse pattern processing, using flash storage with embedded accelerators. Sparse pattern processing on large data sets is the essence of applications such as document search, natural language processing, bioinformatics, subgraph matching, machine learning, and graph processing. One slice of our prototype accelerator is capable of handling up to 1TB of data, and experiments show that it can outperform C/C++ software solutions on a 16-core system at a fraction of the power and cost; an optimized version of the accelerator can match the performance of a 48-core server.

READ LESS

Summary

In-storage embedded accelerator for sparse pattern processing

Enhancing HPC security with a user-based firewall

September 13, 2016

Conference Paper

Author:

Andrew J. Prout

…

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

high performance computing

R&D area:

Cyber Security and Information Sciences

R&D group:

Lincoln Laboratory Supercomputing Center

Summary

High Performance Computing (HPC) systems traditionally allow their users unrestricted use of their internal network. While this network is normally controlled enough to guarantee privacy without the need for encryption, it does not provide a method to authenticate peer connections. Protocols built upon this internal network, such as those used in MPI, Lustre, Hadoop, or Accumulo, must provide their own authentication at the application layer. Many methods have been employed to perform this authentication, such as operating system privileged ports, Kerberos, munge, TLS, and PKI certificates. However, support for all of these methods requires the HPC application developer to include support and the user to configure and enable these services. The user-based firewall capability we have prototyped enables a set of rules governing connections across the HPC internal network to be put into place using Linux netfilter. By using an operating system-level capability, the system is not reliant on any developer or user actions to enable security. The rules we have chosen and implemented are crafted to not impact the vast majority of users and be completely invisible to them. Additionally, we have measured the performance impact of this system under various workloads.

READ LESS

Summary

Enhancing HPC security with a user-based firewall

Novel graph processor architecture, prototype system, and results

September 13, 2016

Conference Paper

Author:

William S. Song

…

Published in:

HPEC 2016: IEEE Conf. on High Performance Extreme Computing, 13-15 September 2016.

Topic:

high performance computing

R&D area:

R&D group:

Summary

Graph algorithms are increasingly used in applications that exploit large databases. However, conventional processor architectures are inadequate for handling the throughput and memory requirements of graph computation. Lincoln Laboratory's graph-processor architecture represents a rethinking of parallel architectures for graph problems. Our processor utilizes innovations that include a sparse matrix-based graph instruction set, a cacheless memory system, accelerator-based architecture, a systolic sorter, high-bandwidth multidimensional toroidal communication network, and randomized communications. A field-programmable gate array (FPGA) prototype of the new graph processor has been developed with significant performance enhancement over conventional processors in graph computational throughput.

READ LESS

Summary

Novel graph processor architecture, prototype system, and results

Publications

Refine Results

Tagged As

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Showing Results