Publications


Parallel MATLAB for extreme virtual memory

Published in:
Proc. of the HPCMP Users Group Conf., 27-30 June 2005, pp. 381-387.

Summary

Many DoD applications have extreme memory requirements, often with data sets larger than the memory of a single computer. Such data sets can be addressed with out-of-core methods, which use memory as a "window" to view one section of the data stored on disk at a time. The Parallel Matlab for eXtreme Virtual Memory (pMatlab XVM) library adds out-of-core extensions to the Parallel Matlab (pMatlab) library. The DARPA High Productivity Computing Systems' HPC Challenge FFT benchmark has been implemented in C+MPI, pMatlab, pMatlab hand coded for out-of-core, and pMatlab XVM. We found that 1) the performance of the C+MPI and pMatlab versions was comparable; 2) the out-of-core versions deliver 80% of the performance of the in-core versions; 3) the out-of-core versions were able to perform a 1 TB (64 billion point) FFT; and 4) the pMatlab XVM program was smaller, easier to implement and verify, and more efficient than its hand-coded equivalent. We plan to apply pMatlab XVM to the full HPC Challenge benchmark suite. Using next-generation hardware, problem sizes a factor of 100 to 1000 larger should be feasible. We are also transitioning this technology to several DoD signal processing applications. Finally, the flexibility of pMatlab XVM allows hardware designers to experiment with FFT parameters in software before designing hardware for a real-time, ultra-long FFT.
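The "window" idea at the heart of out-of-core computing can be illustrated with ordinary Matlab file I/O. The sketch below is not the pMatlab XVM API (which the abstract does not detail); the file name, sizes, and the per-block operation are all assumptions chosen for illustration.

```matlab
% Minimal sketch of the out-of-core "window" idea: a data set too large
% for memory lives in a binary file on disk, and a fixed-size block of it
% is loaded, processed, and written back at a time. All names and sizes
% here are hypothetical.
fname     = 'big_vector.bin';   % hypothetical file holding N doubles
N         = 2^24;               % total points (2^36 scale for a 1 TB FFT)
blockSize = 2^20;               % the in-memory "window"

fid = fopen(fname, 'r+');
for k = 0:(N/blockSize - 1)
    offset = k * blockSize * 8;           % 8 bytes per double
    fseek(fid, offset, 'bof');
    x = fread(fid, blockSize, 'double');  % load one window into memory
    y = 2 * x;                            % stand-in for real work; a true
                                          % out-of-core FFT needs multiple
                                          % passes with on-disk transposes
    fseek(fid, offset, 'bof');
    fwrite(fid, y, 'double');             % write the window back
end
fclose(fid);
```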

Technology requirements for supporting on-demand interactive grid computing

Summary

It is increasingly being recognized that a large pool of High Performance Computing (HPC) users requires interactive, on-demand access to HPC resources. Providing these resources is a significant technical challenge that can be addressed from two directions. The first approach is to adapt existing batch-queue-based HPC systems to make them more interactive. The second approach is to start with existing interactive desktop environments (e.g., MATLAB) and design a system from the ground up that allows interactive parallel computing. The Lincoln Laboratory Grid (LLGrid) project has taken the latter approach. The LLGrid system has been operational for over a year with a few hundred processors and roughly 70 users, having run over 13,000 interactive jobs and consumed approximately 10,000 processor-days of computation. This paper compares the on-demand and interactive computing features of four prominent batch queuing systems: openPBS, Sun GridEngine, Condor, and LSF. It then briefly describes the LLGrid system and how interactive, on-demand computing was achieved on it by binding to a resource management system. Finally, usage characteristics of the LLGrid system are discussed.

High performance computing productivity model synthesis

Published in:
Int. J. High Perform. Comp. App., Vol. 18, No. 4, Winter 2004, pp. 505-516.

Summary

The Defense Advanced Research Projects Agency (DARPA) High Productivity Computing Systems (HPCS) program is developing systems that deliver increased value to users at a rate commensurate with the rate of improvement in the underlying technologies. For example, if the relevant technology were silicon, the goal of such a system would be to double in productivity (or value) every 18 months, following Moore's law. The key questions are how we define and measure productivity, and which underlying technologies affect it. The goal of this paper is to synthesize from several different productivity models a single model that captures the main features of all of them. In addition, we start the process of putting the model on an empirical foundation by incorporating selected results from the software engineering and high performance computing (HPC) communities. An asymptotic analysis of the model is conducted to check that it makes sense in certain special cases. The model is extrapolated to an HPC context and several examples are explored, including HPC centers, HPC users, and interactive grid computing. Finally, the model hints at a profoundly different way of viewing HPC systems, in which the user must be included in the equation and innovative hardware is a key aspect of lowering the very high costs of HPC software.
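As an illustration only (this formula does not appear in the paper), the stated goal of doubling productivity every 18 months can be written as a simple exponential growth law:

```latex
% Illustration of the stated goal, not a formula from the paper:
% productivity \Psi doubling every 18 months (1.5 years).
\[
  \Psi(t) = \Psi_0 \, 2^{\,t/1.5\,\mathrm{yr}}
\]
% e.g., after 3 years, \Psi = 4\,\Psi_0; after 6 years, \Psi = 16\,\Psi_0.
```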

HPC productivity: an overarching view

Published in:
Int. J. High Perform. Comp. App., Vol. 18, No. 4, Winter 2004, pp. 393-397.

Summary

The Defense Advanced Research Projects Agency (DARPA) High Productivity Computing Systems (HPCS) program is focused on providing a new generation of economically viable high productivity computing systems for national security and for the industrial user community. The value of a high performance computing (HPC) system to a user includes many factors, such as execution time on a particular problem, software development time, direct hardware costs, and indirect administrative and maintenance costs. This special issue, which focuses on HPC productivity, brings together, for the first time, a series of novel papers written by several distinguished authors who share their views on this topic. The topic of productivity in HPC is very new and the authors have been encouraged to speculate. The goal of this first paper is to present an overarching context and framework for the other papers and to define some common ideas that have emerged in considering the problem of HPC productivity. In addition, this paper defines several characteristic HPC workflows that are useful for understanding how users exploit HPC systems, and discusses the role of activity and purpose benchmarks in establishing an empirical basis for HPC productivity.

MatlabMPI

Published in:
Journal of Parallel and Distributed Computing, Vol. 64, No. 8, 2004, pp. 997-1005.

Summary

In many projects the true costs of high performance computing are currently dominated by software. Addressing these costs may require shifting to higher level languages such as Matlab. MatlabMPI is a Matlab implementation of the Message Passing Interface (MPI) standard and allows any Matlab program to exploit multiple processors. MatlabMPI currently implements the basic six functions that are the core of the MPI point-to-point communications standard. The key technical innovation of MatlabMPI is that it implements the widely used MPI "look and feel" on top of standard Matlab file I/O, resulting in an extremely compact (~350 lines of code) and "pure" implementation which runs anywhere Matlab runs, and on any heterogeneous combination of computers. The performance has been tested on both shared and distributed memory parallel computers (e.g. Sun, SGI, HP, IBM, Linux, MacOSX and Windows). MatlabMPI can match the bandwidth of C based MPI at large message sizes. A test image filtering application using MatlabMPI achieved a speedup of ~300 using 304 CPUs and ~15% of the theoretical peak (450 Gigaflops) on an IBM SP2 at the Maui High Performance Computing Center. In addition, this entire parallel benchmark application was implemented in 70 software-lines-of-code, illustrating the high productivity of this approach. MatlabMPI is available for download on the web.
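The abstract's key point, MPI semantics built on ordinary file I/O, can be sketched in a few lines of plain Matlab. This is not the actual MatlabMPI source; the function names, the shared directory, and the lock-file convention below are assumptions used only to show the mechanism.

```matlab
% Sketch of "MPI on file I/O": a send writes the message to a .mat file in
% a shared directory plus a lock file; a receive waits for the lock file
% and loads the data. Names are hypothetical; in practice each function
% would live in its own .m file.
function file_send(dest, tag, data)
    base = fullfile('/shared/comm', sprintf('msg_%d_%d', dest, tag));
    save([base '.mat'], 'data');          % payload via standard file I/O
    fclose(fopen([base '.lock'], 'w'));   % empty lock file signals arrival
end

function data = file_recv(dest, tag)
    base = fullfile('/shared/comm', sprintf('msg_%d_%d', dest, tag));
    while ~exist([base '.lock'], 'file')  % spin until the lock file appears
        pause(0.01);
    end
    s = load([base '.mat']);              % runs anywhere Matlab runs, on
    data = s.data;                        % any machines sharing the directory
end
```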

300x faster Matlab using MatlabMPI

Published in:
https://arxiv.org/abs/astro-ph/0207389

Summary

The true costs of high performance computing are currently dominated by software. Addressing these costs requires shifting to high productivity languages such as Matlab. MatlabMPI is a Matlab implementation of the Message Passing Interface (MPI) standard and allows any Matlab program to exploit multiple processors. MatlabMPI currently implements the basic six functions that are the core of the MPI point-to-point communications standard. The key technical innovation of MatlabMPI is that it implements the widely used MPI "look and feel" on top of standard Matlab file I/O, resulting in an extremely compact (~250 lines of code) and "pure" implementation which runs anywhere Matlab runs, and on any heterogeneous combination of computers. The performance has been tested on both shared and distributed memory parallel computers (e.g. Sun, SGI, HP, IBM and Linux). MatlabMPI can match the bandwidth of C based MPI at large message sizes. A test image filtering application using MatlabMPI achieved a speedup of ~300 using 304 CPUs and ~15% of the theoretical peak (450 Gigaflops) on an IBM SP2 at the Maui High Performance Computing Center. In addition, this entire parallel benchmark application was implemented in 70 software-lines-of-code (SLOC) yielding 0.85 Gigaflop/SLOC or 4.4 CPUs/SLOC, which are the highest values of these software price performance metrics ever achieved for any application. The MatlabMPI software will be made available for download.
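For readers checking the quoted price-performance metrics, the arithmetic follows directly from the numbers in the abstract:

```latex
% Arithmetic behind the quoted price-performance metrics:
\[
  \frac{304\ \text{CPUs}}{70\ \text{SLOC}} \approx 4.4\ \text{CPUs/SLOC},
  \qquad
  0.85\ \tfrac{\text{Gflops}}{\text{SLOC}} \times 70\ \text{SLOC}
  \approx 60\ \text{Gflops},
\]
% about 13% of the 450 Gflops theoretical peak, in line with the
% stated ~15%.
```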

Discrete optimization using decision-directed learning for distributed networked computing

Summary

Decision-directed learning (DDL) is an iterative discrete approach to finding a feasible solution for large-scale combinatorial optimization problems. DDL is capable of efficiently formulating a solution to network scheduling problems that involve limiting device load, selecting parallel configurations for software applications and host hardware using a minimum set of resources, and meeting time-to-result performance requirements in a dynamic network environment. This paper quantifies the algorithms that constitute DDL and compares its performance to other popular combinatorial optimization techniques for self-directed, real-time networked resource configuration, that is, for dynamically building a mission-specific signal processor for real-time distributed and parallel applications.
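Since the abstract does not spell out DDL's update rules, the sketch below shows only the flavor of the underlying scheduling problem: a greedy, load-limited task-to-host assignment. It is a generic heuristic, not the paper's DDL algorithm, and all loads and limits are invented for the example.

```matlab
% Generic greedy sketch of a load-limited task-to-host assignment
% (NOT the paper's DDL algorithm; all numbers are hypothetical).
taskLoad = [0.4 0.7 0.2 0.5 0.3];    % per-task utilization fractions
nHosts   = 3;
maxLoad  = 0.8;                      % load limit per device
hostLoad = zeros(1, nHosts);
assign   = zeros(1, numel(taskLoad));

for t = 1:numel(taskLoad)
    [m, h] = min(hostLoad);                 % try the least-loaded host first
    if m + taskLoad(t) <= maxLoad
        hostLoad(h) = hostLoad(h) + taskLoad(t);
        assign(t) = h;
    else
        error('No feasible host under the load limit; add resources.');
    end
end
disp(assign)                                % task -> host mapping, e.g. [1 2 3 3 1]
```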

Parallel programming with MatlabMPI

Published in:
https://arxiv.org/abs/astro-ph/0107406

Summary

MatlabMPI is a Matlab implementation of the Message Passing Interface (MPI) standard and allows any Matlab program to exploit multiple processors. MatlabMPI currently implements the basic six functions that are the core of the MPI point-to-point communications standard. The key technical innovation of MatlabMPI is that it implements the widely used MPI "look and feel" on top of standard Matlab file I/O, resulting in an extremely compact (~100 lines) and "pure" implementation which runs anywhere Matlab runs. The performance has been tested on both shared and distributed memory parallel computers. MatlabMPI can match the bandwidth of C based MPI at large message sizes. A test image filtering application using MatlabMPI achieved a speedup of ~70 on a parallel computer.

Cluster Computing for Embedded/Real-Time Systems

Published in:
Cluster Computing White Paper

Summary

Cluster computing is not a new area of computing. It is, however, evident that there is a growing interest in its usage in all areas where applications have traditionally used parallel or distributed computing platforms. The mounting interest has been fuelled in part by the availability of powerful microprocessors and high-speed networks as off-the-shelf commodity components, as well as in part by the rapidly maturing software components available to support high performance and high availability applications.

This rising interest in clusters led to the formation of an IEEE Computer Society Task Force on Cluster Computing (TFCC) in early 1999. An objective of the TFCC was to act both as a magnet and a focal point for all cluster computing related activities. As such, an early activity that was deemed necessary was to produce a White Paper on cluster computing and its related technologies.

Generally a White Paper is looked upon as a statement of policy on a particular subject. The aim of this White Paper is to provide a relatively unbiased report on the existing, new and emerging technologies, as well as the surrounding infrastructure deemed important to the cluster computing community. This White Paper is essentially a snapshot of cluster-related technologies and applications in the year 2000.

This White Paper provides an authoritative review of all the hardware and software technologies that can be used to make up a cluster now or in the near future. These technologies range from the network level, through the operating system and middleware levels, up to the application and tools level. The White Paper also tackles the increasingly important areas of High Availability and Embedded/Real-Time applications, which are both considered crucial areas for future clusters.

The White Paper has been broken down into twelve chapters, each of which has been put together by academics and industrial researchers who are both experts in their fields and were willing to volunteer their time and effort to put together this White Paper. On a personal note, I would like to thank all the contributing authors for finding the time to put the effort into their chapters and making the overall paper an excellent state-of-the-art review of clusters. In addition, I would like to thank the reviewers for their timely comments.