Publications


Parallel MATLAB for extreme virtual memory

Published in:
Proc. of the HPCMP Users Group Conf., 27-30 June 2005, pp. 381-387.

Summary

Many DoD applications have extreme memory requirements, often with data sets larger than the memory of a single computer. Such data sets can be addressed with out-of-core methods, which use memory as a "window" to view one section of the data stored on disk at a time. The Parallel Matlab for eXtreme Virtual Memory (pMatlab XVM) library adds out-of-core extensions to the Parallel Matlab (pMatlab) library. The DARPA High Productivity Computing Systems' HPC Challenge FFT benchmark has been implemented in C+MPI, pMatlab, pMatlab hand coded for out-of-core, and pMatlab XVM. We found that 1) the performance of the C+MPI and pMatlab versions was comparable; 2) the out-of-core versions deliver 80% of the performance of the in-core versions; 3) the out-of-core versions were able to perform a 1 TB (64 billion point) FFT; and 4) the pMatlab XVM program was smaller, easier to implement and verify, and more efficient than its hand-coded equivalent. We plan to apply pMatlab XVM to the full HPC Challenge benchmark suite. Using next-generation hardware, problem sizes a factor of 100 to 1000 larger should be feasible. We are also transitioning this technology to several DoD signal processing applications. Finally, the flexibility of pMatlab XVM allows hardware designers to experiment with FFT parameters in software before designing hardware for a real-time, ultra-long FFT.
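The "window" idea at the heart of out-of-core computing can be illustrated with ordinary Matlab file I/O. The sketch below is not the pMatlab XVM API (which the abstract does not detail); the file name, sizes, and the per-block operation are all assumptions chosen for illustration.

```matlab
% Minimal sketch of the out-of-core "window" idea: a data set too large
% for memory lives in a binary file on disk, and a fixed-size block of it
% is loaded, processed, and written back at a time. All names and sizes
% here are hypothetical.
fname     = 'big_vector.bin';   % hypothetical file holding N doubles
N         = 2^24;               % total points (2^36 scale for a 1 TB FFT)
blockSize = 2^20;               % the in-memory "window"

fid = fopen(fname, 'r+');
for k = 0:(N/blockSize - 1)
    offset = k * blockSize * 8;           % 8 bytes per double
    fseek(fid, offset, 'bof');
    x = fread(fid, blockSize, 'double');  % load one window into memory
    y = 2 * x;                            % stand-in for real work; a true
                                          % out-of-core FFT needs multiple
                                          % passes with on-disk transposes
    fseek(fid, offset, 'bof');
    fwrite(fid, y, 'double');             % write the window back
end
fclose(fid);
```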

Technology requirements for supporting on-demand interactive grid computing

Summary

It is increasingly being recognized that a large pool of High Performance Computing (HPC) users requires interactive, on-demand access to HPC resources. Providing these resources is a significant technical challenge that can be addressed from two directions. The first approach is to adapt existing batch-queue-based HPC systems to make them more interactive. The second approach is to start with existing interactive desktop environments (e.g., MATLAB) and design a system from the ground up that allows interactive parallel computing. The Lincoln Laboratory Grid (LLGrid) project has taken the latter approach. The LLGrid system has been operational for over a year with a few hundred processors and roughly 70 users, having run over 13,000 interactive jobs and consumed approximately 10,000 processor-days of computation. This paper compares the on-demand and interactive computing features of four prominent batch queuing systems: openPBS, Sun GridEngine, Condor, and LSF. It then briefly describes the LLGrid system and how interactive, on-demand computing was achieved on it by binding to a resource management system. Finally, usage characteristics of the LLGrid system are discussed.

High performance computing productivity model synthesis

Published in:
Int. J. High Perform. Comp. App., Vol. 18, No. 4, Winter 2004, pp. 505-516.

Summary

The Defense Advanced Research Projects Agency (DARPA) High Productivity Computing Systems (HPCS) program is developing systems that deliver increased value to users at a rate commensurate with the rate of improvement in the underlying technologies. For example, if the relevant technology were silicon, the goal of such a system would be to double in productivity (or value) every 18 months, following Moore's law. The key questions are how we define and measure productivity, and which underlying technologies affect it. The goal of this paper is to synthesize from several different productivity models a single model that captures the main features of all of them. In addition, we start the process of putting the model on an empirical foundation by incorporating selected results from the software engineering and high performance computing (HPC) communities. An asymptotic analysis of the model is conducted to check that it makes sense in certain special cases. The model is extrapolated to an HPC context and several examples are explored, including HPC centers, HPC users, and interactive grid computing. Finally, the model hints at a profoundly different way of viewing HPC systems, in which the user must be included in the equation and innovative hardware is a key aspect of lowering the very high costs of HPC software.
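As an illustration only (this formula does not appear in the paper), the stated goal of doubling productivity every 18 months can be written as a simple exponential growth law:

```latex
% Illustration of the stated goal, not a formula from the paper:
% productivity \Psi doubling every 18 months (1.5 years).
\[
  \Psi(t) = \Psi_0 \, 2^{\,t/1.5\,\mathrm{yr}}
\]
% e.g., after 3 years, \Psi = 4\,\Psi_0; after 6 years, \Psi = 16\,\Psi_0.
```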

HPC productivity: an overarching view

Published in:
Int. J. High Perform. Comp. App., Vol. 18, No. 4, Winter 2004, pp. 393-397.

Summary

The Defense Advanced Research Projects Agency (DARPA) High Productivity Computing Systems (HPCS) program is focused on providing a new generation of economically viable high productivity computing systems for national security and for the industrial user community. The value of a high performance computing (HPC) system to a user includes many factors, such as execution time on a particular problem, software development time, direct hardware costs, and indirect administrative and maintenance costs. This special issue, which focuses on HPC productivity, brings together, for the first time, a series of novel papers written by several distinguished authors who share their views on this topic. The topic of productivity in HPC is very new and the authors have been encouraged to speculate. The goal of this first paper is to present an overarching context and framework for the other papers and to define some common ideas that have emerged in considering the problem of HPC productivity. In addition, this paper defines several characteristic HPC workflows that are useful for understanding how users exploit HPC systems, and discusses the role of activity and purpose benchmarks in establishing an empirical basis for HPC productivity.

MatlabMPI

Published in:
Journal of Parallel and Distributed Computing, Vol. 64, No. 8, 2004, pp. 997-1005.

Summary

In many projects the true costs of high performance computing are currently dominated by software. Addressing these costs may require shifting to higher level languages such as Matlab. MatlabMPI is a Matlab implementation of the Message Passing Interface (MPI) standard and allows any Matlab program to exploit multiple processors. MatlabMPI currently implements the basic six functions that are the core of the MPI point-to-point communications standard. The key technical innovation of MatlabMPI is that it implements the widely used MPI "look and feel" on top of standard Matlab file I/O, resulting in an extremely compact (~350 lines of code) and "pure" implementation which runs anywhere Matlab runs, and on any heterogeneous combination of computers. The performance has been tested on both shared and distributed memory parallel computers (e.g. Sun, SGI, HP, IBM, Linux, MacOSX and Windows). MatlabMPI can match the bandwidth of C based MPI at large message sizes. A test image filtering application using MatlabMPI achieved a speedup of ~300 using 304 CPUs and ~15% of the theoretical peak (450 Gigaflops) on an IBM SP2 at the Maui High Performance Computing Center. In addition, this entire parallel benchmark application was implemented in 70 software-lines-of-code, illustrating the high productivity of this approach. MatlabMPI is available for download on the web.
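The abstract's key point, MPI semantics built on ordinary file I/O, can be sketched in a few lines of plain Matlab. This is not the actual MatlabMPI source; the function names, the shared directory, and the lock-file convention below are assumptions used only to show the mechanism.

```matlab
% Sketch of "MPI on file I/O": a send writes the message to a .mat file in
% a shared directory plus a lock file; a receive waits for the lock file
% and loads the data. Names are hypothetical; in practice each function
% would live in its own .m file.
function file_send(dest, tag, data)
    base = fullfile('/shared/comm', sprintf('msg_%d_%d', dest, tag));
    save([base '.mat'], 'data');          % payload via standard file I/O
    fclose(fopen([base '.lock'], 'w'));   % empty lock file signals arrival
end

function data = file_recv(dest, tag)
    base = fullfile('/shared/comm', sprintf('msg_%d_%d', dest, tag));
    while ~exist([base '.lock'], 'file')  % spin until the lock file appears
        pause(0.01);
    end
    s = load([base '.mat']);              % runs anywhere Matlab runs, on
    data = s.data;                        % any machines sharing the directory
end
```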

300x faster Matlab using MatlabMPI

Published in:
https://arxiv.org/abs/astro-ph/0207389

Summary

The true costs of high performance computing are currently dominated by software. Addressing these costs requires shifting to high productivity languages such as Matlab. MatlabMPI is a Matlab implementation of the Message Passing Interface (MPI) standard and allows any Matlab program to exploit multiple processors. MatlabMPI currently implements the basic six functions that are the core of the MPI point-to-point communications standard. The key technical innovation of MatlabMPI is that it implements the widely used MPI "look and feel" on top of standard Matlab file I/O, resulting in an extremely compact (~250 lines of code) and "pure" implementation which runs anywhere Matlab runs, and on any heterogeneous combination of computers. The performance has been tested on both shared and distributed memory parallel computers (e.g. Sun, SGI, HP, IBM and Linux). MatlabMPI can match the bandwidth of C based MPI at large message sizes. A test image filtering application using MatlabMPI achieved a speedup of ~300 using 304 CPUs and ~15% of the theoretical peak (450 Gigaflops) on an IBM SP2 at the Maui High Performance Computing Center. In addition, this entire parallel benchmark application was implemented in 70 software-lines-of-code (SLOC) yielding 0.85 Gigaflop/SLOC or 4.4 CPUs/SLOC, which are the highest values of these software price performance metrics ever achieved for any application. The MatlabMPI software will be made available for download.
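For readers checking the quoted price-performance metrics, the arithmetic follows directly from the numbers in the abstract:

```latex
% Arithmetic behind the quoted price-performance metrics:
\[
  \frac{304\ \text{CPUs}}{70\ \text{SLOC}} \approx 4.4\ \text{CPUs/SLOC},
  \qquad
  0.85\ \tfrac{\text{Gflops}}{\text{SLOC}} \times 70\ \text{SLOC}
  \approx 60\ \text{Gflops},
\]
% about 13% of the 450 Gflops theoretical peak, in line with the
% stated ~15%.
```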

Discrete optimization using decision-directed learning for distributed networked computing

Summary

Decision-directed learning (DDL) is an iterative discrete approach to finding a feasible solution for large-scale combinatorial optimization problems. DDL is capable of efficiently formulating a solution to network scheduling problems that involve limiting device load, selecting parallel configurations for software applications and host hardware using a minimum set of resources, and meeting time-to-result performance requirements in a dynamic network environment. This paper quantifies the algorithms that constitute DDL and compares its performance to other popular combinatorial optimization techniques for self-directed, real-time networked resource configuration, that is, for dynamically building a mission-specific signal processor for real-time distributed and parallel applications.
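Since the abstract does not spell out DDL's update rules, the sketch below shows only the flavor of the underlying scheduling problem: a greedy, load-limited task-to-host assignment. It is a generic heuristic, not the paper's DDL algorithm, and all loads and limits are invented for the example.

```matlab
% Generic greedy sketch of a load-limited task-to-host assignment
% (NOT the paper's DDL algorithm; all numbers are hypothetical).
taskLoad = [0.4 0.7 0.2 0.5 0.3];    % per-task utilization fractions
nHosts   = 3;
maxLoad  = 0.8;                      % load limit per device
hostLoad = zeros(1, nHosts);
assign   = zeros(1, numel(taskLoad));

for t = 1:numel(taskLoad)
    [m, h] = min(hostLoad);                 % try the least-loaded host first
    if m + taskLoad(t) <= maxLoad
        hostLoad(h) = hostLoad(h) + taskLoad(t);
        assign(t) = h;
    else
        error('No feasible host under the load limit; add resources.');
    end
end
disp(assign)                                % task -> host mapping, e.g. [1 2 3 3 1]
```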

Parallel programming with MatlabMPI

Published in:
https://arxiv.org/abs/astro-ph/0107406

Summary

MatlabMPI is a Matlab implementation of the Message Passing Interface (MPI) standard and allows any Matlab program to exploit multiple processors. MatlabMPI currently implements the basic six functions that are the core of the MPI point-to-point communications standard. The key technical innovation of MatlabMPI is that it implements the widely used MPI "look and feel" on top of standard Matlab file I/O, resulting in an extremely compact (~100 lines) and "pure" implementation which runs anywhere Matlab runs. The performance has been tested on both shared and distributed memory parallel computers. MatlabMPI can match the bandwidth of C based MPI at large message sizes. A test image filtering application using MatlabMPI achieved a speedup of ~70 on a parallel computer.

Cluster Computing for Embedded/Real-Time Systems

Published in:
Cluster Computing White Paper

Summary

Cluster computing is not a new area of computing. It is, however, evident that there is a growing interest in its usage in all areas where applications have traditionally used parallel or distributed computing platforms. The mounting interest has been fuelled in part by the availability of powerful microprocessors and high-speed networks as off-the-shelf commodity components, as well as in part by the rapidly maturing software components available to support high performance and high availability applications.

This rising interest in clusters led to the formation of an IEEE Computer Society Task Force on Cluster Computing (TFCC) in early 1999. An objective of the TFCC was to act both as a magnet and a focal point for all cluster computing related activities. As such, an early activity that was deemed necessary was to produce a White Paper on cluster computing and its related technologies.

Generally a White Paper is looked upon as a statement of policy on a particular subject. The aim of this White Paper is to provide a relatively unbiased report on the existing, new and emerging technologies, as well as the surrounding infrastructure deemed important to the cluster computing community. This White Paper is essentially a snapshot of cluster-related technologies and applications in the year 2000.

This White Paper provides an authoritative review of all the hardware and software technologies that can be used to make up a cluster now or in the near future. These technologies range from the network level, through the operating system and middleware levels, up to the application and tools level. The White Paper also tackles the increasingly important areas of High Availability and Embedded/Real-Time applications, which are both considered crucial areas for future clusters.

The White Paper has been broken down into twelve chapters, each of which has been put together by academics and industrial researchers who are both experts in their fields and were willing to volunteer their time and effort to put together this White Paper. On a personal note, I would like to thank all the contributing authors for finding the time to put the effort into their chapters and making the overall paper an excellent state-of-the-art review of clusters. In addition, I would like to thank the reviewers for their timely comments.