Publications

pMapper: automatic mapping of parallel Matlab programs

Published in:
Proc. of the HPCM (High Performance Computing Modernization) Users Group Conf., 2005, 27-30 June 2005, pp. 254-261.

Summary

Algorithm implementation efficiency is key to delivering high-performance computing capabilities to demanding, high-throughput DoD signal and image processing applications and simulations. Significant progress has been made in compiler optimization of serial programs, but many applications require parallel processing, which brings with it the difficult task of determining efficient mappings of algorithms to multiprocessor computers. The pMapper infrastructure addresses the problem of performance optimization of multistage MATLAB applications on parallel architectures. pMapper is an automatic performance tuning library written as a layer on top of pMatlab. pMatlab is a parallel Matlab toolbox that provides MATLAB users with global array semantics. While pMatlab abstracts the message-passing interface, the responsibility of generating maps for numerical arrays still falls on the user. A processor map for a numerical array is defined as an assignment of blocks of data to processing elements. Choosing the best mapping for a set of numerical arrays in a program is a nontrivial task that requires significant knowledge of programming languages, parallel computing, and processor architecture. pMapper automates the task of map generation, increasing the ease of programming and productivity. In addition to automating the mapping of parallel Matlab programs, pMapper could be used as a mapping tool for embedded systems. This paper addresses the design details of the pMapper infrastructure and presents preliminary results.

Automatic parallelization with pMapper

Published in:
2005 IEEE Int. Conf. on Cluster Computing, 27-30 September 2005, pp. 46-51.

Summary

Algorithm implementation efficiency is key to delivering high-performance computing capabilities to demanding, high-throughput signal and image processing applications and simulations. Significant progress has been made in optimization of serial programs, but many applications require parallel processing, which brings with it the difficult task of determining efficient mappings of algorithms. The pMapper infrastructure addresses the problem of performance optimization of multistage MATLAB applications on parallel architectures. pMapper is an automatic performance tuning library written as a layer on top of pMatlab, the Parallel Matlab toolbox. While pMatlab abstracts the message-passing interface, the responsibility of mapping numerical arrays falls on the user. Choosing the best mapping for a set of numerical arrays is a nontrivial task that requires significant knowledge of programming languages, parallel computing, and processor architecture. pMapper automates the task of map generation. This abstract addresses the design details of pMapper and presents preliminary results.

Polymorphous computing architecture (PCA) kernel-level benchmarks [revision 1]

Published in:
MIT Lincoln Laboratory Report PCA-KERNEL-1,REV.1

Summary

This document describes a series of kernel benchmarks for the PCA program. Each kernel benchmark is an operation of importance to DoD sensor applications making use of a PCA architecture. Many of these operations are a part of the composite example applications described elsewhere. The kernel-level benchmarks have been chosen to stress both computation and communication aspects of the architecture. "Computation" aspects include floating-point and integer performance, as well as the memory hierarchy, while the "communication" aspects include the network, the memory hierarchy, and the I/O capabilities. The particular benchmarks chosen are based on the frequency of their use in current and future applications. They are drawn from the areas of signal processing, communication, and information and knowledge processing. The specification of the benchmarks in this document is meant to be high-level and largely independent of the implementation.
