Publications

Graphs and matrices

Published in:
Graph Algorithms in the Language of Linear Algebra, pp. 3-12

Summary

A linear algebraic approach to graph algorithms that exploits the sparse adjacency matrix representation of graphs can provide a variety of benefits. These benefits include syntactic simplicity, easier implementation, and higher performance. Selected examples are presented illustrating these benefits. These examples are drawn from the remainder of the book in the areas of algorithms, data analysis, and computation.
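The book's own examples are not reproduced here, but the core idea can be shown in a few lines: one step of breadth-first search is a sparse matrix-vector product. The sketch below is illustrative Python (numpy/scipy are assumptions of this sketch, not tools named in the chapter).

    # Illustrative sketch, not code from the book: one BFS frontier
    # expansion expressed as a sparse matrix-vector product.
    import numpy as np
    from scipy.sparse import csr_matrix

    # Directed graph with edges 0->1, 0->2, 1->3; A[i, j] = 1 means i -> j.
    rows, cols = [0, 0, 1], [1, 2, 3]
    A = csr_matrix((np.ones(3), (rows, cols)), shape=(4, 4))

    x = np.zeros(4)
    x[0] = 1                       # frontier vector: start at vertex 0
    y = A.T @ x                    # neighbors reachable in one step
    print(np.nonzero(y)[0])        # -> [1 2]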

3-d graph processor

Summary

Graph algorithms are used for numerous database applications such as analysis of financial transactions, social networking patterns, and internet data. While graph algorithms can work well with moderate-size databases, processors often have difficulty providing sufficient throughput when the databases are large. This is because the processor architectures are poorly matched to the graph computational flow. For example, most modern processors utilize cache-based memory in order to take advantage of highly localized memory access patterns. However, memory access patterns associated with graph processing are often random in nature and can result in high cache miss rates. In addition, graph algorithms require significant overhead computation for dealing with indices of vertices and edges.
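The cache-miss argument can be made concrete with a toy measurement. The sketch below is an illustration in Python (not from the paper): gathering the same array in sequential versus random order shows the penalty that graph-like access patterns pay on cache-based processors.

    # Illustrative sketch, not from the paper: sequential vs. random gathers
    # over the same array; the random order defeats cache locality.
    import time
    import numpy as np

    n = 10_000_000
    data = np.ones(n)
    patterns = {"sequential": np.arange(n),
                "random": np.random.permutation(n)}

    for name, idx in patterns.items():
        t0 = time.perf_counter()
        total = data[idx].sum()     # gather with the given access pattern
        dt = time.perf_counter() - t0
        print(f"{name}: {dt:.3f} s (sum={total:.0f})")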

Multicore programming in pMatlab using distributed arrays

Published in:
CLADE '08: Proceedings of the 6th international workshop on Challenges of large applications in distributed environments

Summary

MATLAB is one of the most commonly used languages for scientific computing, with approximately one million users worldwide. Many of the programs written in MATLAB can benefit from the increased performance offered by multicore processors and parallel computing clusters. The Lincoln pMatlab library (http://www.ll.mit.edu/pMatlab) allows high performance parallel programs to be written quickly using the distributed arrays programming paradigm. This talk provides an introduction to distributed arrays programming and describes the best programming practices for using distributed arrays to produce programs that perform well on multicore processors and parallel computing clusters. These practices include understanding the concepts of parallel concurrency vs. parallel data locality.
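pMatlab's own syntax is MATLAB-based; as a language-neutral sketch of the distributed arrays idea (Python with mpi4py here, which is an assumption of this example, not the pMatlab API), each process owns one block of a global array, computes on its local piece, and combines results with one explicit reduction.

    # Sketch of the distributed arrays idea (Python + mpi4py assumed, not
    # the pMatlab API). Run with, e.g.: mpiexec -n 4 python sketch.py
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    n_global = 1_000_000
    n_local = n_global // size              # block distribution
    lo = rank * n_local                     # start of this process's slice
    local = np.arange(lo, lo + n_local, dtype=float)

    local_sum = local.sum()                 # compute on local data only
    total = comm.allreduce(local_sum, op=MPI.SUM)   # one global reduction
    if rank == 0:
        print("global sum:", total)

Keeping the computation on local data and communicating only for the final reduction is exactly the concurrency-versus-locality trade-off the talk highlights.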

Analytic theory of power law graphs

Published in:
SIAM Conference on Parallel Processing for Scientific Computing

Summary

An analytical theory of power law graphs is presented based on the Kronecker graph generation technique. The analysis uses Kronecker exponentials of complete bipartite graphs to formulate the sub-structure of such graphs. This allows various high-level quantities (e.g., degree distribution, betweenness centrality, diameter, eigenvalues, and isoparametric ratio) to be computed directly from the model parameters. The implications of this work on “clustering” and “dendrogram” heuristics are also discussed.
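A minimal sketch of the generation technique (illustrative Python, not the paper's code): taking repeated Kronecker products of a small complete bipartite seed graph yields a graph whose degree counts are heavy-tailed, and quantities like the degree distribution follow from the seed alone.

    # Illustrative sketch, not the paper's code: Kronecker exponential of a
    # complete bipartite seed graph (the star K_{1,2}).
    import numpy as np

    G = np.array([[0, 1, 1],
                  [1, 0, 0],
                  [1, 0, 0]])     # adjacency matrix of K_{1,2}

    A = G.copy()
    for _ in range(3):            # A = G kron G kron G kron G
        A = np.kron(A, G)

    degrees = A.sum(axis=1)       # vertex degrees of the generated graph
    vals, counts = np.unique(degrees, return_counts=True)
    print(dict(zip(vals.tolist(), counts.tolist())))   # heavy-tailed counts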

Performance metrics and software architecture

Published in:
High Performance Embedded Computing Handbook, Chapter 15

Summary

This chapter presents high performance embedded computing (HPEC) software architectures and evaluation metrics. A canonical HPEC application is used to illustrate basic concepts. The chapter reviews the different types of parallelism and discusses performance analysis techniques. It presents a typical programmable multicomputer and explores the performance trade-offs of different parallel mappings on this computer using key system performance metrics. HPEC systems are amongst the most challenging systems in the world to build. Synthetic Aperture Radar (SAR) is one of the most common modes in a radar system and one of the most computationally stressing to implement. Often the first step in the development of a system is to produce a rough estimate of how many processors will be needed. The parallel opportunities at each stage of the calculation, discussed earlier in the chapter, show that there are many different ways to exploit parallelism in this application. The chapter concludes with a discussion of the impact of different software implementation approaches.
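That first rough estimate is simple throughput arithmetic; the numbers in the sketch below are invented for illustration and are not taken from the chapter.

    # Illustrative sketch with invented numbers (not from the chapter):
    # first-cut processor count from required throughput and efficiency.
    required_gflops = 200.0      # operations/s the application must sustain
    peak_per_proc = 10.0         # peak GFLOPS of one processor
    efficiency = 0.25            # fraction of peak typically sustained

    sustained = peak_per_proc * efficiency
    n_procs = -(-required_gflops // sustained)    # ceiling division
    print(f"need ~{int(n_procs)} processors")     # -> need ~80 processors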

Radar Signal Processing: An Example of High Performance Embedded Computing

Published in:
High Performance Embedded Computing Handbook, Chapter 6

Summary

This chapter focuses on the computational complexity of the front-end of the surface moving-target indication (SMTI) radar application. SMTI radars can require over one trillion operations per second of computation for wideband systems. The adaptive beamforming performed in SMTI radars is one of the major computational complexity drivers. The goal of the SMTI radar is to process the received signals to detect targets while rejecting clutter returns and noise. The radar must also mitigate interference from unintentional sources such as RF systems transmitting in the same band and from jammers that may be intentionally trying to mask targets. The pulse compression stage filters the data to concentrate the signal energy of a relatively long transmitted radar pulse into a short pulse response. The relative range rate between the radar and the ground along the line of sight of the sidelobe may be the same as the range rate of the target detected in the mainbeam.
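As an illustration of the pulse compression stage described above (a generic matched-filter sketch in Python, not the chapter's implementation), correlating the received signal with the conjugated, time-reversed transmitted chirp concentrates the pulse energy into a short response.

    # Generic matched-filter sketch, not the chapter's code: pulse
    # compression of a linear FM chirp.
    import numpy as np

    fs, T, B = 1e6, 1e-3, 1e5        # sample rate, pulse length, bandwidth
    t = np.arange(0, T, 1 / fs)
    chirp = np.exp(1j * np.pi * (B / T) * t**2)   # transmitted LFM pulse

    rx = np.zeros(4096, dtype=complex)            # received window
    rx[1200:1200 + chirp.size] += chirp           # echo at some delay
    rx += 0.1 * (np.random.randn(4096) + 1j * np.random.randn(4096))

    matched = np.conj(chirp[::-1])                # matched-filter taps
    compressed = np.convolve(rx, matched, mode="same")
    print("peak at sample:", int(np.argmax(np.abs(compressed))))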

Parallel and Distributed Processing

Published in:
High Performance Embedded Computing Handbook, Chapter 18

Summary

This chapter discusses parallel and distributed programming technologies for high performance embedded systems. Computational or memory constraints can be overcome with parallel processing. The primary goal of parallel processing is to improve performance by distributing computation across multiple processors or increasing dataset sizes by distributing data across multiple processors’ memory. The typical programmer has little to no experience writing programs that run on multiple processors. The transition from serial to parallel programming requires significant changes in the programmer’s way of thinking. For example, the programmer must worry about how to distribute data and computation across multiple processors to maximize performance and how to synchronize and communicate between processors. Although most programmers will likely admit to having no experience with parallel programming, many have in fact had exposure to a rudimentary type of parallel programming in the form of threads. A typical threaded program starts execution as a single thread.
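A minimal sketch of that threaded pattern (generic Python, not from the chapter): the program begins as a single main thread, forks workers, and must synchronize with them before using their results.

    # Generic sketch, not from the chapter: execution starts as one main
    # thread, which spawns workers and synchronizes with join().
    import threading

    results = [0] * 4

    def worker(i):
        results[i] = sum(range(i * 1000, (i + 1) * 1000))   # local work

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
    for th in threads:
        th.start()               # fork: workers now run concurrently
    for th in threads:
        th.join()                # synchronize: wait for all workers
    print(sum(results))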

High productivity computing and usable petascale systems

Published in:
SC '06: Proceedings of the 2006 ACM/IEEE conference on Supercomputing

Summary

High Performance Computing has seen extraordinary growth in peak performance, which has been accompanied by a significant increase in the difficulty of using these systems. High Productivity Computing Systems (HPCS) seek to address this gap by producing petascale computers that are usable by a broader range of scientists and engineers. One of the most important HPCS innovations is the concept of a flatter memory hierarchy, which means that data from remote processors can be retrieved and used very efficiently. A flatter memory hierarchy increases performance and is easier to program.

Application of a Relative Development Time Productivity Metric to Parallel Software Development

Published in:
SE-HPCS '05: Proceedings of the second international workshop on Software engineering for high performance computing system applications

Summary

Evaluation of High Performance Computing (HPC) systems should take into account software development time productivity in addition to hardware performance, cost, and other factors. We propose a new metric for HPC software development time productivity, defined as the ratio of relative runtime performance to relative programmer effort. This formula has been used to analyze several HPC benchmark codes and classroom programming assignments. The results of this analysis show consistent trends for various programming models. This method enables a high-level evaluation of development time productivity for a given code implementation, which is essential to the task of estimating cost associated with HPC software development.
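The metric itself is one division; the worked example below uses invented numbers, with lines of code as an effort proxy (that proxy is an assumption of this illustration, not part of the paper's definition).

    # Illustrative sketch with invented numbers: relative development time
    # productivity = relative runtime performance / relative programmer effort.
    serial_runtime, parallel_runtime = 1000.0, 50.0   # seconds
    serial_sloc, parallel_sloc = 400, 600             # effort proxy: code size

    speedup = serial_runtime / parallel_runtime       # relative performance
    relative_effort = parallel_sloc / serial_sloc     # relative effort
    print(f"productivity: {speedup / relative_effort:.1f}")   # -> 13.3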

Next-generation technologies to enable sensor networks

Published in:
Handbook of Sensor Networks, Chapter 2

Summary

Examples include advances in ground moving target indicator (GMTI) processing, space-time adaptive processing (STAP), target discrimination, and electronic counter-countermeasures (ECCM). All these advances have improved the capabilities of radar sensors. Major improvements expected in the next several years will come from exploiting collaborative network-centric architectures to leverage synergies among individual sensors. Such an approach has become feasible as a result of major advances in network computing, as well as communication technologies in both wireless and fiber networks. The exponential growth of digital technology, together with highly capable networks, enables in-depth exploitation of sensor synergy, including multi-aspect sensing. New signal processing algorithms exploiting multi-sensor data have been demonstrated in non-real-time, achieving improved performance against surface mobile targets by leveraging high-speed sensor networks. The paper demonstrates a significant advancement in exploiting complex ground moving target indicator (GMTI) and synthetic aperture radar (SAR) data to accurately geo-locate and identify mobile targets.