Publications

Dynamic Distributed Dimensional Data Model (D4M) database and computation system

Summary

A crucial element of large web companies is their ability to collect and analyze massive amounts of data. Tuple store databases are a key enabling technology employed by many of these companies (e.g., Google Bigtable and Amazon Dynamo). Tuple stores are highly scalable and run on commodity clusters, but lack interfaces to support efficient development of mathematically based analytics. D4M (Dynamic Distributed Dimensional Data Model) has been developed to provide a mathematically rich interface to tuple stores and to structured query language (SQL) databases. D4M allows linear algebra to be readily applied to databases. Using D4M, it is possible to create composable analytics with significantly less effort than using traditional approaches. This work describes the D4M technology and its application and performance.
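
To make the associative-array idea concrete, here is a minimal Python sketch of a sparse array keyed by string row/column labels, where a co-occurrence query becomes one line of linear algebra. The class and names are illustrative only; they are not part of the D4M API, which is a MATLAB toolbox.

```python
# Minimal sketch of the associative-array idea behind D4M (illustrative only;
# D4M itself is a MATLAB toolbox with a much richer algebra).
from collections import defaultdict

class AssocArray:
    """Sparse 2-D array keyed by string row/column labels, stored as tuples."""
    def __init__(self, triples):
        self.data = {(r, c): v for r, c, v in triples}

    def transpose(self):
        return AssocArray([(c, r, v) for (r, c), v in self.data.items()])

    def matmul(self, other):
        """Linear-algebra composition: (A*B)[i,k] = sum_j A[i,j] * B[j,k]."""
        out = defaultdict(int)
        for (i, j), v in self.data.items():
            for (j2, k), w in other.data.items():
                if j == j2:
                    out[(i, k)] += v * w
        return AssocArray([(i, k, v) for (i, k), v in out.items()])

# Entity-topic style tuples: who mentioned which topic.
A = AssocArray([("alice", "graphs", 1), ("bob", "graphs", 1), ("bob", "fft", 1)])
# A * A' gives co-occurrence counts (entities sharing topics) in one expression.
print(A.matmul(A.transpose()).data)
```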

Fundamental Questions in the Analysis of Large Graphs

Summary

Graphs are a general approach for representing information that spans the widest possible range of computing applications. They are particularly important to computational biology, web search, and knowledge discovery. As the sizes of graphs increase, the need to apply advanced mathematical and computational techniques to solve these problems is growing dramatically. Examining the mathematical and computational foundations of the analysis of large graphs generally leads to more questions than answers. This book concludes with a discussion of some of these questions.
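
One concrete example of applying mathematical techniques to large graphs is to store the graph as a sparse adjacency matrix, so that a traversal step becomes a sparse matrix-vector product. The sketch below (assuming numpy and scipy are available) is purely illustrative and is not drawn from the book.

```python
# Sketch: one step of breadth-first search expressed as linear algebra
# (adjacency matrix transposed, times frontier vector).
import numpy as np
from scipy.sparse import csr_matrix

# Directed edges 0->1, 0->2, 2->3 as a sparse adjacency matrix A.
rows, cols = [0, 0, 2], [1, 2, 3]
A = csr_matrix((np.ones(3), (rows, cols)), shape=(4, 4))

frontier = np.zeros(4)
frontier[0] = 1.0                      # start the search at vertex 0
next_frontier = A.T @ frontier         # vertices reachable in one hop
print(np.nonzero(next_frontier)[0])    # -> [1 2]
```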

3-D graph processor

Summary

Graph algorithms are used for numerous database applications such as analysis of financial transactions, social networking patterns, and internet data. While graph algorithms can work well with moderate-size databases, processors often have difficulty providing sufficient throughput when the databases are large. This is because the processor architectures are poorly matched to the graph computational flow. For example, most modern processors utilize cache-based memory in order to take advantage of highly localized memory access patterns. However, memory access patterns associated with graph processing are often random in nature and can result in high cache miss rates. In addition, graph algorithms require significant overhead computation for dealing with indices of vertices and edges.
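
The cache behavior described above is visible even in a tiny sketch: in a compressed-sparse-row (CSR) graph, gathering neighbor values goes through an index array, so consecutive iterations touch scattered memory locations. The code below illustrates that access pattern only; it says nothing about the 3-D processor's actual design.

```python
# Sketch of the irregular memory access pattern in graph processing.
# A CSR graph stores neighbors via two index arrays; summing a value over
# each vertex's neighbors dereferences essentially random addresses.
import numpy as np

rng = np.random.default_rng(0)
n_vertices, n_edges = 1_000, 10_000
# indptr[v]:indptr[v+1] delimits vertex v's slice of the neighbor array.
neighbors = rng.integers(0, n_vertices, size=n_edges)
counts = np.bincount(rng.integers(0, n_vertices, size=n_edges),
                     minlength=n_vertices)
indptr = np.concatenate(([0], np.cumsum(counts)))

values = rng.random(n_vertices)
acc = np.zeros(n_vertices)
for v in range(n_vertices):
    # values[neighbors[...]] is a gather through an index array: each access
    # lands on an unpredictable cache line, which is what defeats caches.
    acc[v] = values[neighbors[indptr[v]:indptr[v + 1]]].sum()
```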

High-productivity software development with pMATLAB

Published in:
Comput. Sci. Eng., Vol. 11, No. 1, January/February 2009, pp. 75-79.

Summary

In this paper, we explore the ease of tackling a communication-intensive parallel computing task, namely the 2D fast Fourier transform (FFT). We start with a simple serial MATLAB code, explore in detail a 1D parallel FFT, and illustrate how it can be extended to multidimensional FFTs.
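
The decomposition behind a parallel 2D FFT is standard: 1D FFTs along rows, a corner turn (transpose, which is where the interprocessor communication happens), then 1D FFTs along the former columns. Below is a serial numpy sketch of that structure, with the transpose standing in for the communication step; it assumes nothing about pMATLAB's actual API.

```python
# Serial sketch of the row / corner-turn / column structure of a parallel 2D FFT.
# In a distributed version the transpose is where all the communication occurs;
# here it is just an in-memory transpose.
import numpy as np

x = np.random.rand(8, 8)

step1 = np.fft.fft(x, axis=1)      # 1D FFTs along rows (each row independent)
step2 = step1.T                    # corner turn: redistributes data across processors
step3 = np.fft.fft(step2, axis=1)  # 1D FFTs along the (former) columns

assert np.allclose(step3.T, np.fft.fft2(x))  # matches the direct 2D FFT
```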

Radar Signal Processing: An Example of High Performance Embedded Computing

Published in:
High Performance Embedded Computing Handbook, Chapter 6

Summary

This chapter focuses on the computational complexity of the front-end of the surface moving-target indication (SMTI) radar application. SMTI radars can require over one trillion operations per second of computation for wideband systems. The adaptive beamforming performed in SMTI radars is one of the major computational complexity drivers. The goal of the SMTI radar is to process the received signals to detect targets while rejecting clutter returns and noise. The radar must also mitigate interference from unintentional sources such as RF systems transmitting in the same band and from jammers that may be intentionally trying to mask targets. The pulse compression stage filters the data to concentrate the signal energy of a relatively long transmitted radar pulse into a short pulse response. The relative range rate between the radar and the ground along the line of sight of the sidelobe may be the same as the range rate of the target detected in the mainbeam.
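
Pulse compression, mentioned above, is a matched filter: correlating the received signal with the transmitted waveform concentrates the long pulse's energy into a narrow peak. The numpy illustration below uses a linear-FM (chirp) pulse and invented parameters; it is a generic demonstration, not an example from the chapter.

```python
# Sketch of pulse compression: matched-filter a long chirp into a sharp peak.
import numpy as np

n = 256
t = np.arange(n) / n
chirp = np.exp(1j * np.pi * 200 * t**2)        # transmitted linear-FM pulse

rx = np.zeros(1024, dtype=complex)
rx[300:300 + n] = 0.5 * chirp                  # echo from a target at delay 300
rx += 0.05 * (np.random.randn(1024) + 1j * np.random.randn(1024))  # receiver noise

# Matched filter = correlation with the conjugated, time-reversed pulse.
compressed = np.convolve(rx, np.conj(chirp[::-1]), mode="valid")
print(int(np.argmax(np.abs(compressed))))      # ~300: energy concentrated at the delay
```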

pMapper: automatic mapping of parallel Matlab programs

Published in:
Proc. of the HPCM (High Performance Computing Modernization) Users Group Conf., 27-30 June 2005, pp. 254-261.

Summary

Algorithm implementation efficiency is key to delivering high-performance computing capabilities to demanding, high-throughput DoD signal and image processing applications and simulations. Significant progress has been made in compiler optimization of serial programs, but many applications require parallel processing, which brings with it the difficult task of determining efficient mappings of algorithms to multiprocessor computers. The pMapper infrastructure addresses the problem of performance optimization of multistage MATLAB applications on parallel architectures. pMapper is an automatic performance tuning library written as a layer on top of pMatlab, a parallel MATLAB toolbox that provides MATLAB users with global array semantics. While pMatlab abstracts the message-passing interface, the responsibility of generating maps for numerical arrays still falls on the user. A processor map for a numerical array is defined as an assignment of blocks of data to processing elements. Choosing the best mapping for a set of numerical arrays in a program is a nontrivial task that requires significant knowledge of programming languages, parallel computing, and processor architecture. pMapper automates the task of map generation, increasing the ease of programming and productivity. In addition to automating the mapping of parallel MATLAB programs, pMapper could be used as a mapping tool for embedded systems. This paper addresses the design details of the pMapper infrastructure and presents preliminary results.
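
The notion of a processor map, defined above as an assignment of blocks of data to processing elements, can be sketched in a few lines. The class below is a hypothetical stand-in written for illustration; pMatlab's real map construct has a different interface.

```python
# Hypothetical sketch of a "processor map": which block of a distributed
# array lives on which processing element (pMatlab's real map API differs).
from dataclasses import dataclass

@dataclass
class Map:
    grid: tuple          # processor grid, e.g. (2, 2) = 4 processors
    dist: tuple          # 'b' = block distribution along that dimension

    def owner(self, i, j, shape):
        """Return the rank owning element (i, j) of an array of this shape."""
        rows_per = -(-shape[0] // self.grid[0])   # ceiling division
        cols_per = -(-shape[1] // self.grid[1])
        return (i // rows_per) * self.grid[1] + (j // cols_per)

m = Map(grid=(2, 2), dist=("b", "b"))
# Element (0,0) of a 100x100 array lives on rank 0; element (99,99) on rank 3.
print(m.owner(0, 0, (100, 100)), m.owner(99, 99, (100, 100)))
```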

Automatic parallelization with pMapper

Published in:
2005 IEEE Int. Conf. on Cluster Computing, 27-30 September 2005, pp. 46-51.

Summary

Algorithm implementation efficiency is key to delivering high-performance computing capabilities to demanding, high-throughput signal and image processing applications and simulations. Significant progress has been made in optimization of serial programs, but many applications require parallel processing, which brings with it the difficult task of determining efficient mappings of algorithms. The pMapper infrastructure addresses the problem of performance optimization of multistage MATLAB applications on parallel architectures. pMapper is an automatic performance tuning library written as a layer on top of pMatlab, a parallel MATLAB toolbox. While pMatlab abstracts the message-passing interface, the responsibility of mapping numerical arrays falls on the user. Choosing the best mapping for a set of numerical arrays is a nontrivial task that requires significant knowledge of programming languages, parallel computing, and processor architecture. pMapper automates the task of map generation. This paper addresses the design details of pMapper and presents preliminary results.
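
What automating map generation means in practice can be sketched as a search over candidate maps scored by a cost model. The cost model and candidates below are invented for illustration; they are not pMapper's actual search or timing models.

```python
# Toy illustration of automatic map selection: score each candidate processor
# grid with a crude cost model and keep the cheapest. Everything here is
# invented; pMapper's real approach is far more sophisticated.
def cost(grid, shape, flops_per_elem=10.0, net_penalty=1e-4):
    p_rows, p_cols = grid
    procs = p_rows * p_cols
    compute = shape[0] * shape[1] * flops_per_elem / procs   # ideal-speedup term
    # crude communication term: boundary elements exchanged between neighbors
    comm = net_penalty * (shape[0] * (p_cols - 1) + shape[1] * (p_rows - 1))
    return compute + comm

candidates = [(1, 8), (2, 4), (4, 2), (8, 1)]   # ways to arrange 8 processors
shape = (4096, 128)                              # tall, skinny array
best = min(candidates, key=lambda g: cost(g, shape))
print(best)   # the model favors splitting the long dimension: (8, 1)
```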

Discrete optimization using decision-directed learning for distributed networked computing

Summary

Decision-directed learning (DDL) is an iterative discrete approach to finding a feasible solution for large-scale combinatorial optimization problems. DDL is capable of efficiently formulating a solution to network scheduling problems that involve load-limiting device utilization, selecting parallel configurations for software applications and host hardware using a minimum set of resources, and meeting time-to-result performance requirements in a dynamic network environment. This paper quantifies the algorithms that constitute DDL and compares its performance with other popular combinatorial optimization approaches to self-directed, real-time networked resource configuration, in which a mission-specific signal processor is built dynamically for real-time distributed and parallel applications.
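
The abstract gives no pseudocode for DDL, but the flavor of iterative assignment under a utilization cap can be conveyed with a generic greedy scheduler. The sketch below is a common heuristic of the editor's choosing, not the DDL algorithm.

```python
# Generic iterative greedy scheduler in the spirit of (but not identical to)
# decision-directed resource configuration: place each task on the host that
# keeps the maximum load lowest, subject to a per-host utilization cap.
import heapq

def schedule(task_costs, n_hosts, cap):
    """Greedily place tasks largest-first; return hosts in that order,
    or None if the utilization cap makes the placement infeasible."""
    heap = [(0.0, h) for h in range(n_hosts)]      # (current load, host id)
    heapq.heapify(heap)
    assignment = []
    for cost in sorted(task_costs, reverse=True):  # largest tasks first
        load, host = heapq.heappop(heap)
        if load + cost > cap:
            return None                            # utilization limit exceeded
        heapq.heappush(heap, (load + cost, host))
        assignment.append(host)
    return assignment

print(schedule([5, 3, 3, 2, 1], n_hosts=2, cap=8))   # e.g. [0, 1, 1, 0, 1]
```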
