Publications

pMapper: automatic mapping of parallel Matlab programs

Published in:
Proc. of the HPCM (High Performance Computing Modernization), Users Group Conf., 2005, 27-30 June 2005, pp. 254-261.

Summary

Algorithm implementation efficiency is key to delivering high-performance computing capabilities to demanding, high throughput DoD signal and image processing applications and simulations. Significant progress has been made in compiler optimization of serial programs, but many applications require parallel processing, which brings with it the difficult task of determining efficient mappings of algorithms to multiprocessor computers. The pMapper infrastructure addresses the problem of performance optimization of multistage MATLAB applications on parallel architectures. pMapper is an automatic performance tuning library written as a layer on top of pMatlab. pMatlab is a parallel Matlab toolbox that provides MATLAB users with global array semantics. While pMatlab abstracts the message-passing interface, the responsibility of generating maps for numerical arrays still falls on the user. A processor map for a numerical array is defined as an assignment of blocks of data to processing elements. Choosing the best mapping for a set of numerical arrays in a program is a nontrivial task that requires significant knowledge of programming languages, parallel computing, and processor architecture. pMapper automates the task of map generation, increasing the ease of programming and productivity. In addition to automating the mapping of parallel Matlab programs, pMapper could be used as a mapping tool for embedded systems. This paper addresses the design details of the pMapper infrastructure and presents preliminary results.
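As an illustration of the definition above (a processor map assigns blocks of data to processing elements), here is a minimal Python sketch of a one-dimensional block map. It is not pMapper or pMatlab code; the function name `block_map` and its parameters are hypothetical, and the even-split rule is just one simple choice among the many mappings a tool like pMapper would have to weigh.

```python
def block_map(n_rows, n_procs):
    """Assign contiguous row blocks to processing elements.

    Hypothetical helper for illustration only: returns a dict that maps
    each processing element id to the range of row indices it owns.
    """
    base, extra = divmod(n_rows, n_procs)
    mapping = {}
    start = 0
    for p in range(n_procs):
        # The first `extra` elements take one additional row each,
        # so block sizes differ by at most one.
        size = base + (1 if p < extra else 0)
        mapping[p] = range(start, start + size)
        start += size
    return mapping


# Map a 10-row array onto 4 processing elements.
row_map = block_map(10, 4)
for proc, rows in row_map.items():
    print(proc, list(rows))
```

Even in this toy case the map is a free choice (block, cyclic, by rows, by columns), which is the decision the abstract says pMapper automates.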

A taxonomy of buffer overflows for evaluating static and dynamic software testing tools

Published in:
NIST Workshop on Software Security, Assurance Tools, Techniques, and Metrics, 7-8 November 2005.

Summary

A taxonomy that uses twenty-two attributes to characterize C-program overflows was used to construct 291 small C-program test cases that can be used to diagnostically determine the basic capabilities of static and dynamic analysis buffer overflow detection tools. Attributes in the taxonomy include the buffer location (e.g. stack, heap, data region, BSS, shared memory); scope difference between buffer allocation and access; index, pointer, and alias complexity when addressing buffer elements; complexity of the control flow and loop structure surrounding the overflow; type of container the buffer is within (e.g. structure, union, array); whether the overflow is caused by a signed/unsigned type error; the overflow magnitude and direction; and whether the overflow is discrete or continuous. As an example, the 291 test cases were used to measure the detection, false alarm, and confusion rates of five static analysis tools. They reveal specific strengths and limitations of tools and suggest directions for improvements.

The MIT-LL/AFRL MT System

Published in:
Int. Workshop on Spoken Language Translation, IWSLT, 24-25 October 2005.

Summary

The MITLL/AFRL MT system is a statistical phrase-based translation system that implements many modern SMT training and decoding techniques. Our system was designed with the long term goal of dealing with corrupted ASR input for Speech-to-Speech MT applications. This paper will discuss the architecture of the MITLL/AFRL MT system, and experiments with manual and ASR transcription data that were run as part of the IWSLT-2005 Chinese-to-English evaluation campaign.

Synthesis, analysis, and pitch modification of the breathy vowel

Published in:
2005 Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 16-19 October 2005, pp. 199-202.

Summary

Breathiness is an aspect of voice quality that is difficult to analyze and synthesize, especially since its periodic and noise components are typically overlapping in frequency. The decomposition and manipulation of these two components is of importance in a variety of speech application areas such as text-to-speech synthesis, speech encoding, and clinical assessment of disordered voices. This paper first investigates the perceptual relevance of a speech production model that assumes the speech noise component is modulated by the glottal airflow waveform. After verifying the importance of noise modulation in breathy vowels, we use the modulation model to address the particular problem of pitch modification of this signal class. Using a decomposition method referred to as pitch-scaled harmonic filtering to extract the additive noise component, we introduce a pitch modification algorithm that explicitly modifies the modulation characteristic of this noise component. The approach applies envelope shaping to the noise source that is derived from the inverse-filtered noise component. Modification examples using synthetic and real breathy vowels indicate promising performance with spectrally-overlapping periodic and noise components.

Evaluating and strengthening enterprise network security using attack graphs

Summary

Assessing the security of large enterprise networks is complex and labor intensive. Current security analysis tools typically examine only individual firewalls, routers, or hosts separately and do not comprehensively analyze overall network security. We present a new approach that uses configuration information on firewalls and vulnerability information on all network devices to build attack graphs that show how far inside and outside attackers can progress through a network by successively compromising exposed and vulnerable hosts. In addition, attack graphs are automatically analyzed to produce a small set of prioritized recommendations to enhance network security. Field trials on networks with up to 3,400 hosts demonstrate the ability to accurately identify a small number of critical stepping-stone hosts that need to be patched to protect against external attackers. Simulation studies on complex networks with more than 40,000 hosts demonstrate good scaling. This analysis can be used for many purposes, including identifying critical stepping-stone hosts to patch or protect with a firewall, comparing the security of alternative network designs, determining the security risk caused by proposed changes in firewall rules or new vulnerabilities, and identifying the most critical hosts to patch when a new vulnerability is announced. Unique aspects of this work are new attack graph generation algorithms that scale to enterprise networks with thousands of hosts, efficient approaches to determine what other hosts and ports in large networks are reachable from each individual host, automatic data importation from network vulnerability scanners and firewalls, and automatic attack graph analyses to generate recommendations.
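The idea of an attacker progressing by successively compromising exposed and vulnerable hosts can be sketched as a breadth-first search over reachability data. The Python below is only a toy under stated assumptions: the host names, the `reachable` and `vulnerable` inputs, and the function `compromised_hosts` are all hypothetical, and this is not the graph generation algorithm from the paper, which the abstract does not describe in detail.

```python
from collections import deque


def compromised_hosts(reachable, vulnerable, start):
    """Return the hosts an attacker starting at `start` can compromise.

    reachable:  dict mapping a host to the hosts it can connect to
                (assumed to come from firewall rules).
    vulnerable: set of hosts with an exploitable vulnerability
                (assumed to come from a vulnerability scan).
    """
    seen = {start}
    queue = deque([start])
    while queue:
        host = queue.popleft()
        for nxt in reachable.get(host, ()):
            # A host falls only if it is both reachable from an already
            # compromised host and vulnerable.
            if nxt in vulnerable and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


# Hypothetical four-host network behind a firewall.
reachable = {
    "internet": ["web", "mail"],
    "web": ["db"],
    "mail": ["db", "hr"],
    "db": ["hr"],
}
vulnerable = {"web", "db"}

print(sorted(compromised_hosts(reachable, vulnerable, "internet")))
```

In this toy network, `web` is the stepping stone: with `web` removed from `vulnerable`, the attacker can no longer reach `db`.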

Automatic parallelization with pMapper

Published in:
2005 IEEE Int. Conf. on Cluster Computing, 27-30 September 2005, pp. 46-51.

Summary

Algorithm implementation efficiency is key to delivering high-performance computing capabilities to demanding, high throughput signal and image processing applications and simulations. Significant progress has been made in optimization of serial programs, but many applications require parallel processing, which brings with it the difficult task of determining efficient mappings of algorithms. The pMapper infrastructure addresses the problem of performance optimization of multistage MATLAB applications on parallel architectures. pMapper is an automatic performance tuning library written as a layer on top of pMatlab: Parallel Matlab toolbox. While pMatlab abstracts the message-passing interface, the responsibility of mapping numerical arrays falls on the user. Choosing the best mapping for a set of numerical arrays is a nontrivial task that requires significant knowledge of programming languages, parallel computing, and processor architecture. pMapper automates the task of map generation. This abstract addresses the design details of pMapper and presents preliminary results.

Parallel out-of-core Matlab for extreme virtual memory (Abstract)

Published in:
2005 IEEE Int. Conf. on Cluster Computing, 27-30 September 2005, p. 482 [abstract only].

Summary

Large data sets that cannot fit in memory can be addressed with out-of-core methods, which use memory as a "window" to view a section of the data stored on disk at a time. The Parallel Matlab for eXtreme Virtual Memory (pMatlab XVM) library adds out-of-core extensions to the Parallel Matlab (pMatlab) library. We have applied pMatlab XVM to the DARPA High Productivity Computing Systems' HPCchallenge FFT benchmark. The benchmark was run using several different implementations: C+MPI, pMatlab, pMatlab hand-coded for out-of-core, and pMatlab XVM. These experiments found that 1) the performance of the C+MPI and pMatlab versions was comparable; 2) the out-of-core versions deliver 80% of the performance of the in-core versions; 3) the out-of-core versions were able to perform a 1 terabyte (64 billion point) FFT; and 4) the pMatlab XVM program was smaller, easier to implement and verify, and more efficient than its hand-coded equivalent. We are transitioning this technology to several DoD signal processing applications and plan to apply pMatlab XVM to the full HPCchallenge benchmark suite. Using next-generation hardware, problem sizes a factor of 100 to 1000 times larger should be feasible.
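The "window" idea can be illustrated with a small Python sketch that reduces a file of doubles while holding only a fixed-size slice in memory at a time. It is an assumption-laden toy, not pMatlab XVM: the function `windowed_sum`, the window size, and the temporary data file are hypothetical, and a sum is far simpler than an out-of-core FFT.

```python
import array
import os
import tempfile


def windowed_sum(path, window_items):
    """Sum float64 values on disk, keeping at most `window_items` in memory."""
    item_size = array.array("d").itemsize
    total = 0.0
    with open(path, "rb") as f:
        while True:
            # Slide the window: read the next section of the file.
            chunk = f.read(window_items * item_size)
            if not chunk:
                break
            window = array.array("d")
            window.frombytes(chunk)
            total += sum(window)
    return total


# Write 1000 doubles (0.0 .. 999.0) to a temporary file standing in for
# a data set too large for memory.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(array.array("d", range(1000)).tobytes())
    data_path = tmp.name

result = windowed_sum(data_path, window_items=64)
os.remove(data_path)
print(result)
```

The window size trades memory footprint against the number of disk reads.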

Introduction to parallel programming and pMatlab v2.0

Published in:
Lincoln Laboratory external web site, [2005].

Summary

The computational demands of software continue to outpace the capacities of processor and memory technologies, especially in scientific and engineering programs. One option to improve performance is parallel processing. However, despite decades of research and development, writing parallel programs continues to be difficult. This is especially the case for scientists and engineers who have limited backgrounds in computer science. MATLAB®, due to its ease of use compared to other programming languages like C and Fortran, is one of the most popular languages for implementing numerical computations, thus making it an excellent platform for developing an accessible parallel computing framework. MIT Lincoln Laboratory has developed two libraries, pMatlab and MatlabMPI, that enable parallel programming with MATLAB in a simple fashion accessible to non-computer scientists. This document gives an overview of basic concepts in parallel programming and introduces pMatlab.

Writing parallel parameter sweep applications with pMATLAB

Published in:
Lincoln Laboratory external web site [2005].

Summary

Parameter sweep applications execute the same piece of code multiple times with unique sets of input parameters. This type of application is extremely amenable to parallelization. This document describes how to parallelize parameter sweep applications with pMATLAB by introducing a simple serial parameter sweep application written in MATLAB, then parallelizing the application using pMATLAB.
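The serial-then-parallel pattern described above can be sketched in Python. This is not pMATLAB code: `simulate`, `my_share`, the parameter grid, and the rank count are hypothetical, and the ranks are simulated in a loop inside one process, whereas in a real run each rank would be a separate process executing the same program on its own share.

```python
def simulate(gain, offset):
    """Stand-in for the computation being swept (hypothetical)."""
    return gain * gain + offset


def my_share(params, rank, n_ranks):
    """Cyclic distribution: rank r takes parameter sets r, r + n_ranks, ..."""
    return params[rank::n_ranks]


# Every combination of the two swept inputs.
params = [(g, o) for g in range(4) for o in (0, 10)]

# Serial sweep: one loop over all parameter sets.
serial = [simulate(g, o) for g, o in params]

# Parallel-style sweep: each rank evaluates only its own share.
n_ranks = 3
per_rank = [
    [simulate(g, o) for g, o in my_share(params, r, n_ranks)]
    for r in range(n_ranks)
]

# Gather: put each rank's results back in the original order.
gathered = [None] * len(params)
for r, results in enumerate(per_rank):
    gathered[r::n_ranks] = results

print(gathered == serial)
```

Because the parameter sets are independent, no communication is needed until the final gather, which is why this class of application parallelizes so readily.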

Using a diagnostic corpus of C programs to evaluate buffer overflow detection by static analysis tools

Published in:
10th European Software Engineering Conf., 5-9 September 2005.

Summary

A corpus of 291 small C-program test cases was developed to evaluate static and dynamic analysis tools designed to detect buffer overflows. The corpus was designed and labeled using a new, comprehensive buffer overflow taxonomy. It provides a benchmark to measure detection, false alarm, and confusion rates of tools, and also suggests areas for tool enhancement. Experiments with five tools demonstrate that some modern static analysis tools can accurately detect overflows in simple test cases but that others have serious limitations. For example, PolySpace demonstrated a superior detection rate, missing only one detection. Its performance could be enhanced if extremely long run times were reduced, and false alarms were eliminated for some C library functions. ARCHER performed well with no false alarms whatsoever. It could be enhanced by improving inter-procedural analysis and handling of C library functions. Splint detected significantly fewer overflows and exhibited the highest false alarm rate. Improvements in loop handling and reductions in false alarm rate would make it a much more useful tool. UNO had no false alarms, but missed overflows in roughly half of all test cases. It would need improvement in many areas to become a useful tool. BOON provided the worst performance. It did not detect overflows well in string functions, even though this was a design goal.