Publications
SIAM data mining "brings it" to annual meeting
Summary
Summary
The Data Mining Activity Group is one of SIAM's most vibrant and dynamic activity groups. To better share our enthusiasm for data mining with the broader SIAM community, our activity group organized six minisymposia at the 2016 Annual Meeting. These minisymposia included 48 talks organized by 11 SIAM members.
Learning by doing, High Performance Computing education in the MOOC era
Summary
Summary
The High Performance Computing (HPC) community has spent decades developing tools that teach practitioners to harness the power of parallel and distributed computing. To create scalable and flexible educational experiences for practitioners in all phases of a career, we turn to Massively Open Online Courses (MOOCs). We detail the design...
Resilience of cyber systems with over- and underregulation
Summary
Summary
Recent cyber attacks provide evidence of increased threats to our critical systems and infrastructure. A common reaction to a new threat is to harden the system by adding new rules and regulations. As federal and state governments request new procedures to follow, each of their organizations implements their own cyber...
Benchmarking SciDB data import on HPC systems
Summary
Summary
SciDB is a scalable, computational database management system that uses an array model for data storage. The array data model of SciDB makes it ideally suited for storing and managing large amounts of imaging data. SciDB is designed to support advanced analytics in database, thus reducing the need for extracting...
Enhancing HPC security with a user-based firewall
Summary
Summary
High Performance Computing (HPC) systems traditionally allow their users unrestricted use of their internal network. While this network is normally controlled enough to guarantee privacy without the need for encryption, it does not provide a method to authenticate peer connections. Protocols built upon this internal network, such as those used...
Benchmarking the Graphulo processing framework
Summary
Summary
Graph algorithms have wide applicability to a variety of domains and are often used on massive datasets. Recent standardization efforts such as the GraphBLAS are designed to specify a set of key computational kernels that hardware and software developers can adhere to. Graphulo is a processing framework that enables GraphBLAS...
Novel graph processor architecture, prototype system, and results
Summary
Summary
Graph algorithms are increasingly used in applications that exploit large databases. However, conventional processor architectures are inadequate for handling the throughput and memory requirements of graph computation. Lincoln Laboratory's graph-processor architecture represents a rethinking of parallel architectures for graph problems. Our processor utilizes innovations that include a sparse matrix-based graph...
From NoSQL Accumulo to NewSQL Graphulo: design and utility of graph algorithms inside a BigTable database
Summary
Summary
Google BigTable's scale-out design for distributed key-value storage inspired a generation of NoSQL databases. Recently the NewSQL paradigm emerged in response to analytic workloads that demand distributed computation local to data storage. Many such analytics take the form of graph algorithms, a trend that motivated the GraphBLAS initiative to standardize...
Julia implementation of the Dynamic Distributed Dimensional Data Model
Summary
Summary
Julia is a new language for writing data analysis programs that are easy to implement and run at high performance. Similarly, the Dynamic Distributed Dimensional Data Model (D4M) aims to clarify data analysis operations while retaining strong performance. D4M accomplishes these goals through a composable, unified data model on associative...
Designing a new high performance computing education strategy for professional scientists and engineers
Summary
Summary
For decades the High Performance Computing (HPC) community has used web content, workshops and embedded HPC scientists to enable practitioners to harness the power of parallel and distributed computing. The most successful approaches, face-to-face tutorials and embedded professionals, don't scale. To create scalable, flexible, educational experiences for practitioners in all...