Publications

D4M 2.0 Schema: a general purpose high performance schema for the Accumulo database

Summary

Non-traditional, relaxed-consistency triple-store databases are the backbone of many web companies (e.g., Google Bigtable, Amazon Dynamo, and Facebook Cassandra). The Apache Accumulo database is a high-performance, open-source, relaxed-consistency database that is widely used in government applications. Obtaining the full benefits of Accumulo requires novel schemas. The Dynamic Distributed Dimensional Data Model (D4M) [http://www.mit.edu/~kepner/D4M] provides a uniform mathematical framework based on associative arrays that encompasses both traditional (i.e., SQL) and non-traditional databases. For non-traditional databases, D4M naturally leads to a general-purpose schema that can be used to fully index and rapidly query every unique string in a dataset. The D4M 2.0 Schema has been applied with little or no customization to cyber, bioinformatics, scientific citation, free text, and social media data. It is simple, requires minimal parsing, and achieves the highest published Accumulo ingest rates. These benefits are independent of the D4M interface; any interface to Accumulo can achieve them by using the D4M 2.0 Schema.
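
The core idea is easy to see in miniature. The sketch below is illustrative, not the D4M API: it shows the "exploded" transformation such a schema applies, where every unique field|value string in a record becomes a column key in a main table, with a companion transpose table so any string can be found with a single row lookup. The function and table names (explode, Tedge, TedgeT) are assumptions for exposition.

```python
# Minimal sketch of an "exploded" schema: every unique field|value
# string in a record becomes a column key, so one table plus its
# transpose indexes every string in the dataset. Illustrative only.

def explode(doc_id, record):
    """Turn one record into (row, column, value) triples for the
    main edge table (Tedge) and its transpose (TedgeT)."""
    tedge, tedge_t = [], []
    for field, value in record.items():
        col = f"{field}|{value}"            # e.g. "src_ip|10.0.0.1"
        tedge.append((doc_id, col, "1"))    # Tedge:  doc -> string
        tedge_t.append((col, doc_id, "1"))  # TedgeT: string -> doc
    return tedge, tedge_t

record = {"src_ip": "10.0.0.1", "dest_ip": "10.0.0.2", "user": "alice"}
main, transpose = explode("doc_0001", record)
# Finding every document containing "user|alice" is now a single row
# lookup in the transpose table -- no full-table scan is needed.
```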

Driving big data with big compute

Summary

Big Data (as embodied by Hadoop clusters) and Big Compute (as embodied by MPI clusters) provide unique capabilities for storing and processing large volumes of data. Hadoop clusters make distributed computing readily accessible to the Java community, and MPI clusters provide high parallel efficiency for compute-intensive workloads. Bringing the big data and big compute communities together is an active area of research. The LLGrid team has developed and deployed a number of technologies that aim to provide the best of both worlds. LLGrid MapReduce allows the map/reduce parallel programming model to be used quickly and efficiently in any language on any compute cluster. D4M (Dynamic Distributed Dimensional Data Model) provides a high-level distributed-arrays interface to the Apache Accumulo database. The accessibility of these technologies is assessed by measuring the effort required to use them, typically a few lines of code. Performance is assessed by measuring the insert rate into the Accumulo database. Using these tools, a database insert rate of 4M inserts/second has been achieved on an 8-node cluster.
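
The insert-rate claim corresponds to a measurement pattern like the sketch below. This is a hedged illustration, not the LLGrid or D4M code: BatchClient is a stand-in stub for a real Accumulo batch writer, and the batch size and worker count are assumptions.

```python
# Illustrative throughput measurement: several ingest workers each
# batch triples into the database; the aggregate rate is total
# inserts divided by wall-clock time.
import time
from multiprocessing import Pool

BATCH = 10_000  # mutations per write (assumed tuning parameter)

class BatchClient:
    """Stand-in stub for a real Accumulo batch writer."""
    def __init__(self):
        self.store = []
    def write(self, batch):
        self.store.extend(batch)  # a real client would send mutations
    def flush(self):
        pass

def ingest_worker(triples):
    client = BatchClient()
    for i in range(0, len(triples), BATCH):
        client.write(triples[i:i + BATCH])
    client.flush()
    return len(triples)

def measure_rate(shards, workers=8):
    """shards: one list of (row, col, val) triples per worker."""
    start = time.time()
    with Pool(workers) as pool:
        total = sum(pool.map(ingest_worker, shards))
    return total / (time.time() - start)  # aggregate inserts/second
```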

Modeling modern network attacks and countermeasures using attack graphs

Published in:
ACSAC 2009, Annual Computer Security Applications Conf., 7 December 2009, pp. 117-126.

Summary

By accurately measuring risk for enterprise networks, attack graphs allow network defenders to understand the most critical threats and select the most effective countermeasures. This paper describes substantial enhancements to the NetSPA attack graph system required to model additional present-day threats (zero-day exploits and client-side attacks) and countermeasures (intrusion prevention systems, proxy firewalls, personal firewalls, and host-based vulnerability scans). Point-to-point reachability algorithms and structures were extensively redesigned to support "reverse" reachability computations and personal firewalls. Host-based vulnerability scans are imported and analyzed. Analysis of an operational network with 85 hosts demonstrates that client-side attacks pose a serious threat. Experiments on larger simulated networks demonstrate that NetSPA's excellent scaling is maintained: less than two minutes are required to completely analyze a four-enclave simulated network with more than 40,000 hosts protected by personal firewalls.
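
To make the "reverse" reachability computation concrete, here is a minimal sketch under assumed data structures (a plain adjacency dict; none of this is NetSPA's internal representation): rather than asking which hosts a source can reach, the reachability relation is inverted and walked backwards from a critical target.

```python
# Reverse reachability sketch: find every host that can reach a
# critical target by inverting the forward reachability edges and
# running BFS from the target. The graph format is illustrative.
from collections import deque

def hosts_that_can_reach(target, reaches):
    """reaches: dict mapping host -> set of hosts it can reach."""
    reverse = {}
    for src, dsts in reaches.items():
        for dst in dsts:
            reverse.setdefault(dst, set()).add(src)
    seen, queue = {target}, deque([target])
    while queue:
        for prev in reverse.get(queue.popleft(), ()):
            if prev not in seen:
                seen.add(prev)
                queue.append(prev)
    seen.discard(target)
    return seen

reaches = {"laptop": {"web01"}, "web01": {"db01"}, "kiosk": {"db01"}}
print(sorted(hosts_that_can_reach("db01", reaches)))
# ['kiosk', 'laptop', 'web01']
```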

Practical attack graph generation for network defense

Published in:
Proc. of the 22nd Annual Computer Security Applications Conf., IEEE, 11-15 December 2006, pp.121-130.

Summary

Attack graphs are a valuable tool to network defenders, illustrating paths an attacker can use to gain access to a targeted network. Defenders can then focus their efforts on patching the vulnerabilities and configuration errors that allow the attackers the greatest amount of access. We have created a new type of attack graph, the multiple-prerequisite graph, that scales nearly linearly as the size of a typical network increases. We have built a prototype system using this graph type. The prototype uses readily available source data to automatically compute network reachability, classify vulnerabilities, build the graph, and recommend actions to improve network security. We have tested the prototype on an operational network with over 250 hosts, where it helped to discover a previously unknown configuration error. It has processed complex simulated networks with over 50,000 hosts in under four minutes.
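
As an illustration of why a multiple-prerequisite graph scales well, consider the sketch below (the grants/requires/yields data model and all names are assumptions for exposition, not the prototype's structures): prerequisite nodes such as reachability groups are shared among many states and vulnerability instances, so the graph grows with the number of distinct prerequisites rather than the number of attack paths.

```python
# Sketch of multiple-prerequisite graph expansion: state nodes grant
# prerequisites, a vulnerability instance fires once all of its
# prerequisites are held, and firing yields a new attacker state.
from collections import deque

def build_mp_graph(initial_state, grants, requires, yields):
    """grants:   state -> set of prerequisites it provides
       requires: vulnerability instance -> prerequisites it needs
       yields:   vulnerability instance -> attacker state produced"""
    have = set()                       # prerequisites obtained so far
    states = {initial_state}
    edges = []
    queue = deque([initial_state])
    while queue:
        state = queue.popleft()
        new = grants.get(state, set()) - have
        for p in sorted(new):
            edges.append((state, p))   # state node -> prerequisite node
        have |= new
        # An instance fires once every prerequisite it needs is held.
        for inst, needed in requires.items():
            target = yields[inst]
            if target not in states and needed <= have:
                edges.extend((p, inst) for p in sorted(needed))
                edges.append((inst, target))
                states.add(target)
                queue.append(target)
    return edges

edges = build_mp_graph(
    "attacker@internet",
    grants={"attacker@internet": {"reach:dmz"},
            "root@web01": {"reach:internal"}},
    requires={"CVE-X@web01": {"reach:dmz"},
              "CVE-Y@db01": {"reach:internal"}},
    yields={"CVE-X@web01": "root@web01",
            "CVE-Y@db01": "root@db01"})
```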

Validating and restoring defense in depth using attack graphs

Summary

Defense in depth is a common strategy that uses layers of firewalls to protect Supervisory Control and Data Acquisition (SCADA) subnets and other critical resources on enterprise networks. A tool named NetSPA is presented that analyzes firewall rules and vulnerabilities to construct attack graphs. These show how inside and outside attackers can progress by successively compromising exposed, vulnerable hosts with the goal of reaching critical internal targets. NetSPA generates attack graphs and automatically analyzes them to produce a small set of prioritized recommendations to restore defense in depth. Field trials on networks with up to 3,400 hosts demonstrate that firewalls often do not provide defense in depth due to misconfigurations and critical unpatched vulnerabilities on hosts. In all cases, a small number of recommendations were provided to restore defense in depth. Simulations on networks with up to 50,000 hosts demonstrate that this approach scales well to enterprise-size networks.
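
One way to see how an attack graph can be reduced to a short, prioritized recommendation list is the greedy hitting-set heuristic sketched below. This is an illustrative stand-in, not NetSPA's actual recommendation algorithm, and the input format is an assumption.

```python
# Greedy recommendation sketch: repeatedly pick the vulnerable host
# that lies on the most surviving attack paths, recommend patching
# it, and drop every path it cuts, until no paths remain.
from collections import Counter

def recommend_patches(attack_paths):
    """attack_paths: one set of exploitable hosts per attacker path
    to a critical target (illustrative input format)."""
    remaining = [set(p) for p in attack_paths]
    recommendations = []
    while remaining:
        counts = Counter(h for path in remaining for h in path)
        # Ties broken by name so the output is deterministic.
        host, hits = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
        recommendations.append((host, hits))
        remaining = [p for p in remaining if host not in p]
    return recommendations

paths = [{"web01", "db01"}, {"web01", "file02"},
         {"web01", "vpn03"}, {"vpn03", "db01"}]
print(recommend_patches(paths))
# [('web01', 3), ('vpn03', 1)] -- two patches cut all four paths
```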

Evaluating and strengthening enterprise network security using attack graphs

Summary

Assessing the security of large enterprise networks is complex and labor-intensive. Current security analysis tools typically examine only individual firewalls, routers, or hosts separately and do not comprehensively analyze overall network security. We present a new approach that uses configuration information on firewalls and vulnerability information on all network devices to build attack graphs that show how far inside and outside attackers can progress through a network by successively compromising exposed and vulnerable hosts. In addition, attack graphs are automatically analyzed to produce a small set of prioritized recommendations to enhance network security. Field trials on networks with up to 3,400 hosts demonstrate the ability to accurately identify a small number of critical stepping-stone hosts that need to be patched to protect against external attackers. Simulation studies on complex networks with more than 40,000 hosts demonstrate good scaling. This analysis can be used for many purposes, including identifying critical stepping-stone hosts to patch or protect with a firewall, comparing the security of alternative network designs, determining the security risk caused by proposed changes in firewall rules or new vulnerabilities, and identifying the most critical hosts to patch when a new vulnerability is announced. Unique aspects of this work are new attack graph generation algorithms that scale to enterprise networks with thousands of hosts, efficient approaches to determine what other hosts and ports in large networks are reachable from each individual host, automatic data importation from network vulnerability scanners and firewalls, and automatic attack graph analyses to generate recommendations.
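
The reachability step, combining firewall configuration with per-host vulnerability data, can be illustrated with a deliberately simplified sketch. The rule and host formats are assumptions, and real firewall semantics (ordered rule matching, NAT, protocol details) are omitted.

```python
# Simplified reachability sketch: given allow rules and a map of
# vulnerable ports per host, find which vulnerable services a given
# source host can actually touch. Formats are illustrative.
import ipaddress

def reachable_vulnerable_ports(src_ip, rules, host_vulns):
    """rules: list of (src_cidr, dst_cidr, port) allow rules.
       host_vulns: dict ip -> set of ports with known vulnerabilities."""
    src = ipaddress.ip_address(src_ip)
    exposed = []
    for dst_ip, vuln_ports in host_vulns.items():
        dst = ipaddress.ip_address(dst_ip)
        for src_net, dst_net, port in rules:
            if (src in ipaddress.ip_network(src_net)
                    and dst in ipaddress.ip_network(dst_net)
                    and port in vuln_ports):
                exposed.append((dst_ip, port))
    return exposed

rules = [("0.0.0.0/0", "10.1.0.0/24", 443),     # DMZ web open to all
         ("10.1.0.0/24", "10.2.0.0/24", 1433)]  # DMZ -> internal DB
host_vulns = {"10.1.0.5": {443}, "10.2.0.9": {1433}}
print(reachable_vulnerable_ports("192.0.2.7", rules, host_vulns))
# [('10.1.0.5', 443)] -- an outside attacker reaches only the DMZ host
```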