Publications

Refine Results

(Filters Applied) Clear All

Secure and resilient cloud computing for the Department of Defense

Summary

Cloud computing offers substantial benefits to its users: the ability to store and access massive amounts of data, on-demand delivery of computing services, the capability to widely share information, and the scalability of resource usage. Lincoln Laboratory is developing technology that will strengthen the security and resilience of cloud computing so that the Department of Defense can confidently deploy cloud services for its critical missions.
READ LESS

Summary

Cloud computing offers substantial benefits to its users: the ability to store and access massive amounts of data, on-demand delivery of computing services, the capability to widely share information, and the scalability of resource usage. Lincoln Laboratory is developing technology that will strengthen the security and resilience of cloud computing...

READ MORE

Building Resource Adaptive Software Systems (BRASS): objectives and system evaluation

Summary

As modern software systems continue inexorably to increase in complexity and capability, users have become accustomed to periodic cycles of updating and upgrading to avoid obsolescence—if at some cost in terms of frustration. In the case of the U.S. military, having access to well-functioning software systems and underlying content is critical to national security, but updates are no less problematic than among civilian users and often demand considerable time and expense. To address these challenges, DARPA has announced a new four-year research project to investigate the fundamental computational and algorithmic requirements necessary for software systems and data to remain robust and functional in excess of 100 years. The Building Resource Adaptive Software Systems, or BRASS, program seeks to realize foundational advances in the design and implementation of long-lived software systems that can dynamically adapt to changes in the resources they depend upon and environments in which they operate. MIT Lincoln Laboratory will provide the test framework and evaluation of proposed software tools in support of this revolutionary vision.
READ LESS

Summary

As modern software systems continue inexorably to increase in complexity and capability, users have become accustomed to periodic cycles of updating and upgrading to avoid obsolescence—if at some cost in terms of frustration. In the case of the U.S. military, having access to well-functioning software systems and underlying content is...

READ MORE

Spyglass: demand-provisioned Linux containers for private network access

Published in:
Proc. 29th Large Installation System Administration Conf., LISA, 8-13 November 2015.

Summary

System administrators are required to access the privileged, or "super-user," interfaces of computing, networking, and storage resources they support. This low-level infrastructure underpins most of the security tools and features common today and is assumed to be secure. A malicious system administrator or malware on the system administrator's client system can silently subvert this computing infrastructure. In the case of cloud system administrators, unauthorized privileged access has the potential to cause grave damage to the cloud provider and their customers. In this paper, we describe Spyglass, a tool for managing, securing, and auditing administrator access to private or sensitive infrastructure networks by creating on-demand bastion hosts inside of Linux containers. These on-demand bastion containers differ from regular bastion hosts in that they are nonpersistent and last only for the duration of the administrator's access. Spyglass also captures command input and screen output of all administrator activities from outside the container, allowing monitoring of sensitive infrastructure and understanding of the actions of an adversary in the event of a compromise. Through our evaluation of Spyglass for remote network access, we show that it is more difficult to penetrate than existing solutions, does not introduce delays or major workflow changes, and increases the amount of tamper-resistant auditing information that is captured about a system administrator's access.
READ LESS

Summary

System administrators are required to access the privileged, or "super-user," interfaces of computing, networking, and storage resources they support. This low-level infrastructure underpins most of the security tools and features common today and is assumed to be secure. A malicious system administrator or malware on the system administrator's client system...

READ MORE

Sampling operations on big data

Published in:
2015 Asilomar Conf. on Signals, Systems and Computers, 8-11 November 2015.

Summary

The 3Vs -- Volume, Velocity and Variety -- of Big Data continues to be a large challenge for systems and algorithms designed to store, process and disseminate information for discovery and exploration under real-time constraints. Common signal processing operations such as sampling and filtering, which have been used for decades to compress signals are often undefined in data that is characterized by heterogeneity, high dimensionality, and lack of known structure. In this article, we describe and demonstrate an approach to sample large datasets such as social media data. We evaluate the effect of sampling on a common predictive analytic: link prediction. Our results indicate that greatly sampling a dataset can still yield meaningful link prediction results.
READ LESS

Summary

The 3Vs -- Volume, Velocity and Variety -- of Big Data continues to be a large challenge for systems and algorithms designed to store, process and disseminate information for discovery and exploration under real-time constraints. Common signal processing operations such as sampling and filtering, which have been used for decades...

READ MORE

Percolation model of insider threats to assess the optimum number of rules

Published in:
Environ. Syst. Decis., Vol. 35, 2015, pp. 504-10.

Summary

Rules, regulations, and policies are the basis of civilized society and are used to coordinate the activities of individuals who have a variety of goals and purposes. History has taught that over-regulation (too many rules) makes it difficult to compete and under-regulation (too few rules) can lead to crisis. This implies an optimal number of rules that avoids these two extremes. Rules create boundaries that define the latitude at which an individual has to perform their activities. This paper creates a Toy Model of a work environment and examines it with respect to the latitude provided to a normal individual and the latitude provided to an insider threat. Simulations with the Toy Model illustrate four regimes with respect to an insider threat: under-regulated, possibly optimal, tipping point, and over-regulated. These regimes depend upon the number of rules (N) and the minimum latitude (Lmin) required by a normal individual to carry out their activities. The Toy Model is then mapped onto the standard 1D Percolation Model from theoretical physics, and the same behavior is observed. This allows the Toy Model to be generalized to a wide array of more complex models that have been well studied by the theoretical physics community and also show the same behavior. Finally, by estimating N and Lmin, it should be possible to determine the regime of any particular environment.
READ LESS

Summary

Rules, regulations, and policies are the basis of civilized society and are used to coordinate the activities of individuals who have a variety of goals and purposes. History has taught that over-regulation (too many rules) makes it difficult to compete and under-regulation (too few rules) can lead to crisis. This...

READ MORE

Control jujutsu: on the weaknesses of fine-grained control flow integrity

Published in:
22nd ACM Conf. on Computer and Communications Security, 12-16 October 2015.

Summary

Control flow integrity (CFI) has been proposed as an approach to defend against control-hijacking memory corruption attacks. CFI works by assigning tags to indirect branch targets statically and checking them at runtime. Coarse-grained enforcements of CFI that use a small number of tags to improve the performance overhead have been shown to be ineffective. As a result, a number of recent efforts have focused on fine-grained enforcement of CFI as it was originally proposed. In this work, we show that even a finegrained form of CFI with unlimited number of tags and a shadow stack (to check calls and returns) is ineffective in protecting against malicious attacks. We show that many popular code bases such as Apache and Nginx use coding practices that create flexibility in their intended control flow graph (CFG) even when a strong static analyzer is used to construct the CFG. These flexibilities allow an attacker to gain control of the execution while strictly adhering to a fine-grained CFI. We then construct two proof-of-concept exploits that attack an unlimited tag CFI system with a shadow stack. We also evaluate the difficulties of generating a precise CFG using scalable static analysis for real-world applications. Finally, we perform an analysis on a number of popular applications that highlights the availability of such attacks.
READ LESS

Summary

Control flow integrity (CFI) has been proposed as an approach to defend against control-hijacking memory corruption attacks. CFI works by assigning tags to indirect branch targets statically and checking them at runtime. Coarse-grained enforcements of CFI that use a small number of tags to improve the performance overhead have been...

READ MORE

Timely rerandomization for mitigating memory disclosures

Published in:
22nd ACM Conf. on Computer and Communications Security, 12-16 October 2015.

Summary

Address Space Layout Randomization (ASLR) can increase the cost of exploiting memory corruption vulnerabilities. One major weakness of ASLR is that it assumes the secrecy of memory addresses and is thus ineffective in the face of memory disclosure vulnerabilities. Even fine-grained variants of ASLR are shown to be ineffective against memory disclosures. In this paper we present an approach that synchronizes randomization with potential runtime disclosure. By applying rerandomization to the memory layout of a process every time it generates an output, our approach renders disclosures stale by the time they can be used by attackers to hijack control flow. We have developed a fully functioning prototype for x86_64 C programs by extending the Linux kernel, GCC, and the libc dynamic linker. The prototype operates on C source code and recompiles programs with a set of augmented information required to track pointer locations and support runtime rerandomization. Using this augmented information we dynamically relocate code segments and update code pointer values during runtime. Our evaluation on the SPEC CPU2006 benchmark, along with other applications, show that our technique incurs a very low performance overhead (2.1% on average).
READ LESS

Summary

Address Space Layout Randomization (ASLR) can increase the cost of exploiting memory corruption vulnerabilities. One major weakness of ASLR is that it assumes the secrecy of memory addresses and is thus ineffective in the face of memory disclosure vulnerabilities. Even fine-grained variants of ASLR are shown to be ineffective against...

READ MORE

Very large graphs for information extraction (VLG) - detection and inference in the presence of uncertainty

Summary

In numerous application domains relevant to the Department of Defense and the Intelligence Community, data of interest take the form of entities and the relationships between them, and these data are commonly represented as graphs. Under the Very Large Graphs for Information Extraction effort--a one year proof-of-concept study--MIT LL developed novel techniques for anomalous subgraph detection, building on tools in the signal processing research literature. This report documents the technical results of this effort. Two datasets--a snapshot of Thompson Reuters' Web of Science database and a stream of web proxy logs--were parsed, and graphs were constructed from the raw data. From the phenomena in these datasets, several algorithms were developed to model the dynamic graph behavior, including a preferential attachment mechanism with memory, a streaming filter to model a graph as a weighted average of its past connections, and a generalized linear model for graphs where connection probabilities are determined by additional side information or metadata. A set of metrics was also constructed to facilitate comparison of techniques. The study culminated in a demonstration of the algorithms on the datasets of interest, in addition to simulated data. Performance in terms of detection, estimation, and computational burden was measured according to the metrics. Among the highlights of this demonstration were the detection of emerging coauthor clusters in the Web of Science data, detection of botnet activity in the web proxy data after 15 minutes (which took 10 days to detect using state-of-the-practice techniques), and demonstration of the core algorithm on a simulated 1-billion-vertex graph using a commodity computing cluster.
READ LESS

Summary

In numerous application domains relevant to the Department of Defense and the Intelligence Community, data of interest take the form of entities and the relationships between them, and these data are commonly represented as graphs. Under the Very Large Graphs for Information Extraction effort--a one year proof-of-concept study--MIT LL developed...

READ MORE

Sampling large graphs for anticipatory analytics

Published in:
HPEC 2015: IEEE Conf. on High Performance Extreme Computing, 15-17 September 2015.

Summary

The characteristics of Big Data - often dubbed the 3V's for volume, velocity, and variety - will continue to outpace the ability of computational systems to process, store, and transmit meaningful results. Traditional techniques for dealing with large datasets often include the purchase of larger systems, greater human-in-the-loop involvement, or more complex algorithms. We are investigating the use of sampling to mitigate these challenges, specifically sampling large graphs. Often, large datasets can be represented as graphs where data entries may be edges, and vertices may be attributes of the data. In particular, we present the results of sampling for the task of link prediction. Link prediction is a process to estimate the probability of a new edge forming between two vertices of a graph, and it has numerous application areas in understanding social or biological networks. In this paper we propose a series of techniques for the sampling of large datasets. In order to quantify the effect of these techniques, we present the quality of link prediction tasks on sampled graphs, and the time saved in calculating link prediction statistics on these sampled graphs.
READ LESS

Summary

The characteristics of Big Data - often dubbed the 3V's for volume, velocity, and variety - will continue to outpace the ability of computational systems to process, store, and transmit meaningful results. Traditional techniques for dealing with large datasets often include the purchase of larger systems, greater human-in-the-loop involvement, or...

READ MORE

Secure architecture for embedded systems

Summary

Devices connected to the internet are increasingly the targets of deliberate and sophisticated attacks. Embedded system engineers tend to focus on well-defined functional capabilities rather than "obscure" security and resilience. However, "after-the-fact" system hardening could be prohibitively expensive or even impossible. The co-design of security and resilience with functionality has to overcome a major challenge; rarely can the security and resilience requirements be accurately identified when the design begins. This paper describes an embedded system architecture that decouples secure and functional design aspects.
READ LESS

Summary

Devices connected to the internet are increasingly the targets of deliberate and sophisticated attacks. Embedded system engineers tend to focus on well-defined functional capabilities rather than "obscure" security and resilience. However, "after-the-fact" system hardening could be prohibitively expensive or even impossible. The co-design of security and resilience with functionality has...

READ MORE