Publications

Refine Results

(Filters Applied) Clear All

Beyond expertise and roles: a framework to characterize the stakeholders of interpretable machine learning and their needs

Published in:
Proc. Conf. on Human Factors in Computing Systems, 8-13 May 2021, article no. 74.

Summary

To ensure accountability and mitigate harm, it is critical that diverse stakeholders can interrogate black-box automated systems and find information that is understandable, relevant, and useful to them. In this paper, we eschew prior expertise- and role-based categorizations of interpretability stakeholders in favor of a more granular framework that decouples stakeholders' knowledge from their interpretability needs. We characterize stakeholders by their formal, instrumental, and personal knowledge and how it manifests in the contexts of machine learning, the data domain, and the general milieu. We additionally distill a hierarchical typology of stakeholder needs that distinguishes higher-level domain goals from lower-level interpretability tasks. In assessing the descriptive, evaluative, and generative powers of our framework, we find our more nuanced treatment of stakeholders reveals gaps and opportunities in the interpretability literature, adds precision to the design and comparison of user studies, and facilitates a more reflexive approach to conducting this research.
READ LESS

Summary

To ensure accountability and mitigate harm, it is critical that diverse stakeholders can interrogate black-box automated systems and find information that is understandable, relevant, and useful to them. In this paper, we eschew prior expertise- and role-based categorizations of interpretability stakeholders in favor of a more granular framework that decouples...

READ MORE

Towards a distributed framework for multi-agent reinforcement learning research

Summary

Some of the most important publications in deep reinforcement learning over the last few years have been fueled by access to massive amounts of computation through large scale distributed systems. The success of these approaches in achieving human-expert level performance on several complex video-game environments has motivated further exploration into the limits of these approaches as computation increases. In this paper, we present a distributed RL training framework designed for super computing infrastructures such as the MIT SuperCloud. We review a collection of challenging learning environments—such as Google Research Football, StarCraft II, and Multi-Agent Mujoco— which are at the frontier of reinforcement learning research. We provide results on these environments that illustrate the current state of the field on these problems. Finally, we also quantify and discuss the computational requirements needed for performing RL research by enumerating all experiments performed on these environments.
READ LESS

Summary

Some of the most important publications in deep reinforcement learning over the last few years have been fueled by access to massive amounts of computation through large scale distributed systems. The success of these approaches in achieving human-expert level performance on several complex video-game environments has motivated further exploration into...

READ MORE

Automated discovery of cross-plane event-based vulnerabilities in software-defined networking

Summary

Software-defined networking (SDN) achieves a programmable control plane through the use of logically centralized, event-driven controllers and through network applications (apps) that extend the controllers' functionality. As control plane decisions are often based on the data plane, it is possible for carefully crafted malicious data plane inputs to direct the control plane towards unwanted states that bypass network security restrictions (i.e., cross-plane attacks). Unfortunately, because of the complex interplay among controllers, apps, and data plane inputs, at present it is difficult to systematically identify and analyze these cross-plane vulnerabilities. We present EVENTSCOPE, a vulnerability detection tool that automatically analyzes SDN control plane event usage, discovers candidate vulnerabilities based on missing event-handling routines, and validates vulnerabilities based on data plane effects. To accurately detect missing event handlers without ground truth or developer aid, we cluster apps according to similar event usage and mark inconsistencies as candidates. We create an event flow graph to observe a global view of events and control flows within the control plane and use it to validate vulnerabilities that affect the data plane. We applied EVENTSCOPE to the ONOS SDN controller and uncovered 14 new vulnerabilities.
READ LESS

Summary

Software-defined networking (SDN) achieves a programmable control plane through the use of logically centralized, event-driven controllers and through network applications (apps) that extend the controllers' functionality. As control plane decisions are often based on the data plane, it is possible for carefully crafted malicious data plane inputs to direct the...

READ MORE

The leakage-resilience dilemma

Published in:
Proc. European Symp. on Research in Computer Security, ESORICS 2019, pp. 87-106.

Summary

Many control-flow-hijacking attacks rely on information leakage to disclose the location of gadgets. To address this, several leakage-resilient defenses, have been proposed that fundamentally limit the power of information leakage. Examples of such defenses include address-space re-randomization, destructive code reads, and execute-only code memory. Underlying all of these defenses is some form of code randomization. In this paper, we illustrate that randomization at the granularity of a page or coarser is not secure, and can be exploited by generalizing the idea of partial pointer overwrites, which we call the Relative ROP (RelROP) attack. We then analyzed more that 1,300 common binaries and found that 94% of them contained sufficient gadgets for an attacker to spawn a shell. To demonstrate this concretely, we built a proof-of-concept exploit against PHP 7.0.0. Furthermore, randomization at a granularity finer than a memory page faces practicality challenges when applied to shared libraries. Our findings highlight the dilemma that faces randomization techniques: course-grained techniques are efficient but insecure and fine-grained techniques are secure but impractical.
READ LESS

Summary

Many control-flow-hijacking attacks rely on information leakage to disclose the location of gadgets. To address this, several leakage-resilient defenses, have been proposed that fundamentally limit the power of information leakage. Examples of such defenses include address-space re-randomization, destructive code reads, and execute-only code memory. Underlying all of these defenses is...

READ MORE

Artificial intelligence: short history, present developments, and future outlook, final report

Summary

The Director's Office at MIT Lincoln Laboratory (MIT LL) requested a comprehensive study on artificial intelligence (AI) focusing on present applications and future science and technology (S&T) opportunities in the Cyber Security and Information Sciences Division (Division 5). This report elaborates on the main results from the study. Since the AI field is evolving so rapidly, the study scope was to look at the recent past and ongoing developments to lead to a set of findings and recommendations. It was important to begin with a short AI history and a lay-of-the-land on representative developments across the Department of Defense (DoD), intelligence communities (IC), and Homeland Security. These areas are addressed in more detail within the report. A main deliverable from the study was to formulate an end-to-end AI canonical architecture that was suitable for a range of applications. The AI canonical architecture, formulated in the study, serves as the guiding framework for all the sections in this report. Even though the study primarily focused on cyber security and information sciences, the enabling technologies are broadly applicable to many other areas. Therefore, we dedicate a full section on enabling technologies in Section 3. The discussion on enabling technologies helps the reader clarify the distinction among AI, machine learning algorithms, and specific techniques to make an end-to-end AI system viable. In order to understand what is the lay-of-the-land in AI, study participants performed a fairly wide reach within MIT LL and external to the Laboratory (government, commercial companies, defense industrial base, peers, academia, and AI centers). In addition to the study participants (shown in the next section under acknowledgements), we also assembled an internal review team (IRT). The IRT was extremely helpful in providing feedback and in helping with the formulation of the study briefings, as we transitioned from datagathering mode to the study synthesis. The format followed throughout the study was to highlight relevant content that substantiates the study findings, and identify a set of recommendations. An important finding is the significant AI investment by the so-called "big 6" commercial companies. These major commercial companies are Google, Amazon, Facebook, Microsoft, Apple, and IBM. They dominate in the AI ecosystem research and development (R&D) investments within the U.S. According to a recent McKinsey Global Institute report, cumulative R&D investment in AI amounts to about $30 billion per year. This amount is substantially higher than the R&D investment within the DoD, IC, and Homeland Security. Therefore, the DoD will need to be very strategic about investing where needed, while at the same time leveraging the technologies already developed and available from a wide range of commercial applications. As we will discuss in Section 1 as part of the AI history, MIT LL has been instrumental in developing advanced AI capabilities. For example, MIT LL has a long history in the development of human language technologies (HLT) by successfully applying machine learning algorithms to difficult problems in speech recognition, machine translation, and speech understanding. Section 4 elaborates on prior applications of these technologies, as well as newer applications in the context of multi-modalities (e.g., speech, text, images, and video). An end-to-end AI system is very well suited to enhancing the capabilities of human language analysis. Section 5 discusses AI's nascent role in cyber security. There have been cases where AI has already provided important benefits. However, much more research is needed in both the application of AI to cyber security and the associated vulnerability to the so-called adversarial AI. Adversarial AI is an area very critical to the DoD, IC, and Homeland Security, where malicious adversaries can disrupt AI systems and make them untrusted in operational environments. This report concludes with specific recommendations by formulating the way forward for Division 5 and a discussion of S&T challenges and opportunities. The S&T challenges and opportunities are centered on the key elements of the AI canonical architecture to strengthen the AI capabilities across the DoD, IC, and Homeland Security in support of national security.
READ LESS

Summary

The Director's Office at MIT Lincoln Laboratory (MIT LL) requested a comprehensive study on artificial intelligence (AI) focusing on present applications and future science and technology (S&T) opportunities in the Cyber Security and Information Sciences Division (Division 5). This report elaborates on the main results from the study. Since the...

READ MORE

Balancing security and performance for agility in dynamic threat environments

Published in:
46th IEEE/IFIP Int. Conf. on Dependable Systems and Networks, DSN 2016, 28 June - 1 July 2016.

Summary

In cyber security, achieving the desired balance between system security and system performance in dynamic threat environments is a long-standing open challenge for cyber defenders. Typically an increase in system security comes at the price of decreased system performance, and vice versa, easily resulting in systems that are misaligned to operator specified requirements for system security and performance as the threat environment evolves. We develop an online, reinforcement learning based methodology to automatically discover and maintain desired operating postures in security-performance space even as the threat environment changes. We demonstrate the utility of our approach and discover parameters enabling an agile response to a dynamic adversary in a simulated security game involving prototype cyber moving target defenses.
READ LESS

Summary

In cyber security, achieving the desired balance between system security and system performance in dynamic threat environments is a long-standing open challenge for cyber defenders. Typically an increase in system security comes at the price of decreased system performance, and vice versa, easily resulting in systems that are misaligned to...

READ MORE

Simulation based evaluation of a code diversification strategy

Published in:
5th Int. Conf. on Simulation and Modeling Methodologies, Technologies, and Applications, SIMULTECH 2015, 21-23 July 2015.

Summary

Periodic randomization of a computer program's binary code is an attractive technique for defending against several classes of advanced threats. In this paper we describe a model of attacker-defender interaction in which the defender employs such a technique against an attacker who is actively constructing an exploit using Return Oriented Programming (ROP). In order to successfully build a working exploit, the attacker must guess the locations of several small chunks of program code (i.e., gadgets) in the defended program's memory space. As the attacker continually guesses, the defender periodically rotates to a newly randomized variant of the program, effectively negating any gains the attacker made since the last rotation. Although randomization makes the attacker's task more difficult, it also incurs a cost to the defender. As such, the defender's goal is to find an acceptable balance between utility degradation (cost) and security (benefit). One way to measure these two competing factors is the total task latency introduced by both the attacker and any defensive measures taken to thwart him. We simulated a number of diversity strategies under various threat scenarios and present the measured impact on the defender's task.
READ LESS

Summary

Periodic randomization of a computer program's binary code is an attractive technique for defending against several classes of advanced threats. In this paper we describe a model of attacker-defender interaction in which the defender employs such a technique against an attacker who is actively constructing an exploit using Return Oriented...

READ MORE

Guaranteeing spoof-resilient multi-robot networks

Published in:
2015 Robotics: Science and Systems Conf., 13-17 July 2015.

Summary

Multi-robot networks use wireless communication to provide wide-ranging services such as aerial surveillance and unmanned delivery. However, effective coordination between multiple robots requires trust, making them particularly vulnerable to cyber-attacks. Specifically, such networks can be gravely disrupted by the Sybil attack, where even a single malicious robot can spoof a large number of fake clients. This paper proposes a new solution to defend against the Sybil attack, without requiring expensive cryptographic key-distribution. Our core contribution is a novel algorithm implemented on commercial Wi-Fi radios that can "sense" spoofers using the physics of wireless signals. We derive theoretical guarantees on how this algorithm bounds the impact of the Sybil Attack on a broad class of robotic coverage problems. We experimentally validate our claims using a team of AscTec quadrotor servers and iRobot Create ground clients, and demonstrate spoofer detection rates over 96%.
READ LESS

Summary

Multi-robot networks use wireless communication to provide wide-ranging services such as aerial surveillance and unmanned delivery. However, effective coordination between multiple robots requires trust, making them particularly vulnerable to cyber-attacks. Specifically, such networks can be gravely disrupted by the Sybil attack, where even a single malicious robot can spoof a...

READ MORE

Analyzing Mission Impacts of Cyber Actions (AMICA)

Published in:
Proc. NATO S&T Workshop on Cyber Attack, Detection, Forensics and Attribution for Assessment of Mission Impact, 15 June 2015.

Summary

This paper describes AMICA (Analyzing Mission Impacts of Cyber Actions), an integrated approach for understanding mission impacts of cyber attacks. AMICA combines process modeling, discrete-event simulation, graph-based dependency modeling, and dynamic visualizations. This is a novel convergence of two lines of research: process modeling/simulation and attack graphs. AMICA captures process flows for mission tasks as well as cyber attacker and defender tactics, techniques, and procedures (TTPs). Vulnerability dependency graphs map network attack paths, and mission-dependency graphs define the hierarchy of high-to-low-level mission requirements mapped to cyber assets. Through simulation of the resulting integrated model, we quantify impacts in terms of mission-based measures, for various mission and threat scenarios. Dynamic visualization of simulation runs provides deeper understanding of cyber warfare dynamics, for situational awareness in the context of simulated conflicts. We demonstrate our approach through a prototype tool that combines operational and systems views for rapid analysis.
READ LESS

Summary

This paper describes AMICA (Analyzing Mission Impacts of Cyber Actions), an integrated approach for understanding mission impacts of cyber attacks. AMICA combines process modeling, discrete-event simulation, graph-based dependency modeling, and dynamic visualizations. This is a novel convergence of two lines of research: process modeling/simulation and attack graphs. AMICA captures process...

READ MORE

Quantitative evaluation of moving target technology

Published in:
HST 2015, IEEE Int. Symp. on Technologies for Homeland Security, 14-16 April 2015.

Summary

Robust, quantitative measurement of cyber technology is critically needed to measure the utility, impact and cost of cyber technologies. Our work addresses this need by developing metrics and experimental methodology for a particular type of technology, moving target technology. In this paper, we present an approach to quantitative evaluation, including methodology and metrics, results of analysis, simulation and experiments, and a series of lessons learned.
READ LESS

Summary

Robust, quantitative measurement of cyber technology is critically needed to measure the utility, impact and cost of cyber technologies. Our work addresses this need by developing metrics and experimental methodology for a particular type of technology, moving target technology. In this paper, we present an approach to quantitative evaluation, including...

READ MORE