Publications

Artificial intelligence: short history, present developments, and future outlook, final report

Summary

The Director's Office at MIT Lincoln Laboratory (MIT LL) requested a comprehensive study on artificial intelligence (AI) focusing on present applications and future science and technology (S&T) opportunities in the Cyber Security and Information Sciences Division (Division 5). This report elaborates on the main results from the study. Because the AI field is evolving so rapidly, the study examined the recent past and ongoing developments to arrive at a set of findings and recommendations. It was important to begin with a short AI history and a lay-of-the-land on representative developments across the Department of Defense (DoD), intelligence communities (IC), and Homeland Security. These areas are addressed in more detail within the report. A main deliverable from the study was to formulate an end-to-end AI canonical architecture that was suitable for a range of applications. The AI canonical architecture, formulated in the study, serves as the guiding framework for all the sections in this report. Even though the study primarily focused on cyber security and information sciences, the enabling technologies are broadly applicable to many other areas. Therefore, we dedicate a full section, Section 3, to enabling technologies. The discussion on enabling technologies helps the reader clarify the distinction among AI, machine learning algorithms, and specific techniques to make an end-to-end AI system viable. To understand the lay-of-the-land in AI, study participants reached out widely within MIT LL and outside the Laboratory (government, commercial companies, defense industrial base, peers, academia, and AI centers). In addition to the study participants (shown in the next section under acknowledgements), we also assembled an internal review team (IRT).
The IRT was extremely helpful in providing feedback and in helping with the formulation of the study briefings, as we transitioned from data-gathering mode to the study synthesis. The format followed throughout the study was to highlight relevant content that substantiates the study findings, and identify a set of recommendations. An important finding is the significant AI investment by the so-called "big 6" commercial companies. These major commercial companies are Google, Amazon, Facebook, Microsoft, Apple, and IBM. They dominate in the AI ecosystem research and development (R&D) investments within the U.S. According to a recent McKinsey Global Institute report, cumulative R&D investment in AI amounts to about $30 billion per year. This amount is substantially higher than the R&D investment within the DoD, IC, and Homeland Security. Therefore, the DoD will need to be very strategic about investing where needed, while at the same time leveraging the technologies already developed and available from a wide range of commercial applications. As we will discuss in Section 1 as part of the AI history, MIT LL has been instrumental in developing advanced AI capabilities. For example, MIT LL has a long history in the development of human language technologies (HLT) by successfully applying machine learning algorithms to difficult problems in speech recognition, machine translation, and speech understanding. Section 4 elaborates on prior applications of these technologies, as well as newer applications in the context of multi-modalities (e.g., speech, text, images, and video). An end-to-end AI system is very well suited to enhancing the capabilities of human language analysis. Section 5 discusses AI's nascent role in cyber security. There have been cases where AI has already provided important benefits. However, much more research is needed in both the application of AI to cyber security and the associated vulnerability to the so-called adversarial AI.
Adversarial AI is an area very critical to the DoD, IC, and Homeland Security, where malicious adversaries can disrupt AI systems and make them untrusted in operational environments. This report concludes with specific recommendations by formulating the way forward for Division 5 and a discussion of S&T challenges and opportunities. The S&T challenges and opportunities are centered on the key elements of the AI canonical architecture to strengthen the AI capabilities across the DoD, IC, and Homeland Security in support of national security.

Secure input validation in Rust with parsing-expression grammars

Published in:
Thesis (M.E.)--Massachusetts Institute of Technology, 2019.

Summary

Accepting input from the outside world is one of the most dangerous things a system can do. Since type information is lost across system boundaries, systems must perform type-specific input handling routines to recover this information. Adversaries can carefully craft input data to exploit any bugs or vulnerabilities in these routines, thereby causing dangerous memory errors. Including input validation routines in kernels is especially risky. Sensitive memory contents and powerful privileges make kernels a preferred target of attackers. Furthermore, the fact that kernels must process user input, network data, as well as input from a wide array of peripheral devices means that including such input validation schemes is unavoidable. In this thesis we present Automatic Validation of Input Data (AVID), which helps solve the issue of input validation within kernels by automatically generating parser implementations for developer-defined structs. AVID leverages not only the unambiguity guarantees of parsing expression grammars but also the type safety guarantees of Rust. We show how AVID can be used to resolve a manufactured vulnerability in Tock, an operating system written in Rust for embedded systems. Using Rust’s procedural macro system, AVID generates parser implementations at compile time based on existing Rust struct definitions. AVID exposes a simple and convenient parser API that is able to validate input and then instantiate structs from the validated input. AVID's simple interface makes it easy for developers to use and to integrate with existing codebases.
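As a rough illustration of the validate-then-instantiate pattern the abstract describes, the following Python sketch parses a fixed-layout byte buffer into a typed record and refuses to create the record unless every check passes. It is not AVID: AVID generates Rust parsers at compile time from struct definitions, whereas this sketch hand-writes a single parser. The `Packet` layout, the field names, and the `MAX_PAYLOAD` limit are hypothetical and chosen only for the example.

```python
import struct
from dataclasses import dataclass

# Hypothetical wire layout (an assumption for illustration, not from the thesis):
#   1 byte  kind     (must be 1 or 2)
#   2 bytes length   (big-endian, must equal the payload size)
#   N bytes payload  (at most MAX_PAYLOAD bytes)
HEADER = struct.Struct(">BH")
MAX_PAYLOAD = 64


class ValidationError(ValueError):
    """Raised when input bytes do not match the expected layout."""


@dataclass(frozen=True)
class Packet:
    kind: int
    payload: bytes


def parse_packet(data: bytes) -> Packet:
    """Validate untrusted bytes, then build a Packet only if every check passes."""
    if len(data) < HEADER.size:
        raise ValidationError("input shorter than header")
    kind, length = HEADER.unpack_from(data, 0)
    if kind not in (1, 2):
        raise ValidationError(f"unknown kind {kind}")
    if length > MAX_PAYLOAD:
        raise ValidationError("declared length exceeds limit")
    payload = data[HEADER.size:]
    if len(payload) != length:
        raise ValidationError("declared length does not match payload size")
    return Packet(kind=kind, payload=payload)
```

Callers never see a partially validated object: `parse_packet` either returns a complete `Packet` or raises `ValidationError`.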

Detecting food safety risks and human trafficking using interpretable machine learning methods

Published in:
Thesis (M.S.)--Massachusetts Institute of Technology, 2019.

Summary

Black box machine learning methods have allowed researchers to design accurate models using large amounts of data at the cost of interpretability. Model interpretability not only improves user buy-in, but in many cases provides users with important information. Especially in the case of the classification problems addressed in this thesis, the ideal model should not only provide accurate predictions, but should also inform users of how features affect the results. My research goal is to solve real-world problems and compare how different classification models affect the outcomes and interpretability. To this end, this thesis is divided into two parts: food safety risk analysis and human trafficking detection. The first half analyzes the characteristics of supermarket suppliers in China that indicate a high risk of food safety violations. Contrary to expectations, supply chain dispersion, internal inspections, and quality certification systems are not found to be predictive of food safety risk in our data. The second half focuses on identifying human trafficking, specifically sex trafficking, advertisements hidden amongst online classified escort service advertisements. We propose a novel but interpretable keyword detection and modeling pipeline that is more accurate and actionable than current neural network approaches. The algorithms and applications presented in this thesis succeed in providing users with not just classifications but also the characteristics that indicate food safety risk and human trafficking ads.

Rulemaking for insider threat mitigation

Published in:
Chapter 12, Cyber Resilience of Systems and Networks, 2019, pp. 265-86.

Summary

This chapter continues the topic we started to discuss in the previous chapter – the human factors. However, it focuses on a specific method of enhancing cyber resilience via establishing appropriate rules for employees of an organization under consideration. Such rules aim at reducing threats from, for example, current or former employees, contractors, and business partners who intentionally use their authorized access to an organization to harm the organization. System users can also unintentionally contribute to cyber-attacks, or themselves become a passive target of a cyber-attack. The implementation of work-related rules is intended to decrease such risks. However, rules implementation can also increase the risks that arise from employee disregard for rules. This can occur when the rules become too restrictive, and employees become more likely to disregard the rules. Furthermore, the more often employees disregard the rules both intentionally and unintentionally, the more likely insider threats are able to observe and mimic employee behavior. This chapter shows how to find an intermediate, optimal collection of rules between the two extremes of "too many rules" and "not enough rules."

Chip-scale molecular clock

Published in:
IEEE J. Solid-State Circuits, Vol. 54, No. 4, April 2019, pp. 914-26.

Summary

An ultra-stable time-keeping device is presented, which locks its output clock frequency to the rotational-mode transition of polar gaseous molecules. Based on a high-precision spectrometer in the sub-terahertz (THz) range, our new clocking scheme realizes not only fully electronic operation but also implementations using mainstream CMOS technology. Meanwhile, the small wavelength of the probing wave and the high absorption intensity of our adopted molecules (carbonyl sulfide, ^16O^12C^32S) also enable miniaturization of the gas cell. All these result in an "atomic-clock-grade" frequency reference with small size, power, and cost. This paper provides the architectural and chip-design details of the first proof-of-concept molecular clock using a 65-nm CMOS bulk technology. Using a 231.061-GHz phase-locked loop (PLL) with frequency-shift keying (FSK) modulation and a sub-THz FET detector with integrated lock-in function, the chip probes the accurate transition frequency of carbonyl sulfide (OCS) gas inside a single-mode waveguide, and accordingly adjusts the 80-MHz output of a crystal oscillator. The clock consumes only 66 mW of dc power and has a measured Allan deviation of 3.8 × 10^−10 at an averaging time of τ = 1000 s.
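For readers unfamiliar with the stability figure quoted above, the sketch below computes a non-overlapping Allan deviation from a series of fractional-frequency readings using the standard textbook definition. It is not the authors' measurement procedure; the function name and parameters are assumptions made for illustration, and no data from the paper is reproduced.

```python
import math
from typing import Sequence


def allan_deviation(y: Sequence[float], m: int = 1) -> float:
    """Non-overlapping Allan deviation of fractional-frequency samples.

    y : fractional-frequency readings taken at a fixed interval tau0
    m : averaging factor, so the averaging time is tau = m * tau0
    """
    if m < 1:
        raise ValueError("m must be >= 1")
    n_bins = len(y) // m
    if n_bins < 2:
        raise ValueError("need at least two averaged bins")
    # Average consecutive groups of m samples.
    means = [sum(y[i * m:(i + 1) * m]) / m for i in range(n_bins)]
    # Allan variance: half the mean squared difference of adjacent averages.
    diffs_sq = [(b - a) ** 2 for a, b in zip(means, means[1:])]
    return math.sqrt(sum(diffs_sq) / (2 * len(diffs_sq)))
```

A perfectly constant frequency yields zero, and increasing `m` shows how averaging over longer times suppresses fast fluctuations.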

Detection and characterization of human trafficking networks using unsupervised scalable text template matching

Summary

Human trafficking is a form of modern-day slavery affecting an estimated 40 million victims worldwide, primarily through the commercial sexual exploitation of women and children. In the last decade, the advertising of victims has moved from the streets to websites on the Internet, providing greater efficiency and anonymity for sex traffickers. This shift has allowed traffickers to list their victims in multiple geographic areas simultaneously, while also improving operational security by using multiple methods of electronic communication with buyers, complicating the ability of law enforcement to disrupt these illicit organizations. In this paper, we address this issue and present a novel unsupervised and scalable template matching algorithm for analyzing and detecting complex organizations operating on adult service websites. The algorithm uses only the advertisement content to uncover signature patterns in text that are indicative of organized activities and organizational structure. We apply this method to a large corpus of adult service advertisements retrieved from backpage.com, and show that the networks identified through the algorithm match well with surrogate truth data derived from phone number networks in the same corpus. Further exploration of the results shows that the proposed method provides deeper insights into the complex structures of sex trafficking organizations, not possible through networks derived from phone numbers alone. This method provides a powerful new capability for law enforcement to more completely identify and gather evidence about trafficking networks and their operations.

Leveraging Intel SGX technology to protect security-sensitive applications

Published in:
17th IEEE Int. Symp. on Network Computing and Applications, NCA, 1-3 November 2018.

Summary

This paper explains the process by which Intel Software Guard Extensions (SGX) can be integrated into an existing codebase to protect a security-sensitive application. Intel SGX provides user-level applications with hardware-enforced confidentiality and integrity protections and incurs a manageable impact on performance. These protections apply to all three phases of the operational data lifecycle: at rest, in use, and in transit. SGX shrinks the trusted computing base (and therefore the attack surface) of the application to only the hardware on the CPU chip and the portion of the application's software that is executed within the protected enclave. The SDK enables SGX integration into existing C/C++ codebases while still ensuring program support for legacy and non-Intel platforms. This paper is the first published work to walk through the step-by-step process of Intel SGX integration with examples and performance results from an actual cryptographic application produced in a standard Linux development environment.

OS independent and hardware-assisted insider threat detection and prevention framework

Summary

Governmental and military institutions harbor critical infrastructure and highly confidential information. Although institutions are investing heavily in protecting their data and assets from possible outsider attacks, insiders remain an untrusted source of information leakage. Malicious software injection is one among many attacks, but turning innocent employees into malicious attackers through social attacks is the most impactful one. Malicious insiders or uneducated employees are dangerous for organizations because they are already behind the perimeter protections that guard the digital assets; in effect, they are trojans in their own right. For an insider, the easiest way to create a hole in security is to use popular and ubiquitous Universal Serial Bus (USB) devices, owing to their versatile and easy-to-use plug-and-play nature. USB storage devices are the biggest threat for contaminating mission-critical infrastructure with viruses, malware, and trojans. USB human interface devices are also dangerous, as they may connect to a host with destructive hidden functionalities. In this paper, we propose a novel hardware-assisted insider threat detection and prevention framework for the USB case. Our framework is also OS independent. We implemented a proof-of-concept design on an FPGA board, which is widely used in military settings supporting critical missions, and demonstrated the results in several experiments. Based on the results of these experiments, we show that our framework can identify rapid-keyboard key-stroke attacks and can easily detect the functionality of the USB device plugged in. We present the resource consumption of our framework on the FPGA for its utilization on a host controller device. We show that our hard-to-tamper framework introduces no overhead in USB communication in terms of user experience.

Cross-app poisoning in software-defined networking

Published in:
Proc. ACM Conf. on Computer and Communications Security, CCS, 15-18 October 2018, pp. 648-63.

Summary

Software-defined networking (SDN) continues to grow in popularity because of its programmable and extensible control plane realized through network applications (apps). However, apps introduce significant security challenges that can systemically disrupt network operations, since apps must access or modify data in a shared control plane state. If our understanding of how such data propagate within the control plane is inadequate, apps can co-opt other apps, causing them to poison the control plane's integrity. We present a class of SDN control plane integrity attacks that we call cross-app poisoning (CAP), in which an unprivileged app manipulates the shared control plane state to trick a privileged app into taking actions on its behalf. We demonstrate how role-based access control (RBAC) schemes are insufficient for preventing such attacks because they neither track information flow nor enforce information flow control (IFC). We also present a defense, ProvSDN, that uses data provenance to track information flow and serves as an online reference monitor to prevent CAP attacks. We implement ProvSDN on the ONOS SDN controller and demonstrate that information flow can be tracked with low-latency overheads.
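The gap between role-based checks and information flow control described above can be sketched with a toy model. The Python example below is not ProvSDN and does not use the ONOS API; the app names, permission table, and `ControlPlane` class are all hypothetical. Each value in the shared state carries the set of apps whose writes influenced it, and a privileged action is allowed only if every app in that set holds the privilege itself.

```python
from dataclasses import dataclass, field

# Hypothetical permission table (an assumption; not the ONOS or ProvSDN API).
PERMISSIONS = {
    "firewall_app": {"read_state", "write_state", "install_flow"},
    "guest_app": {"read_state", "write_state"},
}


@dataclass
class Entry:
    value: str
    provenance: frozenset  # every app whose writes influenced this value


@dataclass
class ControlPlane:
    state: dict = field(default_factory=dict)

    def write(self, app: str, key: str, value: str, derived_from=()) -> None:
        """Store a value, recording the writer plus the provenance of its inputs."""
        prov = {app}
        for src in derived_from:
            prov |= self.state[src].provenance
        self.state[key] = Entry(value, frozenset(prov))

    def install_flow(self, app: str, key: str) -> bool:
        """Reference-monitor check: allow the action only if the caller and
        every app in the data's provenance hold the install_flow permission."""
        entry = self.state[key]
        lacking = {a for a in entry.provenance | {app}
                   if "install_flow" not in PERMISSIONS[a]}
        return not lacking


cp = ControlPlane()
# The unprivileged app plants data in the shared state ...
cp.write("guest_app", "host_location", "port 7")
# ... and the privileged app derives a rule from it.
cp.write("firewall_app", "rule", "allow port 7", derived_from=["host_location"])
# A rule the privileged app wrote without outside influence.
cp.write("firewall_app", "clean_rule", "allow port 1")

rbac_only = "install_flow" in PERMISSIONS["firewall_app"]
blocked = not cp.install_flow("firewall_app", "rule")
allowed = cp.install_flow("firewall_app", "clean_rule")
```

In this toy model a role check alone (`rbac_only`) would let the derived rule through, because the caller is privileged; the provenance check blocks it because `guest_app` influenced the data.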

Component standards for stable microgrids

Published in:
IEEE Trans. Power Syst., Vol. 34, No. 2, pp. 852-863. 2018.

Summary

This paper is motivated by the need to ensure fast microgrid stability. Modeling for the purpose of establishing a stability criterion, together with possible implementations, is described. In particular, this paper proposes that highly heterogeneous microgrids comprising both conventional equipment and equipment based on rapidly emerging new technologies can be modeled as purely electric networks in order to provide intuitive insight into the issues of network stability. It is shown that the proposed model is valid for representing fast primary dynamics of diverse components (gensets, loads, PVs), assuming that slower variables are regulated by the higher-level controllers. Based on this modeling approach, an intuitively appealing criterion is introduced requiring that components or their combined representations must behave as closed-loop passive electrical circuits. Implementing this criterion is illustrated using a typical commercial feeder microgrid. Notably, these set the basis for standards which should be required for groups of components (sub grids) to ensure no fast instabilities in complex microgrids. Building the need for incrementally passive and monotonic characteristics into standards for network components may clarify the system-level analysis and integration of microgrids.