Publications


Artificial intelligence: short history, present developments, and future outlook, final report

Summary

The Director's Office at MIT Lincoln Laboratory (MIT LL) requested a comprehensive study on artificial intelligence (AI) focusing on present applications and future science and technology (S&T) opportunities in the Cyber Security and Information Sciences Division (Division 5). This report elaborates on the main results from the study. Because the AI field is evolving so rapidly, the study was scoped to examine the recent past and ongoing developments and distill them into a set of findings and recommendations. It was important to begin with a short history of AI and a lay of the land on representative developments across the Department of Defense (DoD), intelligence community (IC), and Homeland Security; these areas are addressed in more detail within the report.

A main deliverable from the study was the formulation of an end-to-end AI canonical architecture suitable for a range of applications. This architecture serves as the guiding framework for all sections of this report. Although the study focused primarily on cyber security and information sciences, the enabling technologies are broadly applicable to many other areas, so we dedicate a full section (Section 3) to them. The discussion of enabling technologies helps the reader distinguish among AI, machine learning algorithms, and the specific techniques required to make an end-to-end AI system viable.

To understand the lay of the land in AI, study participants conducted broad outreach within MIT LL and external to the Laboratory (government, commercial companies, the defense industrial base, peers, academia, and AI centers). In addition to the study participants (listed in the next section under acknowledgements), we assembled an internal review team (IRT). The IRT was extremely helpful in providing feedback and in shaping the study briefings as we transitioned from data-gathering mode to study synthesis. The format followed throughout the study was to highlight relevant content that substantiates the study findings and to identify a set of recommendations.

An important finding is the significant AI investment by the so-called "big 6" commercial companies: Google, Amazon, Facebook, Microsoft, Apple, and IBM. These companies dominate AI ecosystem research and development (R&D) investment within the U.S. According to a recent McKinsey Global Institute report, combined R&D investment in AI amounts to about $30 billion per year, substantially higher than the R&D investment within the DoD, IC, and Homeland Security. The DoD will therefore need to be very strategic about investing where needed, while leveraging the technologies already developed and available from a wide range of commercial applications.

As we discuss in Section 1 as part of the AI history, MIT LL has been instrumental in developing advanced AI capabilities. For example, MIT LL has a long history in human language technologies (HLT), successfully applying machine learning algorithms to difficult problems in speech recognition, machine translation, and speech understanding. Section 4 elaborates on prior applications of these technologies, as well as newer applications in the context of multiple modalities (e.g., speech, text, images, and video). An end-to-end AI system is very well suited to enhancing the capabilities of human language analysis.

Section 5 discusses AI's nascent role in cyber security. There have been cases where AI has already provided important benefits, but much more research is needed both in the application of AI to cyber security and in the associated vulnerability to so-called adversarial AI. Adversarial AI is an area of critical importance to the DoD, IC, and Homeland Security, where malicious adversaries can disrupt AI systems and render them untrusted in operational environments. The report concludes with specific recommendations formulating the way forward for Division 5, and with a discussion of S&T challenges and opportunities. These challenges and opportunities are centered on the key elements of the AI canonical architecture, to strengthen AI capabilities across the DoD, IC, and Homeland Security in support of national security.

SoK: privacy on mobile devices - it's complicated

Summary

Modern mobile devices place a wide variety of sensors and services within the personal space of their users. As a result, these devices are capable of transparently monitoring many sensitive aspects of these users' lives (e.g., location, health, or correspondences). Users typically trade access to this data for convenient applications and features, in many cases without a full appreciation of the nature and extent of the information that they are exposing to a variety of third parties. Nevertheless, studies show that users remain concerned about their privacy, and vendors have similarly been increasing their utilization of privacy-preserving technologies in these devices. Still, despite significant efforts, these technologies continue to fail in fundamental ways, leaving users' private data exposed. In this work, we survey the numerous components of mobile devices, giving particular attention to those that collect, process, or protect users' private data. Whereas the individual components have been generally well studied and understood, examining the entire mobile device ecosystem provides significant insights into its overwhelming complexity. The numerous components of this complex ecosystem are frequently built and controlled by different parties with varying interests and incentives. Moreover, most of these parties are unknown to the typical user. The technologies that are employed to protect the users' privacy typically only do so within a small slice of this ecosystem, abstracting away the greater complexity of the system. Our analysis suggests that this abstracted complexity is the major cause of many privacy-related vulnerabilities, and that a fundamentally new, holistic approach to privacy is needed going forward. We thus highlight various existing technology gaps and propose several promising research directions for addressing and reducing this complexity.

Threat-based risk assessment for enterprise networks

Published in:
Lincoln Laboratory Journal, Vol. 22, No. 1, 2016, pp. 33-45.

Summary

Protecting enterprise networks requires continuous risk assessment that automatically identifies and prioritizes cyber security risks, enables efficient allocation of cyber security resources, and enhances protection against modern cyber threats. Lincoln Laboratory created a network security model to guide the development of such risk assessments and, for the most important cyber threats, designed practical risk metrics that can be computed automatically and continuously from security-relevant network data.
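
The article keeps its model general, but the core idea, continuously recomputing a prioritization score from security-relevant network data, can be shown with a toy sketch. Everything below (field names, weights, the scoring formula) is an illustrative assumption, not the metrics designed in the article.

```python
# Toy continuous risk scoring from hypothetical security-relevant data.
# All field names, weights, and the formula are illustrative assumptions,
# not the metrics designed in the article.
from dataclasses import dataclass

@dataclass
class HostObservation:
    host: str
    unpatched_critical_vulns: int  # e.g., from a vulnerability scanner
    exposed_services: int          # e.g., from network configuration data
    days_since_last_scan: int      # staleness of our knowledge of the host

def risk_score(obs: HostObservation) -> float:
    """Combine observations into one prioritization score (higher = fix
    first). A real deployment would calibrate these placeholder weights."""
    staleness = min(obs.days_since_last_scan / 30.0, 2.0)
    return (5.0 * obs.unpatched_critical_vulns
            + 2.0 * obs.exposed_services) * (1.0 + staleness)

hosts = [
    HostObservation("web-01", unpatched_critical_vulns=3,
                    exposed_services=4, days_since_last_scan=2),
    HostObservation("db-02", unpatched_critical_vulns=1,
                    exposed_services=1, days_since_last_scan=45),
]
# "Continuous" assessment amounts to re-running this ranking as new
# security-relevant network data arrive.
for h in sorted(hosts, key=risk_score, reverse=True):
    print(f"{h.host}: {risk_score(h):.1f}")
```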

Repeatable reverse engineering for the greater good with PANDA

Published in:
37th Int. Conf. on Software Engineering, 16 May 2015.

Summary

We present PANDA, an open-source tool that has been purpose-built to support whole-system reverse engineering. It is built upon the QEMU whole-system emulator, and so analyses have access to all code executing in the guest and all data. PANDA adds the ability to record and replay executions, enabling iterative, deep, whole-system analyses. Further, the replay log files are compact and shareable, allowing for repeatable experiments. A nine-billion-instruction boot of FreeBSD, for example, is represented by only a few hundred MB. PANDA also leverages QEMU's support of thirteen different CPU architectures to make analyses of those diverse instruction sets possible within the LLVM IR. In this way, PANDA can have a single dynamic taint analysis, for example, that precisely supports many CPUs. PANDA analyses are written in a simple plugin architecture that includes a mechanism to share functionality between plugins, increasing analysis code reuse and simplifying complex analysis development. We demonstrate PANDA's effectiveness via a number of use cases, including enabling an old but legitimate version of Starcraft to run despite a lost CD key, in-depth diagnosis of an Internet Explorer crash, and uncovering the censorship activities and mechanisms of a Chinese IM client.
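
PANDA itself exposes a C/C++ plugin API on top of QEMU; the sketch below is not that API, only a minimal illustration of the record-and-replay pattern the abstract describes: capture nondeterministic inputs once, then deterministically re-drive the execution under any number of analysis "plugins".

```python
# Minimal record-and-replay pattern (illustrative only; PANDA's real
# plugin API is C/C++ on top of QEMU). Nondeterministic inputs are
# recorded once into a compact log; replaying the log re-creates the
# identical execution for any number of analysis "plugins".
import random

def record(steps: int) -> list:
    """Record phase: capture the run's nondeterministic inputs."""
    return [random.randrange(256) for _ in range(steps)]

def replay(log: list, plugins: list) -> None:
    """Replay phase: deterministically re-drive the execution from the
    log, invoking every plugin callback at each step."""
    state = 0
    for step, inp in enumerate(log):
        state = (state * 31 + inp) & 0xFFFFFFFF  # stand-in for guest execution
        for plugin in plugins:
            plugin(step, inp, state)

log = record(10)  # the shareable artifact, analogous to PANDA's replay logs
# Two independent analyses observe exactly the same execution:
replay(log, [lambda step, inp, st: print(f"trace: step={step} state={st:#010x}")])
replay(log, [lambda step, inp, st: None])  # e.g., a slow, deep taint analysis
```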

Automated assessment of secure search systems

Summary

This work presents the results of a three-year project that assessed nine different privacy-preserving data search systems. We detail the design of a software assessment framework that focuses on low system footprint, repeatability, and reusability. A unique achievement of this project was the automation and integration of the entire test process, from the production and execution of tests to the generation of human-readable evaluation reports. We synthesize our experiences into a set of simple mantras that we recommend following in the design of any assessment framework.
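
The summary describes end-to-end automation, from test production through execution to human-readable reports, without publishing the framework itself. The sketch below is a hedged illustration of that workflow only; every name and the toy system under test are hypothetical.

```python
# Hypothetical, minimal end-to-end assessment pipeline: produce tests,
# execute them against a system under test, and render a human-readable
# report. An illustration of the workflow, not the project's framework.
from dataclasses import dataclass
from typing import Callable

@dataclass
class TestCase:
    name: str
    query: str
    expected: set

def generate_tests() -> list:
    return [TestCase("exact-match", "alice", {"rec-1"}),
            TestCase("no-hit", "mallory", set())]

def run(system: Callable, tests: list) -> list:
    return [(t, set(system(t.query)) == t.expected) for t in tests]

def report(results: list) -> str:
    lines = [f"{'PASS' if ok else 'FAIL'}  {t.name}" for t, ok in results]
    passed = sum(ok for _, ok in results)
    return "\n".join(lines + [f"{passed}/{len(results)} tests passed"])

# Stand-in "secure search system": maps a query to matching record IDs.
index = {"alice": ["rec-1"]}
print(report(run(lambda q: index.get(q, []), generate_tests())))
```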

Architecture-independent dynamic information flow tracking

Published in:
CC 2013: 22nd Int. Conf. on Compiler Construction, 16-24 March 2013, pp. 144-163.

Summary

Dynamic information flow tracking is a well-known dynamic software analysis technique with a wide variety of applications, ranging from making systems more secure to helping developers and analysts better understand the code that systems are executing. Traditionally, the fine-grained analysis capabilities desired for the class of these systems that operate at the binary level require tight coupling to a specific ISA. This places a heavy burden on developers of these systems, since significant domain knowledge is required to support each ISA, and the effort expended on one ISA implementation cannot be amortized to support other ISAs. Further, the correctness of the system must be carefully evaluated for each new ISA. In this paper, we present a general approach to information flow tracking that allows us to support multiple ISAs without mastering the intricate details of each ISA we support, and without extensive verification. Our approach leverages binary translation to an intermediate representation in which we have developed detailed, architecture-neutral information flow models. To support advanced instructions that are typically implemented in C code in binary translators, we also present a combined static/dynamic analysis that allows us to accurately and automatically support these instructions. We demonstrate the utility of our system in three different application settings: enforcing information flow policies, classifying algorithms by information flow properties, and characterizing types of programs that may exhibit excessive information flow in an information flow tracking system.
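
As a minimal sketch of the underlying technique, shadow-state taint propagation over an architecture-neutral IR, the following toy interpreter tags data from an untrusted source and unions labels through data flow. The three-address IR and propagation rules are illustrative assumptions, far simpler than the paper's models.

```python
# Toy shadow-state taint tracking over a tiny architecture-neutral IR
# (illustrative assumptions; the paper's models cover a full binary-
# translation IR). Labels flow from sources to everything derived from them.
taint = {}  # temp name -> set of taint labels

def execute(instr: tuple) -> None:
    op, dst, *srcs = instr
    if op == "input":      # taint source: label external data
        taint[dst] = {srcs[0]}
    elif op == "const":    # constants carry no taint
        taint[dst] = set()
    else:                  # add/xor/mov: dst inherits the union of sources
        taint[dst] = set().union(*(taint.get(s, set()) for s in srcs))

program = [
    ("input", "t0", "network"),   # t0 := untrusted network byte
    ("const", "t1"),              # t1 := immediate value
    ("add",   "t2", "t0", "t1"),  # t2 derives from t0 -> tainted
    ("mov",   "t3", "t1"),        # t3 derives only from a constant
]
for ins in program:
    execute(ins)
print(taint["t2"])  # {'network'}: information flow from untrusted input
print(taint["t3"])  # set(): no flow
```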

Nanosatellites for Earth environmental monitoring: the MicroMAS project

Summary

The Micro-sized Microwave Atmospheric Satellite (MicroMAS) is a 3U CubeSat (34 x 10 x 10 cm, 4.5 kg) hosting a passive microwave spectrometer operating near the 118.75-GHz oxygen absorption line. The focus of the first MicroMAS mission (hereafter, MicroMAS-1) is to observe convective thunderstorms, tropical cyclones, and hurricanes from a near-equatorial orbit at approximately 500-km altitude. A MicroMAS flight unit is currently being developed in anticipation of a 2014 launch. A parabolic reflector is mechanically rotated as the spacecraft orbits the earth, thus directing a cross-track scanned beam with a FWHM beamwidth of 2.4 degrees, yielding an approximately 20-km-diameter footprint at nadir incidence from a nominal altitude of 500 km. Radiometric calibration is carried out using observations of cold space, the earth's limb, and an internal noise diode that is weakly coupled through the RF front-end electronics. A key technology feature is the development of an ultra-compact intermediate-frequency processor module for channelization, detection, and A-to-D conversion. The antenna system and RF front-end electronics are highly integrated and miniaturized. A MicroMAS-2 mission is currently being planned using a multiband spectrometer operating near 118 and 183 GHz in a sun-synchronous orbit of approximately 800-km altitude. A HyMAS-1 (Hyperspectral Microwave Atmospheric Satellite) mission with approximately 50 channels near 118 and 183 GHz is also being planned. In this paper, the mission concept of operations will be discussed, the radiometer payload will be described, and the spacecraft subsystems (avionics, power, communications, attitude determination and control, and mechanical structures) will be summarized.
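
The quoted footprint is consistent with simple geometry: treating the 2.4-degree FWHM beam as a cone viewed from the nominal 500-km altitude,

```latex
% Nadir footprint diameter d for a conical beam of full width \theta
% observed from altitude h:
d \approx 2h\tan\left(\frac{\theta}{2}\right)
  = 2 \times 500\,\mathrm{km} \times \tan(1.2^{\circ})
  \approx 21\,\mathrm{km}
```

in agreement with the approximately 20-km diameter stated above.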

Experiences in cyber security education: the MIT Lincoln Laboratory Capture-the-Flag exercise

Published in:
Proc. 4th Workshop on Cyber Security Experimentation and Test (CSET), 8 August 2011.

Summary

Many popular and well-established cyber security Capture the Flag (CTF) exercises are held each year in a variety of settings, including universities and semi-professional security conferences. CTF formats also vary greatly, ranging from linear puzzle-like challenges to team-based offensive and defensive free-for-all hacking competitions. While these events are exciting and important as contests of skill, they offer limited educational opportunities. In particular, since participation requires considerable a priori domain knowledge and practical computer security expertise, the majority of typical computer science students are excluded from taking part in these events. Our goal in designing and running the MIT/LL CTF was to make the experience accessible to a wider community by providing an environment that would not only test and challenge the computer security skills of the participants, but also educate and prepare those without extensive prior expertise. This paper describes our experience in designing, organizing, and running an education-focused CTF, and discusses our teaching methods, game design, scoring measures, logged data, and lessons learned.

Virtuoso: narrowing the semantic gap in virtual machine introspection

Published in:
2011 IEEE Symp. on Security and Privacy, 22-25 May 2011, pp. 297-312.

Summary

Introspection has featured prominently in many recent security solutions, such as virtual machine-based intrusion detection, forensic memory analysis, and low-artifact malware analysis. Widespread adoption of these approaches, however, has been hampered by the semantic gap: in order to extract meaningful information about the current state of a virtual machine, detailed knowledge of the guest operating system's inner workings is required. In this paper, we present a novel approach for automatically creating introspection tools for security applications with minimal human effort. By analyzing dynamic traces of small, in-guest programs that compute the desired introspection information, we can produce new programs that retrieve the same information from outside the guest virtual machine. We demonstrate the efficacy of our techniques by automatically generating 17 programs that retrieve security information across 3 different operating systems, and show that their functionality is unaffected by the compromise of the guest system. Our technique allows introspection tools to be effortlessly generated for multiple platforms, and enables the development of rich introspection-based security applications.
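
As a minimal illustration of the semantic gap (not Virtuoso's trace-analysis algorithm), the sketch below shows why out-of-guest introspection needs guest-internal knowledge: reading the current process from a raw memory image only works with OS-specific layout knowledge, which is exactly what Virtuoso extracts automatically from traces of in-guest programs. The task-struct layout assumed here is hypothetical.

```python
# The "semantic gap" in one function (illustrative only; not Virtuoso's
# algorithm). A hypervisor sees guest memory as raw bytes; recovering a
# meaningful fact (here, the current process) requires OS-specific layout
# knowledge. Virtuoso's contribution is learning such extraction code
# automatically from traces of small in-guest programs, instead of the
# hand-coded offsets assumed below.
import struct

# Hypothetical guest layout: a task struct at a fixed address holding a
# 32-bit pid followed by a 16-byte NUL-padded process name.
TASK_ADDR, PID_OFF, NAME_OFF, NAME_LEN = 0x1000, 0x0, 0x4, 16

def introspect_current_process(memory_image: bytes):
    (pid,) = struct.unpack_from("<I", memory_image, TASK_ADDR + PID_OFF)
    raw = memory_image[TASK_ADDR + NAME_OFF : TASK_ADDR + NAME_OFF + NAME_LEN]
    return pid, raw.split(b"\x00", 1)[0].decode()

# Fabricate an 8 KiB "guest memory image" containing one task struct.
image = bytearray(8192)
struct.pack_into("<I", image, TASK_ADDR + PID_OFF, 4242)
image[TASK_ADDR + NAME_OFF : TASK_ADDR + NAME_OFF + 5] = b"sshd\x00"
print(introspect_current_process(bytes(image)))  # (4242, 'sshd')
```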

Generating client workloads and high-fidelity network traffic for controllable, repeatable experiments in computer security

Published in:
13th Int. Symp. on Recent Advances in Intrusion Detection, 14 September 2010, pp. 218-237.

Summary

Rigorous scientific experimentation in system and network security remains an elusive goal. Recent work has outlined three basic requirements for experiments: hypotheses must be falsifiable, experiments must be controllable, and experiments must be repeatable and reproducible. Despite their simplicity, these goals are difficult to achieve, especially when dealing with client-side threats and defenses, where user input is often required as part of the experiment. In this paper, we present techniques for making security experiments that involve client-side desktop applications (e.g., web browsers, PDF readers, host-based firewalls, and intrusion detection systems) more controllable and more easily repeatable. First, we present techniques for using statistical models of user behavior to drive real, binary, GUI-enabled application programs in place of a human user. Second, we present techniques based on adaptive replay of application dialog that allow us to quickly and efficiently reproduce reasonable mock-ups of remotely hosted applications to give the illusion of Internet connectedness on an isolated testbed. We demonstrate the utility of these techniques in an example experiment comparing the system resource consumption of a Windows machine running anti-virus protection versus an unprotected system.
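
As a sketch of the first technique only, a statistical user-behavior model can be as simple as a first-order Markov chain over user actions. The states, transition probabilities, and fixed seed below are illustrative assumptions; the actual system drives real GUI-enabled binaries rather than printing actions.

```python
# Illustrative first-order Markov model of user behavior (states and
# probabilities are made up). A real harness would translate each
# generated action into GUI events against an actual application.
import random

TRANSITIONS = {
    "idle":       [("browse_web", 0.5), ("read_pdf", 0.2), ("idle", 0.3)],
    "browse_web": [("click_link", 0.6), ("idle", 0.4)],
    "click_link": [("browse_web", 0.7), ("idle", 0.3)],
    "read_pdf":   [("read_pdf", 0.5), ("idle", 0.5)],
}

def next_action(state: str, rng: random.Random) -> str:
    actions, weights = zip(*TRANSITIONS[state])
    return rng.choices(actions, weights=weights)[0]

# A fixed seed makes the generated workload repeatable, matching the
# paper's goal of controllable, repeatable experiments.
rng = random.Random(7)
state = "idle"
for _ in range(10):
    state = next_action(state, rng)
    print(state)
```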
