Publications

Very large graphs for information extraction (VLG) - detection and inference in the presence of uncertainty

Summary

In numerous application domains relevant to the Department of Defense and the Intelligence Community, data of interest take the form of entities and the relationships between them, and these data are commonly represented as graphs. Under the Very Large Graphs for Information Extraction effort, a one-year proof-of-concept study, MIT LL developed novel techniques for anomalous subgraph detection, building on tools in the signal processing research literature. This report documents the technical results of this effort. Two datasets, a snapshot of Thomson Reuters' Web of Science database and a stream of web proxy logs, were parsed, and graphs were constructed from the raw data. From the phenomena in these datasets, several algorithms were developed to model the dynamic graph behavior, including a preferential attachment mechanism with memory, a streaming filter to model a graph as a weighted average of its past connections, and a generalized linear model for graphs where connection probabilities are determined by additional side information or metadata. A set of metrics was also constructed to facilitate comparison of techniques. The study culminated in a demonstration of the algorithms on the datasets of interest, in addition to simulated data. Performance in terms of detection, estimation, and computational burden was measured according to the metrics. Among the highlights of this demonstration were the detection of emerging coauthor clusters in the Web of Science data, detection of botnet activity in the web proxy data after 15 minutes (which took 10 days to detect using state-of-the-practice techniques), and demonstration of the core algorithm on a simulated 1-billion-vertex graph using a commodity computing cluster.
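
As a rough illustration of the "weighted average of its past connections" idea, the sketch below keeps an exponentially weighted moving average of edge weights and reports how far a new snapshot departs from it. The exponential weighting, the forgetting factor `alpha`, and the function names are assumptions made for this example; the summary does not specify the filter the report actually uses.

```python
def update_expected(expected, snapshot, alpha=0.5):
    """Blend a new snapshot into the running expectation of edge weights.

    expected: dict mapping (u, v) -> expected weight from past snapshots
    snapshot: dict mapping (u, v) -> weight observed in the newest snapshot
    alpha:    hypothetical forgetting factor in (0, 1]; larger forgets faster
    """
    edges = set(expected) | set(snapshot)
    return {
        e: (1 - alpha) * expected.get(e, 0.0) + alpha * snapshot.get(e, 0.0)
        for e in edges
    }


def residuals(expected, snapshot):
    """Observed minus expected weight for every edge seen so far."""
    edges = set(expected) | set(snapshot)
    return {e: snapshot.get(e, 0.0) - expected.get(e, 0.0) for e in edges}


# Toy stream: edge (b, c) appears only in the second snapshot.
snapshots = [
    {("a", "b"): 1.0},
    {("a", "b"): 1.0, ("b", "c"): 1.0},
]

expected = {}
for snapshot in snapshots:
    surprise = residuals(expected, snapshot)  # compare before updating
    expected = update_expected(expected, snapshot, alpha=0.5)
```

In this toy stream the newly appearing edge has a larger residual than the edge already seen, which is the kind of departure from expected behavior an anomaly detector could examine further.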

Advisory services for user composition tools

Summary

We have developed an ontology-based framework that evaluates compatibility between processing modules within an end-user development framework, using MIT Lincoln Laboratory's Composable Analytics environment as a test case. In particular, we focus on inter-module semantic compatibility as well as compatibility between data and modules. Our framework includes a core ontology that provides an extensible vocabulary that can describe module attributes, module input and output requirements and preferences, and data characteristics that are pertinent to selecting appropriate modules in a given situation. Based on the ontological description of the modules and data, we first present a framework that takes a rule-based approach to measuring semantic compatibility. Later, we extend the rule-based approach to a flexible fuzzy-logic-based semantic compatibility evaluator. We have built an initial simulator to test module compatibility under varying situations. The simulator takes in the ontological description of the modules and data and calculates semantic compatibility. We believe the framework and simulation environment together will both help developers test new modules they create and support end users in composing new capabilities. In this paper, we describe the details of the framework, the simulation environment, and our iterative process in developing the module ontology.
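
The difference between a rule-based and a fuzzy evaluator can be sketched as follows. Everything here is hypothetical: the attribute names, the similarity table, and the use of the minimum as the fuzzy AND are assumptions chosen for illustration, not details of the framework described above.

```python
def rule_based_compatible(output_attrs, input_requirements):
    """Crisp rule: compatible only if every required attribute matches exactly."""
    return all(output_attrs.get(k) == v for k, v in input_requirements.items())


def fuzzy_compatibility(output_attrs, input_requirements, similarity):
    """Degree of compatibility in [0, 1]: fuzzy AND (minimum) over attributes."""
    degrees = [
        similarity(attr, output_attrs.get(attr), wanted)
        for attr, wanted in input_requirements.items()
    ]
    return min(degrees, default=1.0)


# Hypothetical partial-match degrees between attribute values.
SIMILAR = {("format", "tsv", "csv"): 0.8}


def similarity(attr, have, want):
    """Exact matches score 1.0; listed near-matches score their table value."""
    if have == want:
        return 1.0
    return SIMILAR.get((attr, have, want), 0.0)


producer_output = {"format": "tsv", "units": "meters"}
consumer_input = {"format": "csv", "units": "meters"}

crisp = rule_based_compatible(producer_output, consumer_input)
graded = fuzzy_compatibility(producer_output, consumer_input, similarity)
```

The crisp rule rejects this pairing outright, while the fuzzy score records that the modules are nearly compatible, which is the added flexibility a fuzzy evaluator offers.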

Bayesian discovery of threat networks

Published in:
IEEE Trans. Signal Process., Vol. 62, No. 20, 15 October 2014, pp. 5324-38.

Summary

A novel unified Bayesian framework for network detection is developed, under which a detection algorithm is derived based on random walks on graphs. The algorithm detects threat networks using partial observations of their activity, and is proved to be optimum in the Neyman-Pearson sense. The algorithm is defined by a graph, at least one observation, and a diffusion model for threat. A link to well-known spectral detection methods is provided, and the equivalence of the random walk and harmonic solutions to the Bayesian formulation is proven. A general diffusion model is introduced that utilizes spatio-temporal relationships between vertices, and is used for a specific space-time formulation that leads to significant performance improvements on coordinated covert networks. This performance is demonstrated using a new hybrid mixed-membership blockmodel introduced to simulate random covert networks with realistic properties.
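
The link between random walks and harmonic solutions can be illustrated with a small sketch. It assumes a simplified model that is not taken from the paper: a walk started at a vertex terminates with a fixed probability `absorb` at each step and otherwise moves to a uniformly chosen neighbor, and a vertex's score is the probability that the walk reaches an observed vertex before terminating. That hitting probability satisfies an averaging (harmonic-style) fixed-point equation, which the code solves by repeated averaging.

```python
def threat_scores(neighbors, observed, absorb=0.1, iterations=200):
    """Approximate hitting probabilities by fixed-point (Jacobi) iteration.

    neighbors:  dict mapping vertex -> list of adjacent vertices
    observed:   set of vertices with observed activity (score pinned to 1)
    absorb:     hypothetical per-step termination probability of the walk
    iterations: number of averaging sweeps
    """
    score = {v: (1.0 if v in observed else 0.0) for v in neighbors}
    for _ in range(iterations):
        new_score = {}
        for v, nbrs in neighbors.items():
            if v in observed:
                new_score[v] = 1.0
            elif nbrs:
                mean = sum(score[u] for u in nbrs) / len(nbrs)
                new_score[v] = (1 - absorb) * mean
            else:
                new_score[v] = 0.0  # isolated vertex can never reach a cue
        score = new_score
    return score


# Path graph 0 - 1 - 2 with activity observed at vertex 0.
path = {0: [1], 1: [0, 2], 2: [1]}
scores = threat_scores(path, observed={0}, absorb=0.5)
```

For this path graph the fixed-point equations can be solved by hand, giving 2/7 for vertex 1 and 1/7 for vertex 2, so the score decays with distance from the observation.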

Geospatial analysis based on GIS integrated with LADAR

Summary

In this work, we describe multi-layered analyses of a high-resolution broad-area LADAR data set in support of expeditionary activities. High-level features are extracted from the LADAR data, such as the presence and location of buildings and cars, and then these features are used to populate a GIS (geographic information system) tool. We also apply line-of-sight (LOS) analysis to develop a path-planning module. Finally, visualization is addressed and enhanced with a gesture-based control system that allows the user to navigate through the enhanced data set in a virtual immersive experience. This work has operational applications including military, security, disaster relief, and task-based robotic path planning.

Leading the charge - microgrids for domestic military installations

Published in:
IEEE Power & Energy Magazine, Vol. 11, No. 4, July/August 2013, pp. 40-5.

Summary

In today's interconnected battlefield, our war fighters are increasingly reliant on capabilities at domestic military installations to support critical missions, often in near real time. Many of the domestic installations of the U.S. Department of Defense (DoD) also support everything from sensitive research and development facilities such as microelectronics and biological laboratories to large industrial plants such as shipyards and aviation depots. These facilities depend on the electricity provided by the commercial electric grid. Extended-duration outages on the domestic electric grid will therefore both significantly affect the operational mission of the DoD and bring substantial economic consequences. The changing nature of electricity markets presents new opportunities for the DoD to reduce electricity costs while addressing its energy security needs. Demand response, ancillary service markets, and real-time pricing offer large consumers of electricity such as military installations a significant opportunity to use installation assets during grid-tied operation. Nevertheless, this is an opportunity the DoD can only exploit if it does so in a secure fashion, well protected from cyber threats.

Estimation of Causal Peer Influence Effects

Published in:
International Conference on Machine Learning, 17-19 June 2013

Summary

The broad adoption of social media has generated interest in leveraging peer influence for inducing desired user behavior. Quantifying the causal effect of peer influence presents technical challenges, however, including how to deal with social interference, complex response functions, and network uncertainty. In this paper, we extend potential outcomes to allow for interference, we introduce well-defined causal estimands of peer influence, and we develop two estimation procedures: a frequentist procedure relying on a sequential randomization design that requires knowledge of the network but operates under complicated response functions, and a Bayesian procedure which accounts for network uncertainty but relies on a linear response assumption to increase estimation precision. Our results show the advantages and disadvantages of the proposed methods in a number of situations.

Interdependence of the electricity generation system and the natural gas system and implications for energy security

Published in:
MIT Lincoln Laboratory Report TR-1173

Summary

Concern about energy security on domestic Department of Defense installations has led to the possibility of using natural gas-fired electricity generators to provide power in the event of electric grid failures. The natural gas system in the United States is partly dependent on electricity for its ability to deliver natural gas from the well-head to the consumer, but it also uses natural gas from the system itself to fuel some of the drilling rigs, processing units, and pipeline compressors. The vulnerability of the system to a disruption in the national electricity supply network varies depending on the cause and breadth of the disruption and where in the country one is located relative to that disruption, as the interconnected nature of transmission pipelines, the penetration of electric motor-driven compressors and other equipment, and the availability of nearby gas production, import terminals, or storage vary. In general, the gas supply system is reliable for short-term, limited-area disruptions in the electricity supply, and firm delivery contracts for natural gas increase the likelihood of continued operation. However, for disruptions that cover large sections of the electric grid, encompassing areas from extraction wells to customers, and that outlast the gas available in storage or deliverable from elsewhere within transmission pipeline constraints, contractual force majeure limits will come into play, rendering the firm delivery contracts void. In that case, operation of gas-fueled power generation systems that are not dual-fuel capable for longer than weeks to a few months (depending on time of year) will be unlikely. Several weather-related outages in recent years have provided limited case studies showing the system's resilience, but no long-term, widespread electricity grid failures have occurred.

Tower Flight Data Manager benefits assessment: initial investment decision interim report

Summary

This document provides an overview of MIT Lincoln Laboratory's activities in support of the interim stage of the Initial Investment Decision benefits assessment for the Tower Flight Data Manager (TFDM). It outlines the rationale for the focus areas, and the background, methodology, and scope in the focus areas of departure metering, sequence optimization, airport configuration optimization, and safety assessment. Estimates of the potential benefits enabled by TFDM deployment are presented for each of these areas for a subset of airports and conditions considered within the scope of the analyses. These benefits are monetized where possible. Recommendations for follow-on work, for example, to support future benefits assessment efforts for TFDM, are also discussed.

Microgrid study: energy security for DoD installations

Summary

Growing concerns about the vulnerability of the electric grid, uncertainty about the cost of oil, and an increase in the deployment of renewable generation on domestic military installations have all led the Department of Defense (DoD) to reconsider its strategy for providing energy security for critical domestic operations. Existing solutions typically use dedicated backup generators to service each critical load. For large installations, this can result in over 50 small generators, each servicing a low-voltage feeder to an individual building. The system as a whole is typically not well integrated internally, with nearby renewable assets, or with the larger external grid. As a result, system performance is not optimized for efficient, reactive, and sustainable operations across the installation in the event of a power outage or in response to periods of high stress on the grid. Recent advances in energy management systems and power electronics provide an opportunity to interconnect multiple sources and loads into an integrated system that can then be optimized for reliability, efficiency, and/or cost. These integrated energy systems, or microgrids, are the focus of this study. The study was performed with the goals of (1) achieving a better understanding of the current microgrid efforts across DoD installations, specifically those that were in place or underway by the end of FY11, (2) categorizing the efforts with a consistent typology based on common, measurable parameters, and (3) performing cost-benefit trades for different microgrid architectures. This report summarizes the results of several months of analysis and provides insight into opportunities for increased energy security, efficiency, and the incorporation of renewable and distributed energy resources into microgrids, as well as the factors that might facilitate or impede implementation.

Dynamic Distributed Dimensional Data Model (D4M) database and computation system

Summary

A crucial element of large web companies is their ability to collect and analyze massive amounts of data. Tuple store databases are a key enabling technology employed by many of these companies (e.g., Google Bigtable and Amazon Dynamo). Tuple stores are highly scalable and run on commodity clusters, but lack interfaces to support efficient development of mathematically based analytics. D4M (Dynamic Distributed Dimensional Data Model) has been developed to provide a mathematically rich interface to tuple stores (and Structured Query Language (SQL) databases). D4M allows linear algebra to be readily applied to databases. Using D4M, it is possible to create composable analytics with significantly less effort than using traditional approaches. This work describes the D4M technology and its application and performance.
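
To give a sense of what applying linear algebra to tuple data can look like, the sketch below stores (row, column, value) tuples as a sparse array keyed by strings and multiplies the array's transpose by the array to count column co-occurrences. This is plain illustrative Python, not the D4M interface; the function names and the sample tuples are assumptions made for the example.

```python
from collections import defaultdict


def to_assoc(triples):
    """Build a sparse array {row_key: {col_key: value}} from (row, col, value)."""
    assoc = defaultdict(dict)
    for row, col, val in triples:
        assoc[row][col] = assoc[row].get(col, 0) + val
    return dict(assoc)


def transpose(a):
    """Swap row and column keys."""
    t = defaultdict(dict)
    for row, cols in a.items():
        for col, val in cols.items():
            t[col][row] = val
    return dict(t)


def matmul(a, b):
    """Sparse product over string keys: C[i][j] = sum_k A[i][k] * B[k][j]."""
    c = defaultdict(dict)
    for i, row in a.items():
        for k, a_ik in row.items():
            for j, b_kj in b.get(k, {}).items():
                c[i][j] = c[i].get(j, 0) + a_ik * b_kj
    return dict(c)


# Hypothetical tuples: which documents contain which words.
triples = [
    ("doc1", "word|graph", 1),
    ("doc1", "word|detect", 1),
    ("doc2", "word|graph", 1),
]

A = to_assoc(triples)
cooccurrence = matmul(transpose(A), A)  # word-by-word co-occurrence counts
```

Here a single matrix product answers a question (how often do two words share a document?) that would otherwise take explicit loops or joins over the tuples.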