Publications

Artificial intelligence: short history, present developments, and future outlook, final report

Summary

The Director's Office at MIT Lincoln Laboratory (MIT LL) requested a comprehensive study on artificial intelligence (AI) focusing on present applications and future science and technology (S&T) opportunities in the Cyber Security and Information Sciences Division (Division 5). This report elaborates on the main results from the study. Because the AI field is evolving so rapidly, the study examined the recent past and ongoing developments in order to arrive at a set of findings and recommendations. It was important to begin with a short AI history and a lay of the land on representative developments across the Department of Defense (DoD), intelligence communities (IC), and Homeland Security. These areas are addressed in more detail within the report. A main deliverable from the study was to formulate an end-to-end AI canonical architecture that was suitable for a range of applications. The AI canonical architecture, formulated in the study, serves as the guiding framework for all the sections in this report. Even though the study primarily focused on cyber security and information sciences, the enabling technologies are broadly applicable to many other areas. Therefore, we dedicate Section 3 to enabling technologies. The discussion on enabling technologies helps clarify for the reader the distinctions among AI, machine learning algorithms, and the specific techniques that make an end-to-end AI system viable. To understand the lay of the land in AI, study participants reached out fairly widely, both within MIT LL and outside the Laboratory (government, commercial companies, defense industrial base, peers, academia, and AI centers). In addition to the study participants (shown in the next section under acknowledgements), we also assembled an internal review team (IRT).
The IRT was extremely helpful in providing feedback and in helping with the formulation of the study briefings, as we transitioned from data-gathering mode to the study synthesis. The format followed throughout the study was to highlight relevant content that substantiates the study findings, and identify a set of recommendations. An important finding is the significant AI investment by the so-called "big 6" commercial companies. These major commercial companies are Google, Amazon, Facebook, Microsoft, Apple, and IBM. They dominate research and development (R&D) investment in the AI ecosystem within the U.S. According to a recent McKinsey Global Institute report, cumulative R&D investment in AI amounts to about $30 billion per year. This amount is substantially higher than the R&D investment within the DoD, IC, and Homeland Security. Therefore, the DoD will need to be very strategic about investing where needed, while at the same time leveraging the technologies already developed and available from a wide range of commercial applications. As we will discuss in Section 1 as part of the AI history, MIT LL has been instrumental in developing advanced AI capabilities. For example, MIT LL has a long history in the development of human language technologies (HLT) by successfully applying machine learning algorithms to difficult problems in speech recognition, machine translation, and speech understanding. Section 4 elaborates on prior applications of these technologies, as well as newer applications in the context of multi-modalities (e.g., speech, text, images, and video). An end-to-end AI system is very well suited to enhancing the capabilities of human language analysis. Section 5 discusses AI's nascent role in cyber security. There have been cases where AI has already provided important benefits. However, much more research is needed in both the application of AI to cyber security and the associated vulnerability to so-called adversarial AI.
Adversarial AI is an area very critical to the DoD, IC, and Homeland Security, where malicious adversaries can disrupt AI systems and make them untrusted in operational environments. This report concludes with specific recommendations by formulating the way forward for Division 5 and a discussion of S&T challenges and opportunities. The S&T challenges and opportunities are centered on the key elements of the AI canonical architecture to strengthen the AI capabilities across the DoD, IC, and Homeland Security in support of national security.

Classifier performance estimation with unbalanced, partially labeled data

Published in:
Proc. Machine Learning Research, Vol. 88, 2018, pp. 4-16.

Summary

Class imbalance and lack of ground truth are two significant problems in modern machine learning research. These problems are especially pressing in operational contexts where the total number of data points is extremely large and the cost of obtaining labels is very high. In the face of these issues, accurate estimation of the performance of a detection or classification system is crucial to inform decisions based on the observations. This paper presents a framework for estimating performance of a binary classifier in such a context. We focus on the scenario where each set of measurements has been reduced to a score, and the operator only investigates data when the score exceeds a threshold. The operator is blind to the number of missed detections, so performance estimation targets two quantities: recall and the derivative of precision with respect to recall. Measuring with respect to error in these two metrics, simulations in this context demonstrate that labeling outliers not only outperforms random labeling, but often matches performance of an adaptive method that attempts to choose the optimal data for labeling. Application to real anomaly detection data confirms the utility of the approach, and suggests direction for future work.
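
As a point of reference for the two quantities named above, the sketch below computes precision and recall for a score threshold. It assumes every item is labeled, which is exactly what the paper's setting lacks, so it only illustrates the definitions and is not the estimation framework itself. The function name and the toy scores and labels are hypothetical.

```python
def precision_recall_at_threshold(scores, labels, threshold):
    """Return (precision, recall) when items with score > threshold are flagged.

    Assumes fully labeled data (labels are 0 or 1); this is an illustrative
    simplification, not the partially labeled setting of the paper.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s > threshold and y == 1)   # flagged, truly positive
    fp = sum(1 for s, y in pairs if s > threshold and y == 0)   # flagged, truly negative
    fn = sum(1 for s, y in pairs if s <= threshold and y == 1)  # missed detections
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: six scored items, three of them true positives.
scores = [0.95, 0.80, 0.60, 0.40, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 0]
precision, recall = precision_recall_at_threshold(scores, labels, threshold=0.5)
print(precision, recall)
```

In the operational setting described in the summary, the missed-detection count (`fn` here) is not observable, which is why recall has to be estimated rather than computed directly.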

Intersection and convex combination in multi-source spectral planted cluster detection

Published in:
IEEE Global Conf. on Signal and Information Processing, GlobalSIP, 7-9 December 2016.

Summary

Planted cluster detection is an important form of signal detection when the data are in the form of a graph. When there are multiple graphs representing multiple connection types, the method of aggregation can have significant impact on the results of a detection algorithm. This paper addresses the tradeoff between two possible aggregation methods: convex combination and intersection. For a spectral detection method, convex combination dominates when the cluster is relatively sparse in at least one graph, while the intersection method dominates in cases where it is dense across graphs. Experimental results confirm the theory. We consider the context of adversarial cluster placement, and determine how an adversary would distribute connections among the graphs to best avoid detection.
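
The two aggregation methods can be sketched on a pair of small adjacency matrices. This is a minimal illustration assuming unweighted, undirected graphs over the same vertex set; the helper names, the weight of 0.5, and the toy edges are hypothetical and are not taken from the paper.

```python
def adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1
    return a


def convex_combination(a, b, weight):
    """Entrywise weight * a + (1 - weight) * b, with 0 <= weight <= 1."""
    return [[weight * x + (1 - weight) * y for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def intersection(a, b):
    """Keep an edge only if it is present in both graphs."""
    return [[1 if x and y else 0 for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]


# Two hypothetical connection types over the same four vertices.
graph_a = adjacency(4, [(0, 1), (1, 2), (2, 3)])
graph_b = adjacency(4, [(0, 1), (1, 2), (0, 2)])

combined = convex_combination(graph_a, graph_b, weight=0.5)
common = intersection(graph_a, graph_b)
```

Edges present in only one graph survive the convex combination at reduced weight but vanish under intersection, which is the tradeoff the summary describes in terms of cluster sparsity and density.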

Analytical models and methods for anomaly detection in dynamic, attributed graphs

Published in:
Chapter 2, Computational Network Analysis with R: Applications in Biology, Medicine, and Chemistry, 2017, pp. 35-61.

Summary

This chapter is devoted to anomaly detection in dynamic, attributed graphs. There has been a great deal of research on anomaly detection in graphs over the last decade, with a variety of methods proposed. This chapter discusses recent methods for anomaly detection in graphs, with a specific focus on detection within backgrounds based on random graph models. This sort of analysis can be applied for a variety of background models, which can incorporate topological dynamics and attributes of vertices and edges. The authors have developed a framework for anomalous subgraph detection in random background models, based on linear algebraic features of a graph. This includes an implementation in R that exploits structure in the random graph model for computationally tractable analysis of residuals. This chapter outlines this framework within the context of analyzing dynamic, attributed graphs. The remainder of this chapter is organized as follows. Section 2.2 defines the notation used within the chapter. Section 2.3 briefly describes a variety of perspectives and techniques for anomaly detection in graph-based data. Section 2.4 provides an overview of models for graph behavior that can be used as backgrounds for anomaly detection. Section 2.5 describes our framework for anomalous subgraph detection via spectral analysis of residuals, after the data are integrated over time. Section 2.6 discusses how the method described in Section 2.5 can be efficiently implemented in R using open source packages. Section 2.7 demonstrates the power of this technique in controlled simulation, considering the effects of both dynamics and attributes on detection performance. Section 2.8 gives a data analysis example within this context, using an evolving citation graph based on a commercially available document database of public scientific literature. Section 2.9 summarizes the chapter and discusses ongoing research in this area.

Feedback-based social media filtering tool for improved situational awareness

Published in:
15th Annual IEEE Int. Symp. on Technologies for Homeland Security, HST 2016, 10-12 May 2016.

Summary

This paper describes a feature-rich model of data relevance, designed to aid first responder retrieval of useful information from social media sources during disasters or emergencies. The approach is meant to address the failure of traditional keyword-based methods to sufficiently suppress clutter during retrieval. The model iteratively incorporates relevance feedback to update feature space selection and classifier construction across a multimodal set of diverse content characterization techniques. This approach is advantageous because the aspects of the data (or even the modalities of the data) that signify relevance cannot always be anticipated ahead of time. Experiments with both microblog text documents and coupled imagery and text documents demonstrate the effectiveness of this model on sample retrieval tasks, in comparison to more narrowly focused models operating in a priori selected feature spaces. The experiments also show that even relatively low feedback levels (i.e., tens of examples) can lead to a significant performance boost during the interactive retrieval process.

Assessing functional neural connectivity as an indicator of cognitive performance

Published in:
5th NIPS Workshop on Machine Learning and Interpretation in Neuroimaging, MLINI 2015, 11-12 December 2015.

Summary

Studies in recent years have demonstrated that neural organization and structure impact an individual's ability to perform a given task. Specifically, individuals with greater neural efficiency have been shown to outperform those with less organized functional structure. In this work, we compare the predictive ability of properties of neural connectivity on a working memory task. We provide two novel approaches for characterizing functional network connectivity from electroencephalography (EEG), and compare these features to the average power across frequency bands in EEG channels. Our first novel approach represents functional connectivity structure through the distribution of eigenvalues making up channel coherence matrices in multiple frequency bands. Our second approach creates a connectivity network at each frequency band, and assesses variability in average path lengths of connected components and degree across the network. Failures in digit and sentence recall on single trials are detected using a Gaussian classifier for each feature set, at each frequency band. The classifier results are then fused across frequency bands, with the resulting detection performance summarized using the area under the receiver operating characteristic curve (AUC) statistic. Fused AUC results of 0.63/0.58/0.61 for digit recall failure and 0.58/0.59/0.54 for sentence recall failure are obtained from the connectivity structure, graph variability, and channel power features, respectively.

Improved hidden clique detection by optimal linear fusion of multiple adjacency matrices

Published in:
2015 Asilomar Conf. on Signals, Systems and Computers, 8-11 November 2015.

Summary

Graph fusion has emerged as a promising research area for addressing challenges associated with noisy, uncertain, multi-source data. While many ad-hoc graph fusion techniques exist in the current literature, an analytical approach for analyzing the fundamentals of the graph fusion problem is lacking. We consider the setting where we are given multiple Erdős-Rényi modeled adjacency matrices containing a common hidden or planted clique. The objective is to combine them linearly so that the principal eigenvectors of the resulting matrix best reveal the vertices associated with the clique. We utilize recent results from random matrix theory to derive the optimal weighting coefficients and use these insights to develop a data-driven fusion algorithm. We demonstrate the improved performance of the algorithm relative to other simple heuristics.
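
The fuse-then-inspect-the-eigenvector idea can be sketched as follows. This is only an illustration under stated assumptions: the equal weights are placeholders rather than the optimal coefficients the paper derives, the principal eigenvector is approximated with plain power iteration (which assumes a single dominant eigenvalue), and the tiny hand-built graphs stand in for random graph draws. All function names are hypothetical.

```python
def adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1
    return a


def fuse(matrices, weights):
    """Linear fusion: entrywise weighted sum of the adjacency matrices."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices))
             for j in range(n)] for i in range(n)]


def principal_eigenvector(matrix, iterations=200):
    """Approximate the principal eigenvector by power iteration."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            return v
        v = [x / norm for x in w]
    return v


# Both hypothetical graphs contain the clique {0, 1, 2}; each has its own noise edge.
clique = [(0, 1), (0, 2), (1, 2)]
graph_1 = adjacency(6, clique + [(3, 4)])
graph_2 = adjacency(6, clique + [(4, 5)])

fused = fuse([graph_1, graph_2], weights=[0.5, 0.5])  # placeholder weights
vector = principal_eigenvector(fused)

# Vertices with the largest eigenvector entries are the clique candidates.
candidates = sorted(range(6), key=lambda i: -abs(vector[i]))[:3]
print(sorted(candidates))
```

Edges shared by both graphs keep full weight after fusion while source-specific noise edges are down-weighted, so the leading eigenvector concentrates on the common clique in this toy case.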

Residuals-based subgraph detection with cue vertices

Published in:
2015 Asilomar Conf. on Signals, Systems and Computers, 8-11 November 2015.

Summary

A common problem in modern graph analysis is the detection of communities, an example of which is the detection of a single anomalously dense subgraph. Recent results have demonstrated a fundamental limit for this problem when using spectral analysis of modularity. In this paper, we demonstrate the implication of these results on subgraph detection when a cue vertex is provided, indicating one of the vertices in the community of interest. Several recent algorithms for local community detection are applied in this context, and we compare their empirical performance to that of the simple method used to derive the theoretical detection limits.
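
For readers unfamiliar with the term, the sketch below builds the standard modularity matrix, B = A - k k^T / (2m), where k is the degree vector and m the number of edges. It is assumed here, not confirmed by the summary, that this is the residual matrix whose spectrum the cited results analyze; the function names and the toy graph are hypothetical.

```python
def adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1
    return a


def modularity_matrix(adj):
    """Return B with B[i][j] = A[i][j] - k_i * k_j / (2m).

    The subtracted term is the expected number of edges between i and j
    under a degree-preserving random background, so B holds the residuals.
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # each undirected edge is counted twice
    return [[adj[i][j] - degrees[i] * degrees[j] / two_m for j in range(n)]
            for i in range(n)]


# Hypothetical graph: a triangle {0, 1, 2} with a pendant vertex 3.
graph = adjacency(4, [(0, 1), (0, 2), (1, 2), (2, 3)])
residuals = modularity_matrix(graph)
row_sums = [sum(row) for row in residuals]
```

Each row of the modularity matrix sums to zero, so the residuals measure only deviation from the expected connectivity, not overall density.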

Sampling operations on big data

Published in:
2015 Asilomar Conf. on Signals, Systems and Computers, 8-11 November 2015.

Summary

The 3Vs -- Volume, Velocity and Variety -- of Big Data continue to be a large challenge for systems and algorithms designed to store, process and disseminate information for discovery and exploration under real-time constraints. Common signal processing operations such as sampling and filtering, which have been used for decades to compress signals, are often undefined in data that are characterized by heterogeneity, high dimensionality, and lack of known structure. In this article, we describe and demonstrate an approach to sample large datasets such as social media data. We evaluate the effect of sampling on a common predictive analytic: link prediction. Our results indicate that sampling a dataset heavily can still yield meaningful link prediction results.

Sampling large graphs for anticipatory analytics

Published in:
HPEC 2015: IEEE Conf. on High Performance Extreme Computing, 15-17 September 2015.

Summary

The characteristics of Big Data - often dubbed the 3V's for volume, velocity, and variety - will continue to outpace the ability of computational systems to process, store, and transmit meaningful results. Traditional techniques for dealing with large datasets often include the purchase of larger systems, greater human-in-the-loop involvement, or more complex algorithms. We are investigating the use of sampling to mitigate these challenges, specifically sampling large graphs. Often, large datasets can be represented as graphs where data entries may be edges, and vertices may be attributes of the data. In particular, we present the results of sampling for the task of link prediction. Link prediction is a process to estimate the probability of a new edge forming between two vertices of a graph, and it has numerous application areas in understanding social or biological networks. In this paper we propose a series of techniques for the sampling of large datasets. In order to quantify the effect of these techniques, we present the quality of link prediction tasks on sampled graphs, and the time saved in calculating link prediction statistics on these sampled graphs.
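
A minimal sketch of the sample-then-predict workflow is shown below. The summary does not name the sampling techniques or the link prediction statistic, so uniform random edge sampling and the common-neighbors score are used here purely as stand-ins; the function names, seed, sampling fraction, and toy edge list are all hypothetical.

```python
import random


def build_neighbors(edges):
    """Map each vertex to the set of its neighbors in an undirected graph."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors


def common_neighbor_score(neighbors, u, v):
    """Link prediction score: number of neighbors shared by u and v."""
    return len(neighbors.get(u, set()) & neighbors.get(v, set()))


def sample_edges(edges, fraction, seed=0):
    """Keep a uniformly random subset of the edges (stand-in sampling scheme)."""
    rng = random.Random(seed)
    keep = max(1, round(fraction * len(edges)))
    return rng.sample(edges, keep)


# Hypothetical toy graph.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]

full_score = common_neighbor_score(build_neighbors(edges), 0, 3)

sampled = sample_edges(edges, fraction=0.5)
sampled_score = common_neighbor_score(build_neighbors(sampled), 0, 3)
print(full_score, sampled_score)
```

Comparing the score on the sampled graph with the score on the full graph, alongside the time each takes, mirrors the quality-versus-cost comparison the summary describes.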