Publications

Refine Results

(Filters Applied) Clear All

Analytical models and methods for anomaly detection in dynamic, attributed graphs

Published in:
Chapter 2, Computational Network Analysis with R: Applications in Biology, Medicine, and Chemistry, 2017, pp. 35-61.

Summary

This chapter is devoted to anomaly detection in dynamic, attributed graphs. There has been a great deal of research on anomaly detection in graphs over the last decade, with a variety of methods proposed. This chapter discusses recent methods for anomaly detection in graphs,with a specific focus on detection within backgrounds based on random graph models. This sort of analysis can be applied for a variety of background models, which can incorporate topological dynamics and attributes of vertices and edges. The authors have developed a framework for anomalous subgraph detection in random background models, based on linear algebraic features of a graph. This includes an implementation in R that exploits structure in the random graph model for computationally tractable analysis of residuals. This chapter outlines this framework within the context of analyzing dynamic, attributed graphs. The remainder of this chapter is organized as follows. Section 2.2 defines the notation used within the chapter. Section 2.3 briefly describes a variety of perspectives and techniques for anomaly detection in graph-based data. Section 2.4 provides an overview of models for graph behavior that can be used as backgrounds for anomaly detection. Section 2.5 describes our framework for anomalous subgraph detection via spectral analysis of residuals, after the data are integrated over time. Section 2.6 discusses how the method described in Section 2.5 can be efficiently implemented in R using open source packages. Section 2.7 demonstrates the power of this technique in controlled simulation, considering the effects of both dynamics and attributes on detection performance. Section 2.8 gives a data analysis example within this context, using an evolving citation graph based on a commercially available document database of public scientific literature. Section 2.9 summarizes the chapter and discusses ongoing research in this area.
READ LESS

Summary

This chapter is devoted to anomaly detection in dynamic, attributed graphs. There has been a great deal of research on anomaly detection in graphs over the last decade, with a variety of methods proposed. This chapter discusses recent methods for anomaly detection in graphs,with a specific focus on detection within...

READ MORE

Collaborative Data Analysis and Discovery for Cyber Security

Published in:
Proceedings of the 12th Symposium on Usable Privacy and Security (SOUPS 2016)

Summary

In this paper, we present the Cyber Analyst Real-Time Integrated Notebook Application (CARINA). CARINA is a collaborative investigation system that aids in decision making by co-locating the analysis environment with centralized cyber data sources, and providing next generation analysts with increased visibility to the work of others.
READ LESS

Summary

In this paper, we present the Cyber Analyst Real-Time Integrated Notebook Application (CARINA). CARINA is a collaborative investigation system that aids in decision making by co-locating the analysis environment with centralized cyber data sources, and providing next generation analysts with increased visibility to the work of others.

READ MORE

Residuals-based subgraph detection with cue vertices

Published in:
2015 Asilomar Conf. on Signals, Systems and Computers, 8-11 November 2015.

Summary

A common problem in modern graph analysis is the detection of communities, an example of which is the detection of a single anomalously dense subgraph. Recent results have demonstrated a fundamental limit for this problem when using spectral analysis of modularity. In this paper, we demonstrate the implication of these results on subgraph detection when a cue vertex is provided, indicating one of the vertices in the community of interest. Several recent algorithms for local community detection are applied in this context, and we compare their empirical performance to that of the simple method used to derive the theoretical detection limits.
READ LESS

Summary

A common problem in modern graph analysis is the detection of communities, an example of which is the detection of a single anomalously dense subgraph. Recent results have demonstrated a fundamental limit for this problem when using spectral analysis of modularity. In this paper, we demonstrate the implication of these...

READ MORE

Global pattern search at scale

Summary

In recent years, data collection has far outpaced the tools for data analysis in the area of non-traditional GEOINT analysis. Traditional tools are designed to analyze small-scale numerical data, but there are few good interactive tools for processing large amounts of unstructured data such as raw text. In addition to the complexities of data processing, presenting the data in a way that is meaningful to the end user poses another challenge. In our work, we focused on analyzing a corpus of 35,000 news articles and creating an interactive geovisualization tool to reveal patterns to human analysts. Our comprehensive tool, Global Pattern Search at Scale (GPSS), addresses three major problems in data analysis: free text analysis, high volumes of data, and interactive visualization. GPSS uses an Accumulo database for high-volume data storage, and a matrix of word counts and event detection algorithms to process the free text. For visualization, the tool displays an interactive web application to the user, featuring a map overlaid with document clusters and events, search and filtering options, a timeline, and a word cloud. In addition, the GPSS tool can be easily adapted to process and understand other large free-text datasets.
READ LESS

Summary

In recent years, data collection has far outpaced the tools for data analysis in the area of non-traditional GEOINT analysis. Traditional tools are designed to analyze small-scale numerical data, but there are few good interactive tools for processing large amounts of unstructured data such as raw text. In addition to...

READ MORE

Showing Results

1-4 of 4