Anomalous subgraph detection via sparse principal component analysis
June 28, 2011
Network datasets have become ubiquitous in many fields of study in recent years. In this paper we investigate a problem with applicability to a wide variety of domains - detecting small, anomalous subgraphs in a background graph. We characterize the anomaly in a subgraph via the well-known notion of network modularity, and we show that the optimization problem formulation resulting from our setup is very similar to a recently introduced technique in statistics called Sparse Principal Component Analysis (Sparse PCA), which is an extension of the classical PCA algorithm. The exact version of our problem formulation is a hard combinatorial optimization problem, so we consider a recently introduced semidefinite programming relaxation of the Sparse PCA problem. We show via results on simulated data that the technique is very promising.