Publications


Probabilistic threat propagation for malicious activity detection

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, 25-31 May 2013.

Summary

In this paper, we present a method for detecting malicious activity within networks of interest. We leverage prior community detection work by propagating threat probabilities across graph nodes, given an initial set of known malicious nodes. We enhance prior work by employing constraints which remove the adverse effect of cyclic propagation that is a byproduct of current methods. We demonstrate the effectiveness of Probabilistic Threat Propagation on the task of detecting malicious web destinations.
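
A minimal illustrative sketch of the propagation idea, assuming a simple adjacency-list graph (the damped neighbor-averaging below is a crude stand-in for the paper's constrained update rule, and all names are hypothetical):

```python
# Illustrative threat propagation over a graph: known-malicious nodes are
# clamped to 1.0, and every other node repeatedly takes a damped average
# of its neighbors' threat values until the scores settle.

def propagate_threat(adj, seeds, damping=0.5, iterations=50):
    """adj: dict node -> list of neighbors; seeds: known malicious nodes."""
    threat = {n: (1.0 if n in seeds else 0.0) for n in adj}
    for _ in range(iterations):
        updated = {}
        for node, neighbors in adj.items():
            if node in seeds:
                updated[node] = 1.0  # known malicious nodes stay clamped
            elif neighbors:
                # Damped neighbor average; the damping crudely plays the
                # role of the paper's constraints against the cyclic
                # amplification that plain averaging would produce.
                updated[node] = damping * sum(threat[m] for m in neighbors) / len(neighbors)
            else:
                updated[node] = 0.0
        threat = updated
    return threat

# Example: "c" communicates with known-bad "a" and inherits elevated threat.
graph = {"a": ["c"], "b": ["c"], "c": ["a", "b"]}
print(propagate_threat(graph, seeds={"a"}))
```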

An Expectation Maximization Approach to Detecting Compromised Remote Access Accounts

Published in:
Proceedings of FLAIRS 2013, St. Pete Beach, Fla.

Summary

Just as credit-card companies are able to detect aberrant transactions on a customer’s credit card, it would be useful to have methods that could automatically detect when a user’s login credentials for Virtual Private Network (VPN) access have been compromised. We present here a novel method for detecting that a VPN account has been compromised, in a manner that bootstraps a model of the second unauthorized user.
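
As an illustration of the underlying expectation-maximization machinery, here is a generic two-component Gaussian mixture in one dimension (not the paper's model or features; the idea is simply that EM can bootstrap a model of a possible second user alongside the legitimate one):

```python
# Two-component Gaussian-mixture EM on a 1-D feature (illustrative only).
import numpy as np

def em_two_users(x, iters=100):
    x = np.asarray(x, dtype=float)
    mu = np.array([x.min(), x.max()])            # crude initialization
    sigma = np.array([x.std(), x.std()]) + 1e-6
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibility of each component for each observation
        like = np.stack([pi[k] / (sigma[k] * np.sqrt(2 * np.pi))
                         * np.exp(-0.5 * ((x - mu[k]) / sigma[k]) ** 2)
                         for k in range(2)])
        resp = like / like.sum(axis=0)
        # M-step: re-estimate each component from its weighted data
        for k in range(2):
            w = resp[k]
            mu[k] = (w * x).sum() / w.sum()
            sigma[k] = np.sqrt((w * (x - mu[k]) ** 2).sum() / w.sum()) + 1e-6
            pi[k] = w.mean()
    return mu, sigma, pi

# Example: sessions mostly near 10 (legitimate user) plus a few near 40
# (a hypothetical second user); EM recovers both components.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(10, 1, 200), rng.normal(40, 2, 20)])
print(em_two_users(x))
```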

Characterization of traffic and structure in the U.S. airport network

Summary

In this paper we seek to characterize traffic in the U.S. air transportation system and to subsequently develop improved models of traffic demand. We model the air traffic within the U.S. national airspace system as a dynamic weighted network. We employ techniques advanced over the past several years in the study of complex networks to characterize the structure and dynamics of the U.S. airport network. We show that the airport network is more dynamic over successive days than has been previously reported. The network has some properties that appear stationary over time, while others exhibit a high degree of variation. We characterize the network and its dynamics using structural measures such as degree distributions and clustering coefficients. We employ spectral analysis to show that the dominant eigenvectors of the network are nearly stationary with time. We use this observation to suggest how low-dimensional models of traffic demand in the airport network can be fashioned.
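
A brief sketch of the kinds of daily measurements described above, using networkx and numpy on a hypothetical day's edge list:

```python
import networkx as nx
import numpy as np

def daily_measures(edges):
    """edges: (origin, dest, flight_count) tuples for one day's traffic."""
    g = nx.Graph()
    g.add_weighted_edges_from(edges)
    degrees = sorted(d for _, d in g.degree())   # degree distribution
    clustering = nx.average_clustering(g)        # mean clustering coefficient
    # Spectral view: the dominant eigenvector of the weighted adjacency
    # matrix, the quantity the paper reports as nearly stationary over days.
    a = nx.to_numpy_array(g)
    eigvals, eigvecs = np.linalg.eigh(a)
    dominant = eigvecs[:, np.argmax(eigvals)]
    return degrees, clustering, dominant

# Hypothetical single-day edge list
day1 = [("ATL", "ORD", 30), ("ATL", "DFW", 25), ("ORD", "DFW", 20)]
print(daily_measures(day1))
```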

Probabilistic reasoning for streaming anomaly detection

Published in:
2012 IEEE Statistical Signal Processing Workshop (SSP), 5-8 August 2012, pp. 377-380.

Summary

In many applications it is necessary to determine whether an observation from an incoming high-volume data stream matches expectations or is anomalous. A common method for performing this task is to use an Exponentially Weighted Moving Average (EWMA), which smooths out the minor variations of the data stream. While EWMA is efficient at processing high-rate streams, it can be highly sensitive to abrupt transient changes in the data, which reduces its utility for detecting anomalies. In this paper we present a probabilistic approach to EWMA that dynamically adapts the weighting based on the probability of each observation. The result is robustness to data anomalies combined with quick adaptability to distributional shifts in the data.
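
A sketch of one way such probability-dependent weighting can look (the parameter names and the Gaussian assumption are ours, not necessarily the paper's):

```python
import math

def probabilistic_ewma(stream, alpha=0.99, beta=1.0):
    """Yield an anomaly score per sample while updating mean and variance."""
    mean, var = stream[0], 1.0
    for x in stream:
        std = math.sqrt(var) + 1e-9
        z = (x - mean) / std
        # Probability (density) of x under the current Gaussian estimate
        p = math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))
        yield abs(z)  # large z -> surprising observation
        # Improbable samples shrink their own influence on the update, so a
        # single outlier barely moves the model, while a sustained shift
        # (persistently surprising data) still pulls the model over.
        a = alpha * (1.0 - beta * min(p, 1.0))
        mean = a * mean + (1.0 - a) * x
        var = a * var + (1.0 - a) * (x - mean) ** 2

# Example: a lone spike scores high without corrupting the running mean.
scores = list(probabilistic_ewma([10.0] * 50 + [30.0] + [10.0] * 50))
print(max(scores))
```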

Temporally oblivious anomaly detection on large networks using functional peers

Published in:
IMC'10, Proc. of the ACM SIGCOMM Internet Measurement Conf., 1 November 2010, pp. 465-471.

Summary

Previous methods of network anomaly detection have focused on defining a temporal model of what is "normal," and flagging the "abnormal" activity that does not fit into this pre-trained construct. When monitoring traffic to and from IP addresses on a large network, this problem can become computationally complex, and potentially intractable, as a state model must be maintained for each address. In this paper, we present a method of detecting anomalous network activity without providing any historical context. By exploiting the size of the network along with the minimal overhead of NetFlow data, we are able to model groups of hosts performing similar functions to discover anomalous behavior. As a collection, these anomalies can be further described with a few high-level characterizations and we provide a means for creating and labeling these categories. We demonstrate our method on a very large-scale network consisting of 30 million unique addresses, focusing specifically on traffic related to web servers.
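
A toy sketch of the peer-group comparison; the single per-host feature (bytes served in one window) is our simplification of whatever NetFlow-derived features a real system would use:

```python
import statistics

def peer_outliers(volumes_by_host, threshold):
    """Flag hosts whose traffic volume deviates from their functional peers.

    volumes_by_host: dict host -> bytes observed in one window, for hosts
    known to perform the same function (e.g., web servers). No per-host
    history is kept; each host is judged only against its peers.
    """
    values = list(volumes_by_host.values())
    mu = statistics.mean(values)
    sd = statistics.pstdev(values) or 1.0
    return {h: (v - mu) / sd for h, v in volumes_by_host.items()
            if abs(v - mu) / sd > threshold}

# Small peer group, so a lenient threshold; one host dwarfs its peers.
web_servers = {"10.0.0.1": 1.2e6, "10.0.0.2": 1.1e6, "10.0.0.3": 1.3e6,
               "10.0.0.4": 1.0e6, "10.0.0.5": 9.8e7}
print(peer_outliers(web_servers, threshold=1.5))
```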

Generating client workloads and high-fidelity network traffic for controllable, repeatable experiments in computer security

Published in:
13th Int. Symp. on Recent Advances in Intrusion Detection, 14 September 2010, pp. 218-237.

Summary

Rigorous scientific experimentation in system and network security remains an elusive goal. Recent work has outlined three basic requirements for experiments, namely that hypotheses must be falsifiable, experiments must be controllable, and experiments must be repeatable and reproducible. Despite their simplicity, these goals are difficult to achieve, especially when dealing with client-side threats and defenses, where user input is often required as part of the experiment. In this paper, we present techniques for making experiments involving security and client-side desktop applications, such as web browsers, PDF readers, host-based firewalls, and intrusion detection systems, more controllable and more easily repeatable. First, we present techniques for using statistical models of user behavior to drive real, binary, GUI-enabled application programs in place of a human user. Second, we present techniques based on adaptive replay of application dialog that allow us to quickly and efficiently reproduce reasonable mock-ups of remotely-hosted applications to give the illusion of Internet connectedness on an isolated testbed. We demonstrate the utility of these techniques in an example experiment comparing the system resource consumption of a Windows machine running anti-virus protection versus that of an unprotected system.
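
A toy sketch of the first idea, drawing user actions from a statistical model; the states, transition probabilities, and think-time distribution below are invented for illustration, whereas the paper builds such models from recorded user behavior:

```python
import random

# Hypothetical Markov model over user actions (probabilities invented).
ACTIONS = {
    "browse":   {"browse": 0.6, "read_pdf": 0.2, "idle": 0.2},
    "read_pdf": {"browse": 0.5, "read_pdf": 0.3, "idle": 0.2},
    "idle":     {"browse": 0.7, "read_pdf": 0.1, "idle": 0.2},
}

def simulate_user(start="idle", steps=10, mean_think_s=5.0):
    """Yield (action, think_time) pairs a workload driver would inject."""
    state = start
    for _ in range(steps):
        think = random.expovariate(1.0 / mean_think_s)  # think time
        yield state, think
        nxt = ACTIONS[state]
        state = random.choices(list(nxt), weights=list(nxt.values()))[0]

for action, delay in simulate_user():
    print(f"{action} after {delay:.1f}s")
```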

Machine learning in adversarial environments

Published in:
Mach. Learn., Vol. 81, No. 2, November 2010, pp. 115-119.

Summary

Whenever machine learning is used to prevent illegal or unsanctioned activity and there is an economic incentive, adversaries will attempt to circumvent the protection provided. Constraints on how adversaries can manipulate training and test data for classifiers used to detect suspicious behavior make problems in this area tractable and interesting. This special issue highlights papers that span many disciplines including email spam detection, computer intrusion detection, and detection of web pages deliberately designed to manipulate the priorities of pages returned by modern search engines. The four papers in this special issue provide a standard taxonomy of the types of attacks that can be expected in an adversarial framework, demonstrate how to design classifiers that are robust to deleted or corrupted features, demonstrate the ability of modern polymorphic engines to rewrite malware so it evades detection by current intrusion detection and antivirus systems, and provide approaches to detect web pages designed to manipulate web page scores returned by search engines. We hope that these papers and this special issue encourage the multidisciplinary cooperation required to address many interesting problems in this relatively new area, including predicting the future of the arms races created by adversarial learning, developing effective long-term defensive strategies, and creating algorithms that can process the massive amounts of training and test data available for Internet-scale problems.

Passive operating system identification from TCP/IP packet headers

Published in:
ICDM Workshop on Data Mining for Computer Security, DMSEC, 19 November 2003.

Summary

Accurate operating system (OS) identification by passive network traffic analysis can continuously update less-frequent active network scans and help interpret alerts from intrusion detection systems. The most recent open-source passive OS identification tool (ettercap) rejects 70% of all packets and has a high 75-class error rate of 30% for non-rejected packets on unseen test data. New classifiers were developed using machine-learning approaches including cross-validation testing, grouping OS names into fewer classes, and evaluating alternate classifier types. Nearest neighbor and binary tree classifiers provide a low 9-class OS identification error rate of roughly 10% on unseen data without rejecting packets. This error rate drops to nearly zero when 10% of the packets are rejected.
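
A minimal nearest-neighbor sketch over a few header-derived features; the feature values are illustrative per-OS defaults (initial TTL, TCP window size, don't-fragment flag), not the paper's training data:

```python
# Training exemplars: (initial TTL, TCP window size, DF flag) -> OS label.
TRAINING = [
    ((128, 65535, 1), "Windows"),
    ((64,  5840,  1), "Linux"),
    ((255, 4128,  0), "Solaris"),
]

def classify_packet(features):
    """Label a packet's header features with the nearest exemplar's OS."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(TRAINING, key=lambda t: dist(t[0], features))[1]

print(classify_packet((64, 5792, 1)))   # -> "Linux"
```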

Wordspotter training using figure-of-merit back propagation

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 1, Speech Processing, 19-22 April 1994, pp. 389-392.

Summary

A new approach to wordspotter training is presented which directly maximizes the Figure of Merit (FOM), defined as the average detection rate over a specified range of false alarm rates. This systematic approach to discriminant training for wordspotters eliminates the necessity of ad hoc thresholds and tuning. It improves the FOM of wordspotters tested using cross-validation on the credit-card speech corpus training conversations by 4 to 5 percentage points to roughly 70%. This improved performance requires little extra complexity during wordspotting and only two extra passes through the training data. The FOM gradient is computed analytically for each putative hit, back-propagated through HMM word models using the Viterbi alignment, and used to adjust RBF hidden-node centers and state weights associated with every node in the HMM keyword models.
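
A rough sketch of computing an FOM of this kind from detector scores; the threshold placement below is a simplification of the standard per-keyword, per-hour computation:

```python
def figure_of_merit(hit_scores, fa_scores, hours, max_fa_per_hour=10):
    """Average detection rate over 1..N allowed false alarms,
    where N = max_fa_per_hour * hours."""
    fas = sorted(fa_scores, reverse=True)
    n_points = int(max_fa_per_hour * hours)
    total = 0.0
    for k in range(1, n_points + 1):
        # Threshold admitting exactly k false alarms (or all hits, if no
        # threshold produces that many false alarms)
        thresh = fas[k - 1] if k <= len(fas) else float("-inf")
        total += sum(s > thresh for s in hit_scores) / len(hit_scores)
    return total / n_points

# Example: 10 true keyword hits and 3 false alarms over one hour of speech
print(figure_of_merit([0.9, 0.8, 0.7, 0.6, 0.5, 0.9, 0.4, 0.85, 0.3, 0.95],
                      [0.65, 0.45, 0.2], hours=1.0))
```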

Neural networks, Bayesian a posteriori probabilities, and pattern classification

Published in:
Chapter 4 in From Statistics to Neural Networks: Theory and Pattern Recognition Applications, 1994, pp. 83-104.

Summary

Researchers in the fields of neural networks, statistics, machine learning, and artificial intelligence have followed three basic approaches to developing new pattern classifiers. Probability Density Function (PDF) classifiers include Gaussian and Gaussian Mixture classifiers which estimate distributions or densities of input features separately for each class. Posterior probability classifiers include multilayer perceptron neural networks with sigmoid nonlinearities and radial basis function networks. These classifiers estimate minimum-error Bayesian a posteriori probabilities (hereafter referred to as posterior probabilities) simultaneously for all classes. Boundary forming classifiers include hard-limiting single-layer perceptrons, hypersphere classifiers, and nearest neighbor classifiers. These classifiers have binary indicator outputs which form decision regions that specify the class of any input pattern. Posterior probability and boundary-forming classifiers are trained using discriminant training. All training data is used simultaneously to estimate Bayesian posterior probabilities or minimize overall classification error rates. PDF classifiers are trained using maximum likelihood approaches which individually model class distributions without regard to overall classification performance. Analytic results are presented which demonstrate that many neural network classifiers can accurately estimate posterior probabilities and that these neural network classifiers can sometimes provide lower error rates than PDF classifiers using the same number of trainable parameters. Experiments also demonstrate how interpretation of network outputs as posterior probabilities makes it possible to estimate the confidence of a classification decision, compensate for differences in class prior probabilities between test and training data, and combine outputs of multiple classifiers over time for speech recognition.
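
One practical consequence mentioned above, compensating for a shift in class priors between training and deployment, reduces to a small Bayes-rule adjustment of the network outputs (numbers here are illustrative):

```python
import numpy as np

def adjust_priors(posteriors, train_priors, test_priors):
    """Re-weight per-class posterior estimates for new class priors."""
    adjusted = (np.asarray(posteriors) * np.asarray(test_priors)
                / np.asarray(train_priors))
    return adjusted / adjusted.sum()    # renormalize to sum to 1

# A network trained with balanced classes, deployed where class 0 is rare:
print(adjust_priors([0.7, 0.3], train_priors=[0.5, 0.5],
                    test_priors=[0.1, 0.9]))
```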