Publications

SARA: Survivable Autonomic Response Architecture

Published in:
DARPA Information Survivability Conf. and Exposition II, 12-14 June 2001, pp. 77-88.

Summary

This paper describes the architecture of a system being developed to defend information systems using coordinated autonomic responses. The system will also be used to test the hypothesis that an effective defense against fast, distributed information attacks requires rapid, coordinated, network-wide responses. The core components of the architecture are a run-time infrastructure (RTI), a communication language, a system model, and defensive components. The RTI incorporates a number of innovative design concepts and provides fast, reliable, exploitation-resistant communication and coordination services to the components defending the network, even when challenged by a distributed attack. The architecture can be tailored to provide scalable information assurance defenses for large, geographically distributed, heterogeneous networks with multiple domains, each of which uses different technologies and requires different policies. The architecture can form the basis of a field-deployable system. An initial version is being developed for evaluation in a testbed that will be used to test the autonomic coordination and response hypothesis.
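
The coordination pattern described above lends itself to a small illustration. The following is a purely hypothetical sketch of defensive components exchanging alert and response messages through a shared run-time infrastructure; none of the class or topic names below come from SARA itself, and the real RTI provides far richer reliability and exploitation-resistance guarantees than a toy message bus.

```python
# Hypothetical sketch only: a toy publish/subscribe bus standing in for the RTI's
# messaging role, with a detector publishing an alert and a responder reacting.
from collections import defaultdict

class RuntimeInfrastructure:
    """Minimal stand-in for the RTI's communication and coordination services."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self.subscribers[topic]:
            handler(message)

rti = RuntimeInfrastructure()

# A detector publishes an alert; a responder subscribed to that topic reacts.
rti.subscribe("alert", lambda msg: print(f"responder: blocking source {msg['src']}"))
rti.publish("alert", {"src": "192.0.2.7", "type": "probe"})
```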

Detecting low-profile probes and novel denial-of-service attacks

Summary

Attackers use probing attacks to discover host addresses and services available on each host. Once this information is known, an attacker can then issue a denial-of-service attack against the network, a host, or a service provided by a host. These attacks prevent access to the attacked part of the network. Until recently, only simple, easily defeated mechanisms were used for detecting probe attacks. Attackers defeat these mechanisms by creating stealthy low-profile attacks that include only a few, carefully crafted packets sent over an extended period of time. Furthermore, most mechanisms do not allow intrusion analysts to trade off detection rates for false alarm rates. We present an approach to detect stealthy attacks, an architecture for achieving real-time detections with a confidence measure, and the results of evaluating the system. Since the system outputs confidence values, an analyst can trade false alarm rate against detection rate.
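
Because the system reports a confidence value for each detection, an analyst can pick an operating point simply by thresholding those values. The sketch below is illustrative only (the scores and labels are invented, not outputs of the system described above) and shows how sweeping a threshold trades detection rate against false alarm rate.

```python
# Illustrative only: sweep a decision threshold over per-event confidence scores
# to trade detection rate against false alarm rate.
import numpy as np

def roc_points(scores, labels, thresholds):
    """Return (false_alarm_rate, detection_rate) for each threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)      # True = attack, False = benign
    points = []
    for t in thresholds:
        alarms = scores >= t
        detection_rate = alarms[labels].mean() if labels.any() else 0.0
        false_alarm_rate = alarms[~labels].mean() if (~labels).any() else 0.0
        points.append((false_alarm_rate, detection_rate))
    return points

# Lowering the threshold raises the detection rate at the cost of more false alarms.
scores = [0.95, 0.80, 0.40, 0.30, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 0]
for fa, det in roc_points(scores, labels, thresholds=[0.9, 0.5, 0.2]):
    print(f"false alarm rate {fa:.2f}  detection rate {det:.2f}")
```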

Analysis and results of the 1999 DARPA off-line intrusion detection evaluation

Published in:
Proc. Recent Advances in Intrusion Detection, RAID, 2-4 October 2000, pp. 162-182.

Summary

Eight sites participated in the second DARPA off-line intrusion detection evaluation in 1999. Three weeks of training and two weeks of test data were generated on a test bed that emulates a small government site. More than 200 instances of 58 attack types were launched against victim UNIX and Windows NT hosts. False alarm rates were low (less than 10 per day). Best detection was provided by network-based systems for old probe and old denial-of-service (DOS) attacks and by host-based systems for Solaris user-to-root (U2R) attacks. Best overall performance would have been provided by a combined system that used both host- and network-based intrusion detection. Detection accuracy was poor for new (previously unseen), stealthy, and Windows NT attacks. Ten of the 58 attack types were completely missed by all systems. Systems missed attacks because protocols and TCP services were not analyzed at all or to the depth required, because signatures for old attacks did not generalize to new attacks, and because auditing was not available on all hosts.

The 1999 DARPA Off-Line Intrusion Detection Evaluation

Published in:
Comput. Networks, Vol. 34, No. 4, October 2000, pp. 579-595.

Summary

Eight sites participated in the second Defense Advanced Research Projects Agency (DARPA) off-line intrusion detection evaluation in 1999. A test bed generated live background traffic similar to that on a government site containing hundreds of users on thousands of hosts. More than 200 instances of 58 attack types were launched against victim UNIX and Windows NT hosts in three weeks of training data and two weeks of test data. False-alarm rates were low (less than 10 per day). The best detection was provided by network-based systems for old probe and old denial-of-service (DOS) attacks and by host-based systems for Solaris user-to-root (U2R) attacks. The best overall performance would have been provided by a combined system that used both host- and network-based intrusion detection. Detection accuracy was poor for previously unseen, new, stealthy and Windows NT attacks. Ten of the 58 attack types were completely missed by all systems. Systems missed attacks because signatures for old attacks did not generalize to new attacks, auditing was not available on all hosts, and protocols and TCP services were not analyzed at all or to the depth required. Promising capabilities were demonstrated by host-based systems, anomaly detection systems and a system that performs forensic analysis on file system data.

Wordspotter training using figure-of-merit back propagation

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 1, Speech Processing, 19-22 April 1994, pp. 389-392.

Summary

A new approach to wordspotter training is presented which directly maximizes the Figure of Merit (FOM), defined as the average detection rate over a specified range of false alarm rates. This systematic approach to discriminant training for wordspotters eliminates the necessity of ad hoc thresholds and tuning. It improves the FOM of wordspotters tested using cross-validation on the credit-card speech corpus training conversations by 4 to 5 percentage points to roughly 70%. This improved performance requires little extra complexity during wordspotting and only two extra passes through the training data during training. The FOM gradient is computed analytically for each putative hit, back-propagated through HMM word models using the Viterbi alignment, and used to adjust RBF hidden node centers and state-weights associated with every node in HMM keyword models.
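
As a rough, hedged illustration of the Figure of Merit itself (not the training procedure or the paper's code), the sketch below scores a list of putative hits and averages the detection rate at operating points of 1 through 10 false alarms per keyword per hour; the exact bookkeeping used in the evaluation may differ.

```python
# Hedged sketch: Figure of Merit taken as the detection rate averaged over
# operating points of 1..10 false alarms per keyword per hour of speech.
import numpy as np

def figure_of_merit(hit_scores, hit_is_true, hours_of_speech, n_true_occurrences):
    """hit_scores: scores of putative hits; hit_is_true: True where a hit is a real keyword."""
    order = np.argsort(hit_scores)[::-1]              # accept best-scoring hits first
    is_true = np.asarray(hit_is_true, dtype=bool)[order]
    detections = np.cumsum(is_true)                   # true hits accepted so far
    false_alarms = np.cumsum(~is_true)                # false alarms accepted so far
    rates = []
    for fa_per_hour in range(1, 11):
        allowed = fa_per_hour * hours_of_speech
        idx = np.searchsorted(false_alarms, allowed, side="right") - 1
        rates.append(detections[idx] / n_true_occurrences if idx >= 0 else 0.0)
    return float(np.mean(rates))

print(figure_of_merit([0.9, 0.8, 0.7, 0.6, 0.5], [1, 1, 0, 1, 0],
                      hours_of_speech=1.0, n_true_occurrences=4))
```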

Neural networks, Bayesian a posteriori probabilities, and pattern classification

Published in:
Chapter 4 in From Statistics to Neural Networks: Theory and Pattern Recognition Applications, 1994, pp. 83-104.

Summary

Researchers in the fields of neural networks, statistics, machine learning, and artificial intelligence have followed three basic approaches to developing new pattern classifiers. Probability Density Function (PDF) classifiers include Gaussian and Gaussian Mixture classifiers which estimate distributions or densities of input features separately for each class. Posterior probability classifiers include multilayer perceptron neural networks with sigmoid nonlinearities and radial basis function networks. These classifiers estimate minimum-error Bayesian a posteriori probabilities (hereafter referred to as posterior probabilities) simultaneously for all classes. Boundary forming classifiers include hard-limiting single-layer perceptrons, hypersphere classifiers, and nearest neighbor classifiers. These classifiers have binary indicator outputs which form decision regions that specify the class of any input pattern. Posterior probability and boundary-forming classifiers are trained using discriminant training. All training data is used simultaneously to estimate Bayesian posterior probabilities or minimize overall classification error rates. PDF classifiers are trained using maximum likelihood approaches which individually model class distributions without regard to overall classification performance. Analytic results are presented which demonstrate that many neural network classifiers can accurately estimate posterior probabilities and that these neural network classifiers can sometimes provide lower error rates than PDF classifiers using the same number of trainable parameters. Experiments also demonstrate how interpretation of network outputs as posterior probabilities makes it possible to estimate the confidence of a classification decision, compensate for differences in class prior probabilities between test and training data, and combine outputs of multiple classifiers over time for speech recognition.
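
One of the practical benefits listed above, compensating for differences in class prior probabilities between training and test data, is easy to show concretely. The sketch below is a generic illustration (names and numbers are invented, not taken from the chapter): the posterior estimates are divided by the training priors, multiplied by the deployment priors, and renormalized.

```python
# Generic illustration of prior compensation for posterior-estimating classifiers.
import numpy as np

def adjust_for_priors(posteriors, train_priors, test_priors):
    """Rescale estimated P(class|x) when class priors differ between training and test."""
    adjusted = np.asarray(posteriors, dtype=float) \
        * (np.asarray(test_priors, dtype=float) / np.asarray(train_priors, dtype=float))
    return adjusted / adjusted.sum(axis=-1, keepdims=True)

# A network trained on balanced classes, deployed where class 0 is far more common.
print(adjust_for_priors([0.4, 0.6], train_priors=[0.5, 0.5], test_priors=[0.9, 0.1]))
# -> approximately [0.857, 0.143]: the minority-class posterior shrinks accordingly.
```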

Predicting the risk of complications in coronary artery bypass operations using neural networks

Published in:
Proc. 7th Int. Conf. on Neural Information Processing Systems, NIPS, 1994, pp. 1055-62.

Summary

Experiments demonstrated that sigmoid multilayer perceptron (MLP) networks provide slightly better risk prediction than conventional logistic regression when used to predict the risk of death, stroke, and renal failure on 1257 patients who underwent coronary artery bypass operations at the Lahey Clinic. MLP networks with no hidden layer and networks with one hidden layer were trained using stochastic gradient descent with early stopping. MLP networks and logistic regression used the same input features and were evaluated using bootstrap sampling with 50 replications. ROC areas for predicting mortality using preoperative input features were 70.5% for logistic regression and 76.0% for MLP networks. Regularization provided by early stopping was an important component of improved performance. A simplified approach to generating confidence intervals for MLP risk predictions using an auxiliary "confidence MLP" was developed. The confidence MLP is trained to reproduce confidence intervals that were generated during training using the outputs of 50 MLP networks trained with different bootstrap samples.
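
For readers unfamiliar with the setup, the sketch below shows a generic one-hidden-layer sigmoid MLP trained with stochastic gradient descent and early stopping and scored by ROC area. It is an illustration only, using synthetic data and scikit-learn rather than the study's actual features, code, or bootstrap evaluation.

```python
# Illustration only: small sigmoid MLP with SGD and early stopping, scored by ROC area.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1257, 8))                      # synthetic stand-in for preoperative features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=2.0, size=1257) > 2.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(5,), activation="logistic",
                    solver="sgd", learning_rate_init=0.05,
                    early_stopping=True, validation_fraction=0.2,
                    max_iter=2000, random_state=0)
mlp.fit(X_tr, y_tr)

print("ROC area:", round(roc_auc_score(y_te, mlp.predict_proba(X_te)[:, 1]), 3))
```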

LNKnet: Neural network, machine-learning, and statistical software for pattern classification

Published in:
Lincoln Laboratory Journal, Vol. 6, No. 2, Summer/Fall 1993, pp. 249-268.

Summary

Pattern-classification and clustering algorithms are key components of modern information processing systems used to perform tasks such as speech and image recognition, printed-character recognition, medical diagnosis, fault detection, process control, and financial decision making. To simplify the task of applying these types of algorithms in new application areas, we have developed LNKnet, a software package that provides access to more than 20 pattern-classification, clustering, and feature-selection algorithms. Included are the most important algorithms from the fields of neural networks, statistics, machine learning, and artificial intelligence. The algorithms can be trained and tested on separate data or tested with automatic cross-validation. LNKnet runs under the UNIX operating system, and access to the different algorithms is provided through a graphical point-and-click user interface. Graphical outputs include two-dimensional (2-D) scatter and decision-region plots and 1-D plots of data histograms, classifier outputs, and error rates during training. Parameters of trained classifiers are stored in files from which the parameters can be translated into source-code subroutines (written in the C programming language) that can then be embedded in a user application program. Lincoln Laboratory and other research laboratories have used LNKnet successfully for many diverse applications.
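
As a point of comparison for the automatic cross-validation feature mentioned above, the fragment below shows the same idea in generic Python/scikit-learn form; it is not LNKnet code, which is C- and UNIX-based and driven through its graphical interface.

```python
# Generic k-fold cross-validation, illustrating the evaluation style LNKnet automates.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
# Five folds: train on four, test on the held-out fold, then rotate and average.
scores = cross_val_score(KNeighborsClassifier(n_neighbors=3), X, y, cv=5)
print("per-fold accuracy:", scores.round(3), "mean:", round(scores.mean(), 3))
```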

Neural network classifiers estimate Bayesian a posteriori probabilities

Published in:
Neural Comput., Vol. 3, No. 4, Winter 1991, pp. 461-483.

Summary

Many neural network classifiers provide outputs which estimate Bayesian a posteriori probabilities. When the estimation is accurate, network outputs can be treated as probabilities and sum to one. Simple proofs show that Bayesian probabilities are estimated when desired network outputs are 1 of M (one output unity, all others zero) and a squared-error or cross-entropy cost function is used. Results of Monte Carlo simulations performed using multilayer perceptron (MLP) networks trained with backpropagation, radial basis function (RBF) networks, and high-order polynomial networks graphically demonstrate that network outputs provide good estimates of Bayesian probabilities. Estimation accuracy depends on network complexity, the amount of training data, and the degree to which training data reflect true likelihood distributions and a priori class probabilities. Interpretation of network outputs as Bayesian probabilities allows outputs from multiple networks to be combined for higher level decision making, simplifies creation of rejection thresholds, makes it possible to compensate for differences between pattern class probabilities in training and test data, allows outputs to be used to minimize alternative risk functions, and suggests alternative measures of network performance.
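
For reference, the squared-error case of the proof mentioned above can be restated compactly. With 1-of-M targets d_i (one for the true class, zero otherwise) and network output y_i(x), the expected cost conditioned on the input decomposes as follows; this is the standard argument, not a quotation from the paper.

```latex
\begin{align*}
E\!\left[(y_i(x) - d_i)^2 \mid x\right]
  &= y_i(x)^2 - 2\,y_i(x)\,E[d_i \mid x] + E[d_i^2 \mid x] \\
  &= \bigl(y_i(x) - P(C_i \mid x)\bigr)^2 + P(C_i \mid x)\bigl(1 - P(C_i \mid x)\bigr),
\end{align*}
```

using E[d_i | x] = E[d_i^2 | x] = P(C_i | x) for a 0/1 target. The second term does not depend on the network, so the expected cost is minimized when the output y_i(x) equals the posterior probability P(C_i | x).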

An introduction to computing with neural nets

Published in:
IEEE ASSP Mag., Vol. 4, No. 2, April 1987, pp. 4-22.

Summary

Artificial neural net models have been studied for many years in the hope of achieving human-like performance in the fields of speech and image recognition. These models are composed of many nonlinear computational elements operating in parallel and arranged in patterns reminiscent of biological neural nets. Computational elements or nodes are connected via weights that are typically adapted during use to improve performance. There has been a recent resurgence in the field of artificial neural nets caused by new net topologies and algorithms, analog VLSI implementation techniques, and the belief that massive parallelism is essential for high performance speech and image recognition. This paper provides an introduction to the field of artificial neural nets by reviewing six important neural net models that can be used for pattern classification. These nets are highly parallel building blocks that illustrate neural net components and design principles and can be used to construct more complex systems. In addition to describing these nets, a major emphasis is placed on exploring how some existing classification and clustering algorithms can be performed using simple neuron-like components. Single-layer nets can implement algorithms required by Gaussian maximum-likelihood classifiers and optimum minimum-error classifiers for binary patterns corrupted by noise. More generally, the decision regions required by any classification algorithm can be generated in a straightforward manner by three-layer feed-forward nets.
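
The claim that single-layer nets can implement Gaussian maximum-likelihood classifiers has a short concrete form: when the classes share a covariance matrix, the discriminant is linear in the input, so it reduces to one layer of weighted sums plus biases. The sketch below uses synthetic data and illustrates that reduction; it is not code from the paper.

```python
# Illustration: a shared-covariance Gaussian ML classifier computed as a single
# layer of weighted sums (one output node per class), on synthetic 2-D data.
import numpy as np

rng = np.random.default_rng(1)
cov = np.array([[1.0, 0.3], [0.3, 1.0]])
means = np.array([[0.0, 0.0], [2.0, 1.0]])            # one mean vector per class
X = np.vstack([rng.multivariate_normal(m, cov, size=200) for m in means])
y = np.repeat([0, 1], 200)

cov_inv = np.linalg.inv(cov)
W = means @ cov_inv                                   # weight vector per class: Sigma^{-1} mu_i
b = -0.5 * np.einsum("ij,jk,ik->i", means, cov_inv, means) + np.log(0.5)

scores = X @ W.T + b                                  # the single layer's weighted sums
print("accuracy of the linear discriminant:", (scores.argmax(axis=1) == y).mean())
```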