Publications

Learning emergent discrete message communication for cooperative reinforcement learning

Published in:
37th Conf. on Uncertainty in Artificial Intelligence, UAI 2021, early access, 26-30 July 2021.

Summary

Communication is an important factor that enables agents to work cooperatively in multi-agent reinforcement learning (MARL). Most previous work uses continuous message communication, whose high representational capacity comes at the expense of interpretability. Allowing agents to learn their own discrete message communication protocol, one that emerges from a variety of domains, can increase interpretability for human designers and other agents. This paper proposes a method to generate discrete messages analogous to human languages and to achieve communication through a broadcast-and-listen mechanism based on self-attention. We show that discrete message communication has performance comparable to continuous message communication but with a much smaller vocabulary size. Furthermore, we propose an approach that allows humans to interactively send discrete messages to agents.

Information Aware max-norm Dirichlet networks for predictive uncertainty estimation

Published in:
Neural Netw., Vol. 135, 2021, pp. 105–114.

Summary

Precise estimation of uncertainty in predictions for AI systems is a critical factor in ensuring trust and safety. Deep neural networks trained with a conventional method are prone to over-confident predictions. In contrast to Bayesian neural networks that learn approximate distributions on weights to infer prediction confidence, we propose a novel method, Information Aware Dirichlet networks, which learn an explicit Dirichlet prior distribution on predictive distributions by minimizing a bound on the expected max norm of the prediction error and penalizing information associated with incorrect outcomes. Properties of the new cost function are derived to indicate how improved uncertainty estimation is achieved. Experiments using real datasets show that our technique outperforms, by a large margin, state-of-the-art neural networks for estimating within-distribution and out-of-distribution uncertainty, and detecting adversarial examples.

Automated posterior interval evaluation for inference in probabilistic programming

Published in:
Intl. Conf. on Probabilistic Programming, PROBPROG, 22 October 2020.

Summary

In probabilistic inference, credible intervals constructed from posterior samples provide ranges of likely values for continuous parameters of interest. Intuitively, an inference procedure is optimal if it produces the most precise posterior intervals that cover the true parameter value with the expected frequency in repeated experiments. We present theories and methods for automating posterior interval evaluation of inference performance in probabilistic programming using two metrics: (1) truth coverage and (2) the ratio of the empirical to the ideal interval width. Demonstrating with inference on popular regression and state-space models, we show how the metrics provide effective comparisons between different inference procedures, and capture the effects of collinearity and model misspecification. Overall, we claim such automated interval evaluation can accelerate the robust design and comparison of probabilistic inference programs by directly diagnosing how accurately and precisely they can estimate parameters of interest.
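The two metrics named in this summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a conjugate normal model (unknown mean, known noise) so the ideal interval width is available in closed form, and the `scale` parameter is a hypothetical stand-in for an inference procedure that misjudges the posterior spread.

```python
import random
import statistics

Z90 = 1.6448536269514722  # standard normal quantile for a central 90% interval
PRIOR_SD = 2.0            # assumed prior: mu ~ Normal(0, PRIOR_SD)
NOISE_SD = 1.0            # assumed likelihood: y ~ Normal(mu, NOISE_SD)


def exact_posterior(ys):
    """Mean and sd of the conjugate normal posterior for mu."""
    precision = 1.0 / PRIOR_SD**2 + len(ys) / NOISE_SD**2
    mean = (sum(ys) / NOISE_SD**2) / precision
    return mean, precision ** -0.5


def sample_interval(samples, level=0.90):
    """Central credible interval from posterior samples via empirical quantiles."""
    s = sorted(samples)
    lo_idx = int((1 - level) / 2 * (len(s) - 1))
    hi_idx = int((1 + level) / 2 * (len(s) - 1))
    return s[lo_idx], s[hi_idx]


def evaluate(scale, n_reps=400, n_obs=20, n_draws=500, seed=0):
    """Return (truth coverage, mean empirical/ideal width ratio).

    scale = 1.0 mimics exact inference; scale < 1.0 mimics an
    overconfident approximation (hypothetical, for illustration only).
    """
    rng = random.Random(seed)
    hits, ratios = 0, []
    for _ in range(n_reps):
        mu_true = rng.gauss(0.0, PRIOR_SD)  # truth drawn from the prior
        ys = [rng.gauss(mu_true, NOISE_SD) for _ in range(n_obs)]
        mean, sd = exact_posterior(ys)
        draws = [rng.gauss(mean, scale * sd) for _ in range(n_draws)]
        lo, hi = sample_interval(draws)
        hits += lo <= mu_true <= hi
        ratios.append((hi - lo) / (2 * Z90 * sd))  # ideal width is analytic here
    return hits / n_reps, statistics.fmean(ratios)
```

Under these assumptions, `evaluate(1.0)` should give coverage near the nominal 90% and a width ratio near 1, while `evaluate(0.5)` should show narrower intervals that cover the truth less often.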

Failure prediction by confidence estimation of uncertainty-aware Dirichlet networks

Published in:
https://arxiv.org/abs/2010.09865

Summary

Reliably assessing model confidence in deep learning and predicting errors likely to be made are key elements in providing safety for model deployment, in particular for applications with dire consequences. In this paper, it is first shown that uncertainty-aware deep Dirichlet neural networks provide an improved separation between the confidence of correct and incorrect predictions in the true class probability (TCP) metric. Second, as the true class is unknown at test time, a new criterion is proposed for learning the true class probability by matching prediction confidence scores while taking imbalance and TCP constraints into account for correct predictions and failures. Experimental results show our method improves upon the maximum class probability (MCP) baseline and predicted TCP for standard networks on several image classification tasks with various network architectures.
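The difference between the MCP baseline and the TCP criterion can be shown with a small sketch. The logits below are hypothetical and the code only computes the two scores; it does not reproduce the learning criterion proposed in the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mcp(probs):
    """Maximum class probability: confidence of the predicted class."""
    return max(probs)


def tcp(probs, true_label):
    """True class probability: needs the label, so it is unknown at test time."""
    return probs[true_label]


# Hypothetical 3-class logits for one correct and one incorrect prediction.
correct = softmax([2.0, 0.5, 0.1])  # predicted class 0, true class 0
wrong = softmax([2.0, 1.8, 0.1])    # predicted class 0, true class 1
```

For a correct prediction the two scores coincide. For a misclassified input, TCP cannot exceed MCP and the two sum to at most 1, so TCP is at most 0.5, whereas MCP can remain high. This gap is what makes TCP a useful target for failure prediction.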

A multi-task LSTM framework for improved early sepsis prediction

Published in:
Proc. Artificial Intelligence in Medicine, AIME, 2020, pp. 49-58.

Summary

Early detection of sepsis, a high-mortality clinical condition, is important for improving patient outcomes. The performance of conventional deep learning methods degrades quickly as predictions are made several hours prior to the clinical definition. We adopt recurrent neural networks (RNNs) to improve early prediction of the onset of sepsis using time series of physiological measurements. Furthermore, physiological data is often missing and imputation is necessary. Absence of data might arise from decisions made by clinical professionals, so the absence itself carries information. Incorporating the missing-data patterns into the learning process can further guide how much trust to place in imputed values. A new multi-task LSTM model is proposed that takes informative missingness into account during training and effectively attributes trust to temporal measurements. Experimental results demonstrate that our method outperforms conventional CNN and LSTM models on the PhysioNet-2019 CiC early sepsis prediction challenge in terms of area under the receiver-operating curve and the precision-recall curve, and further improves the calibration of prediction scores.

Towards a distributed framework for multi-agent reinforcement learning research

Summary

Some of the most important publications in deep reinforcement learning over the last few years have been fueled by access to massive amounts of computation through large-scale distributed systems. The success of these approaches in achieving human-expert-level performance on several complex video-game environments has motivated further exploration into the limits of these approaches as computation increases. In this paper, we present a distributed RL training framework designed for supercomputing infrastructures such as the MIT SuperCloud. We review a collection of challenging learning environments, such as Google Research Football, StarCraft II, and Multi-Agent Mujoco, which are at the frontier of reinforcement learning research. We provide results on these environments that illustrate the current state of the field on these problems. Finally, we also quantify and discuss the computational requirements for performing RL research by enumerating all experiments performed on these environments.

Fast training of deep neural networks robust to adversarial perturbations

Summary

Deep neural networks are capable of training fast and generalizing well within many domains. Despite their promising performance, deep networks have shown sensitivities to perturbations of their inputs (e.g., adversarial examples) and their learned feature representations are often difficult to interpret, raising concerns about their true capability and trustworthiness. Recent work in adversarial training, a form of robust optimization in which the model is optimized against adversarial examples, demonstrates the ability to reduce sensitivity to perturbations and yield feature representations that are more interpretable. Adversarial training, however, comes with an increased computational cost over that of standard (i.e., nonrobust) training, rendering it impractical for use in large-scale problems. Recent work suggests that a fast approximation to adversarial training shows promise for reducing training time and maintaining robustness in the presence of perturbations bounded by the infinity norm. In this work, we demonstrate that this approach extends to the Euclidean norm and preserves the human-aligned feature representations that are common for robust models. Additionally, we show that using a distributed training scheme can further reduce the time to train robust deep networks. Fast adversarial training is a promising approach that will provide increased security and explainability in machine learning applications for which robust optimization was previously thought to be impractical.

Deep implicit coordination graphs for multi-agent reinforcement learning [e-print]

Summary

Multi-agent reinforcement learning (MARL) requires coordination to efficiently solve certain tasks. Fully centralized control is often infeasible in such domains due to the size of joint action spaces. Coordination-graph-based formalizations allow reasoning about the joint action based on the structure of interactions. However, they often require domain expertise in their design. This paper introduces the deep implicit coordination graph (DICG) architecture for such scenarios. DICG consists of a module for inferring the dynamic coordination graph structure, which is then used by a graph-neural-network-based module to learn to implicitly reason about the joint actions or values. DICG allows learning the tradeoff between full centralization and decentralization via standard actor-critic methods to significantly improve coordination for domains with a large number of agents. We apply DICG to both centralized-training-centralized-execution and centralized-training-decentralized-execution regimes. We demonstrate that DICG solves the relative overgeneralization pathology in predator-prey tasks and outperforms various MARL baselines on the challenging StarCraft II Multi-agent Challenge (SMAC) and traffic junction environments.

Toward an autonomous aerial survey and planning system for humanitarian aid and disaster response

Summary

In this paper, we propose an integrated system concept for autonomously surveying and planning emergency response for areas impacted by natural disasters. Referred to as AASAPS-HADR, this system is composed of a network of ground stations and autonomous aerial vehicles interconnected by an ad hoc emergency communication network. The system objectives are three-fold: to provide situational awareness of the evolving disaster event, to generate dispatch and routing plans for emergency vehicles, and to provide continuous communication networks which augment pre-existing communication infrastructure that may have been damaged or destroyed. Because the situational awareness objective of disaster response is underdeveloped in previous literature, we give it particular emphasis by proposing an autonomous aerial survey that is tasked with assessing damage to existing road networks, detecting and locating human victims, and providing a cursory assessment of casualty types that can be used to inform medical response priorities. We provide a high-level system design concept, identify existing AI perception and planning algorithms that most closely suit our purposes as well as technology gaps within those algorithms, and provide initial experimental results for non-contact health monitoring using real-time pose recognition algorithms running on an NVIDIA Jetson TX2 mounted on board a quadrotor UAV. Finally, we provide technology development recommendations for future phases of the AASAPS-HADR system.

Safe predictors for enforcing input-output specifications [e-print]

Summary

We present an approach for designing correct-by-construction neural networks (and other machine learning models) that are guaranteed to be consistent with a collection of input-output specifications before, during, and after algorithm training. Our method involves designing a constrained predictor for each set of compatible constraints, and combining them safely via a convex combination of their predictions. We demonstrate our approach on synthetic datasets and an aircraft collision avoidance problem.
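The convexity argument behind combining constrained predictors can be sketched in a few lines. This is a simplified illustration, not the paper's construction: it assumes a single hypothetical specification (the output must lie in a fixed interval for every input), and the names `clamp_predictor` and `convex_combination` are invented for the example.

```python
def clamp_predictor(raw, lo, hi):
    """Wrap an unconstrained predictor so its output always lies in [lo, hi]."""
    return lambda x: min(hi, max(lo, raw(x)))


def convex_combination(predictors, weight_fns):
    """Blend predictors with nonnegative, input-dependent weights that sum to 1."""
    def combined(x):
        ws = [max(0.0, w(x)) for w in weight_fns]
        total = sum(ws)
        # Fall back to uniform weights if every weight is zero at this input.
        ws = [w / total for w in ws] if total > 0 else [1.0 / len(ws)] * len(ws)
        return sum(w * p(x) for w, p in zip(ws, predictors))
    return combined


# Hypothetical specification: the output must stay in [0, 1] for every input.
p1 = clamp_predictor(lambda x: 3.0 * x, 0.0, 1.0)
p2 = clamp_predictor(lambda x: 0.5 - x, 0.0, 1.0)
safe = convex_combination([p1, p2], [lambda x: x, lambda x: 1.0 - x])
```

Because each constrained predictor satisfies the interval constraint and the interval is convex, any convex combination satisfies it too. This holds regardless of what the underlying models have learned, which is why such a guarantee can hold before, during, and after training.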
