Publications

Health-informed policy gradients for multi-agent reinforcement learning

Summary

This paper proposes a definition of system health in the context of multiple agents optimizing a joint reward function. We use this definition as a credit assignment term in a policy gradient algorithm to distinguish the contributions of individual agents to the global reward. The health-informed credit assignment is then extended to a multi-agent variant of the proximal policy optimization algorithm and demonstrated on simple particle environments that have elements of system health, risk-taking, semi-expendable agents, and partial observability. We show significant improvement in learning performance compared to policy gradient methods that do not perform multi-agent credit assignment.

Learning emergent discrete message communication for cooperative reinforcement learning

Published in:
37th Conf. on Uncertainty in Artificial Intelligence, UAI 2021, early access, 26-30 July 2021.

Summary

Communication is an important factor that enables agents to work cooperatively in multi-agent reinforcement learning (MARL). Most previous work uses continuous message communication, whose high representational capacity comes at the expense of interpretability. Allowing agents to learn their own discrete message communication protocol, one that emerges across a variety of domains, can increase interpretability for human designers and other agents. This paper proposes a method to generate discrete messages analogous to human languages, and achieves communication through a broadcast-and-listen mechanism based on self-attention. We show that discrete message communication has performance comparable to continuous message communication but with a much smaller vocabulary size. Furthermore, we propose an approach that allows humans to interactively send discrete messages to agents.

Beyond expertise and roles: a framework to characterize the stakeholders of interpretable machine learning and their needs

Published in:
Proc. Conf. on Human Factors in Computing Systems, 8-13 May 2021, article no. 74.

Summary

To ensure accountability and mitigate harm, it is critical that diverse stakeholders can interrogate black-box automated systems and find information that is understandable, relevant, and useful to them. In this paper, we eschew prior expertise- and role-based categorizations of interpretability stakeholders in favor of a more granular framework that decouples stakeholders' knowledge from their interpretability needs. We characterize stakeholders by their formal, instrumental, and personal knowledge and how it manifests in the contexts of machine learning, the data domain, and the general milieu. We additionally distill a hierarchical typology of stakeholder needs that distinguishes higher-level domain goals from lower-level interpretability tasks. In assessing the descriptive, evaluative, and generative powers of our framework, we find our more nuanced treatment of stakeholders reveals gaps and opportunities in the interpretability literature, adds precision to the design and comparison of user studies, and facilitates a more reflexive approach to conducting this research.

Automated posterior interval evaluation for inference in probabilistic programming

Published in:
Intl. Conf. on Probabilistic Programming, PROBPROG, 22 October 2020.

Summary

In probabilistic inference, credible intervals constructed from posterior samples provide ranges of likely values for continuous parameters of interest. Intuitively, an inference procedure is optimal if it produces the most precise posterior intervals that cover the true parameter value with the expected frequency in repeated experiments. We present theories and methods for automating posterior interval evaluation of inference performance in probabilistic programming using two metrics: (1) truth coverage and (2) the ratio of the empirical to the ideal interval width. Demonstrating with inference on popular regression and state-space models, we show how the metrics provide effective comparisons between different inference procedures and capture the effects of collinearity and model misspecification. Overall, we claim such automated interval evaluation can accelerate the robust design and comparison of probabilistic inference programs by directly diagnosing how accurately and precisely they can estimate parameters of interest.
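The two metrics can be illustrated with a minimal sketch. It is not the paper's implementation: the conjugate normal model, the exact posterior sampler standing in for an "inference procedure", the function names, and the choice of the analytic posterior interval as the "ideal" width are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean


def equal_tailed_interval(samples, level):
    """Equal-tailed credible interval from posterior samples (empirical quantiles)."""
    ordered = sorted(samples)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[round(tail * last)], ordered[round((1.0 - tail) * last)]


def evaluate_intervals(n_experiments=400, n_obs=10, n_draws=1000, level=0.9, seed=0):
    """Return (truth coverage, mean empirical/ideal width ratio) over repeated experiments.

    Assumed toy model: mu ~ N(0, 1), y_i ~ N(mu, 1), so the posterior is
    N(sum(y) / (n + 1), 1 / (n + 1)) and can be sampled exactly.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    covered = 0
    ratios = []
    for _ in range(n_experiments):
        true_mu = rng.gauss(0.0, 1.0)  # ground-truth parameter drawn from the prior
        data = [rng.gauss(true_mu, 1.0) for _ in range(n_obs)]
        post_var = 1.0 / (n_obs + 1)
        post_mean = sum(data) * post_var
        post_sd = post_var ** 0.5
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]
        lo, hi = equal_tailed_interval(draws, level)
        covered += lo <= true_mu <= hi  # did the interval cover the truth?
        ratios.append((hi - lo) / (2.0 * z * post_sd))  # empirical / analytic width
    return covered / n_experiments, fmean(ratios)


coverage, width_ratio = evaluate_intervals()
print(f"coverage={coverage:.3f}, width ratio={width_ratio:.3f}")
```

Under these assumptions a well-calibrated sampler should give coverage close to the nominal level and a width ratio close to 1; an overconfident procedure would show low coverage with a ratio below 1.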

A multi-task LSTM framework for improved early sepsis prediction

Published in:
Proc. Artificial Intelligence in Medicine, AIME, 2020, pp. 49-58.

Summary

Early detection of sepsis, a high-mortality clinical condition, is important for improving patient outcomes. The performance of conventional deep learning methods degrades quickly as predictions are made several hours prior to the clinical definition. We adopt recurrent neural networks (RNNs) to improve early prediction of the onset of sepsis using time series of physiological measurements. Furthermore, physiological data is often missing and imputation is necessary. Absence of data might arise from decisions made by clinical professionals, and therefore itself carries information. Incorporating the missing data patterns into the learning process can further guide how much trust to place in imputed values. A new multi-task LSTM model is proposed that takes informative missingness into account during training and effectively attributes trust to temporal measurements. Experimental results demonstrate that our method outperforms conventional CNN and LSTM models on the PhysioNet-2019 CiC early sepsis prediction challenge in terms of area under the receiver-operating curve and the precision-recall curve, and further improves upon calibration of prediction scores.
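One common way to expose informative missingness to a sequence model is to pair each imputed value with an observation mask and a time-since-last-observation counter. The sketch below illustrates that idea only; the forward-fill rule, the `default` fallback, and the function name `impute_with_mask` are assumptions, and the abstract does not say the paper's model uses these exact features.

```python
def impute_with_mask(series, default=0.0):
    """Forward-fill a time series and record where values were actually observed.

    series: list of floats, with None marking a missing measurement.
    Returns (imputed, mask, delta):
      imputed - forward-filled values (default is used before the first observation)
      mask    - 1 where the value was observed, 0 where it was imputed
      delta   - number of steps since the last real observation
    """
    imputed, mask, delta = [], [], []
    last_value, steps_since = default, 0
    for value in series:
        if value is None:
            steps_since += 1
            mask.append(0)
        else:
            last_value, steps_since = value, 0
            mask.append(1)
        imputed.append(last_value)
        delta.append(steps_since)
    return imputed, mask, delta


# Hypothetical heart-rate readings with gaps.
heart_rate = [None, 82.0, None, None, 90.0, None]
imputed, mask, delta = impute_with_mask(heart_rate, default=75.0)
print(imputed)  # [75.0, 82.0, 82.0, 82.0, 90.0, 90.0]
print(mask)     # [0, 1, 0, 0, 1, 0]
print(delta)    # [1, 0, 1, 2, 0, 1]
```

Feeding `mask` and `delta` alongside `imputed` lets a model learn to discount values that were filled in long after the last real measurement.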

GraphChallenge.org triangle counting performance [e-print]

Summary

The rise of graph analytic systems has created a need for new ways to measure and compare the capabilities of graph processing systems. The MIT/Amazon/IEEE Graph Challenge has been developed to provide a well-defined community venue for stimulating research and highlighting innovations in graph analysis software, hardware, algorithms, and systems. GraphChallenge.org provides a wide range of preparsed graph data sets, graph generators, mathematically defined graph algorithms, example serial implementations in a variety of languages, and specific metrics for measuring performance. The triangle counting component of GraphChallenge.org tests the performance of graph processing systems to count all the triangles in a graph and exercises key graph operations found in many graph algorithms. In 2017, 2018, and 2019, many triangle counting submissions were received from a wide range of authors and organizations. This paper presents a performance analysis of the best performers of these submissions. These submissions show that their state-of-the-art triangle counting execution time, $T_{\rm tri}$, is a strong function of the number of edges in the graph, $N_e$, which improved significantly from 2017 ($T_{\rm tri} \approx (N_e/10^8)^{4/3}$) to 2018 ($T_{\rm tri} \approx N_e/10^9$) and remained comparable from 2018 to 2019. Graph Challenge provides a clear picture of current graph analysis systems and underscores the need for new innovations to achieve high performance on very large graphs.
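For readers unfamiliar with the task, a minimal serial triangle counter is sketched below. It is an illustration only, not one of the GraphChallenge.org reference implementations or submissions; the function name and the edge-list input format are assumptions.

```python
def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) integer pairs.

    For every edge (u, v) with u < v, the common neighbors of u and v each
    close one triangle. Every triangle is found once per edge, so the running
    total is divided by 3.
    """
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    total = 0
    for u, nbrs in neighbors.items():
        for v in nbrs:
            if u < v:
                total += len(nbrs & neighbors[v])
    return total // 3


# Complete graph on 4 vertices: every 3-vertex subset forms a triangle.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(count_triangles(k4_edges))  # 4
```

The set intersection per edge is the same quantity a linear algebra formulation obtains from the element-wise product of the adjacency matrix with its square.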

Leveraging linear algebra to count and enumerate simple subgraphs

Published in:
2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Summary

Even though subgraph counting and subgraph matching are well-known NP-hard problems, they are foundational building blocks for many scientific and commercial applications. In order to analyze graphs that contain millions to billions of edges, distributed systems can provide computational scalability through search parallelization. One recent approach for exposing graph algorithm parallelization is through a linear algebra formulation and the use of the matrix multiply operation, which conceptually is equivalent to a massively parallel graph traversal. This approach has several benefits, including 1) a mathematically rigorous foundation, and 2) the ability to leverage specialized linear algebra accelerators and high-performance libraries. In this paper, we explore and define a linear algebra methodology for performing exact subgraph counting and matching for 4-vertex subgraphs excluding the clique. Matches on these simple subgraphs can be joined as components for a larger subgraph. With thorough analysis, we demonstrate that the linear algebra formulation leverages path aggregation, which allows it to be 2x to 5x more efficient in traversing the search space and compressing the results as compared to tree-based subgraph matching techniques.
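The path-aggregation idea can be sketched for one 4-vertex subgraph, the 4-cycle. Entry (i, j) of the squared adjacency matrix aggregates all 2-step paths between i and j, and any two such paths close a 4-cycle. This is a textbook identity shown for illustration, not the paper's algorithm; the dense list-of-lists matrix and the function name are assumptions.

```python
from itertools import combinations


def count_four_cycles(adjacency):
    """Count 4-cycles in a simple undirected graph.

    adjacency: symmetric 0/1 matrix (list of lists) with a zero diagonal.
    For each vertex pair {i, j}, (A @ A)[i][j] is the number of common
    neighbors, i.e. the number of 2-paths between i and j. Choosing any two
    of those paths yields a 4-cycle with i and j as opposite corners. Each
    4-cycle has two pairs of opposite corners, hence the final division by 2.
    """
    n = len(adjacency)
    total = 0
    for i, j in combinations(range(n), 2):
        common = sum(adjacency[i][k] * adjacency[k][j] for k in range(n))
        total += common * (common - 1) // 2
    return total // 2


square = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
print(count_four_cycles(square))  # 1
```

Because the count depends only on the aggregated 2-path totals, the individual paths never need to be enumerated, which is the kind of compression a tree-based search does not get for free.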

Towards a distributed framework for multi-agent reinforcement learning research

Summary

Some of the most important publications in deep reinforcement learning over the last few years have been fueled by access to massive amounts of computation through large-scale distributed systems. The success of these approaches in achieving human-expert-level performance on several complex video-game environments has motivated further exploration into the limits of these approaches as computation increases. In this paper, we present a distributed RL training framework designed for supercomputing infrastructures such as the MIT SuperCloud. We review a collection of challenging learning environments, such as Google Research Football, StarCraft II, and Multi-Agent MuJoCo, which are at the frontier of reinforcement learning research. We provide results on these environments that illustrate the current state of the field on these problems. Finally, we also quantify and discuss the computational requirements needed for performing RL research by enumerating all experiments performed on these environments.

Augmented Annotation Phase 3

Published in:
MIT Lincoln Laboratory Report TR-1248

Summary

Automated visual object detection is an important capability in reducing the burden on human operators in many DoD applications. To train modern deep learning algorithms to recognize desired objects, the algorithms must be "fed" more than 1000 labeled images (for 55%–85% accuracy according to project Maven - Oct 2017 O6, Working Group slide 27) of each particular object. The task of labeling training data for use in machine learning algorithms is human-intensive, requires special software, and takes a great deal of time. Estimates from ImageNet, a widely used and publicly available visual object detection dataset, indicate that humans generated four annotations per minute in the overall production of ImageNet annotations. DoD's need is to reduce direct object-by-object human labeling, particularly in the video domain, where data quantity can be significant. The Augmented Annotations System addresses this need by leveraging a small amount of human annotation effort to propagate human-initiated annotations through video to build an initial labeled dataset for training an object detector, and by utilizing an automated object detector in an iterative loop to assist humans in pre-annotating new datasets.

Toward an autonomous aerial survey and planning system for humanitarian aid and disaster response

Summary

In this paper, we propose an integrated system concept for autonomously surveying and planning emergency response for areas impacted by natural disasters. Referred to as AASAPS-HADR, this system is composed of a network of ground stations and autonomous aerial vehicles interconnected by an ad hoc emergency communication network. The system objectives are three-fold: to provide situational awareness of the evolving disaster event, to generate dispatch and routing plans for emergency vehicles, and to provide continuous communication networks which augment pre-existing communication infrastructure that may have been damaged or destroyed. Because it has received little development in previous literature, we give particular emphasis to the situational awareness objective of disaster response by proposing an autonomous aerial survey that is tasked with assessing damage to existing road networks, detecting and locating human victims, and providing a cursory assessment of casualty types that can be used to inform medical response priorities. We provide a high-level system design concept, identify existing AI perception and planning algorithms that most closely suit our purposes as well as technology gaps within those algorithms, and provide initial experimental results for non-contact health monitoring using real-time pose recognition algorithms running on an NVIDIA Jetson TX2 mounted on board a quadrotor UAV. Finally, we provide technology development recommendations for future phases of the AASAPS-HADR system.