Mission-Ready Reinforcement Learning

We are using reinforcement learning to train artificial intelligence to team with humans.

Reinforcement learning (RL) is a machine learning technique that trains artificial intelligence (AI) to solve complex decision problems, such as finding the optimal strategy for playing chess. Lincoln Laboratory researchers believe that RL will be a key technology for human-machine collaborative tasks, such as disaster response operations. However, before AI can serve as an effective teammate in highly complex and nuanced real-world scenarios, we must first demonstrate its teaming effectiveness in a simplified, constrained task.

Our Mission-Ready Reinforcement Learning (MeRLin) project paired human players with various AI teammates in the collaborative card game Hanabi. Our results showed that the human participants had a strong adverse subjective reaction toward a state-of-the-art RL agent across nearly all axes of trust, interpretability, and predictability. We hypothesized that this human-AI technology gap exists because RL optimizes for the wrong metrics. We then developed a new RL training paradigm for collaborative settings in which agents are trained against a mathematically diverse set of AI counterparts, using a diversity metric to teach the AI how to be a good teammate.
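The sketch below illustrates the general idea of diversity-driven partner training under stated assumptions: the toy logit-vector "policies," the Jensen-Shannon-style disagreement score, and every name in it are illustrative placeholders rather than the MeRLin algorithm itself.

```python
# Hypothetical sketch of diversity-driven partner training for a collaborative
# game. Policies, the diversity measure, and constants are placeholders.
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 5   # e.g., legal moves in a card game (placeholder)
POOL_SIZE = 8   # number of partner policies in the training pool

def policy_logits():
    """A toy 'policy' is just a logit vector over actions."""
    return rng.normal(size=N_ACTIONS)

def action_distribution(logits):
    """Softmax over action logits."""
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def pairwise_diversity(pool):
    """Mean Jensen-Shannon-style disagreement between partner policies;
    serves as the diversity score the pool is grown to maximize."""
    dists = [action_distribution(p) for p in pool]
    total, count = 0.0, 0
    for i in range(len(dists)):
        for j in range(i + 1, len(dists)):
            m = 0.5 * (dists[i] + dists[j])
            kl_i = np.sum(dists[i] * np.log(dists[i] / m))
            kl_j = np.sum(dists[j] * np.log(dists[j] / m))
            total += 0.5 * (kl_i + kl_j)
            count += 1
    return total / max(count, 1)

# Greedily grow a pool of partners, keeping candidates that raise diversity.
pool = [policy_logits()]
while len(pool) < POOL_SIZE:
    candidate = policy_logits()
    if pairwise_diversity(pool + [candidate]) >= pairwise_diversity(pool):
        pool.append(candidate)

print(f"Built a pool of {len(pool)} partners, "
      f"diversity = {pairwise_diversity(pool):.3f}")
# A separate ego agent would then be trained (e.g., with PPO) against partners
# sampled from this pool, so it learns to coordinate with many distinct styles
# of play rather than overfitting to a single counterpart.
```

In a full RL pipeline, the toy logit vectors would be replaced by learned policies and the pool would be refreshed during training, but the core design choice is the same: reward the pool for disagreeing with itself so the ego agent never sees only one style of partner.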

For next steps, we plan to integrate this new algorithm with a serious-gaming simulator and an RL framework called RLlib. This software engineering effort will enable a large suite of RL algorithms to be trained within a wide range of simulated scenarios.
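As a rough illustration, and assuming RLlib's Ray 2.x Python API with a placeholder Gymnasium environment standing in for the serious-gaming simulator, a minimal training setup might look like the sketch below; it is not the project's actual integration code.

```python
# Minimal RLlib (Ray 2.x) training sketch. "CartPole-v1" is only a placeholder
# for the simulator environment, which would be registered with RLlib and
# substituted here.
import ray
from ray.rllib.algorithms.ppo import PPOConfig

ray.init(ignore_reinit_error=True)

config = (
    PPOConfig()
    .environment("CartPole-v1")      # swap in the registered simulator env
    .training(gamma=0.99, lr=1e-4)   # standard PPO hyperparameters
)

algo = config.build()
for i in range(3):
    algo.train()                     # one training iteration per call
print(f"completed {i + 1} training iterations")

algo.stop()
ray.shutdown()
```

Because RLlib exposes many algorithms behind this same config-and-train interface, swapping PPO for another method is largely a matter of changing the config class, which is what makes training a large suite of RL algorithms against the simulator practical.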