Mission-Ready Reinforcement Learning
Reinforcement learning (RL) is a machine learning technique that trains artificial intelligence (AI) to solve complex decision problems, such as finding the optimal strategy for playing chess. Lincoln Laboratory researchers believe that RL will be a key technology for human-machine collaborative tasks, such as disaster response operations. However, for AI to be an effective teammate in highly complex and nuanced real-world scenarios, we must first demonstrate AI teaming effectiveness in a simplified, constrained task.
Our Mission-Ready Reinforcement Learning (MeRLin) project paired human players with various AI teammates in the collaborative card game Hanabi. Our results showed that the human participants had a strong adverse subjective reaction toward a state-of-the-art RL agent across nearly all axes of trust, interpretability, and predictability. We hypothesized that the current human-AI technology gap exists because RL is optimizing for the wrong metrics. We then developed a new RL training paradigm for collaborative settings in which agents are trained based on a metric known as diversity, which teaches AI how to be a good teammate by training it with a mathematically diverse set of AI counterparts.
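The summary above does not specify how diversity is computed, so the following is only an illustrative sketch of one way a "mathematically diverse" set of counterparts might be scored. It assumes each counterpart policy maps a game state to a probability distribution over actions, and it scores a population by the mean pairwise Jensen-Shannon divergence between those distributions. The function names, the toy policies, and the choice of Jensen-Shannon divergence are all assumptions made for illustration, not the MeRLin project's actual metric.

```python
import math
from itertools import combinations


def js_divergence(p, q):
    """Jensen-Shannon divergence (in bits) between two discrete distributions."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0
        # because b is the midpoint mixture of the two distributions.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def population_diversity(policies, states):
    """Mean pairwise JS divergence of action distributions, averaged over states.

    Assumption: each policy is a callable returning a list of action
    probabilities for a given state.
    """
    pairs = list(combinations(policies, 2))
    total = sum(js_divergence(a(s), b(s)) for a, b in pairs for s in states)
    return total / (len(pairs) * len(states))


# Hypothetical toy policies over three actions (say hint, play, discard).
def always_hint(state):
    return [1.0, 0.0, 0.0]


def always_play(state):
    return [0.0, 1.0, 0.0]


def uniform(state):
    return [1 / 3, 1 / 3, 1 / 3]


states = [0, 1, 2]  # placeholder state identifiers
identical = population_diversity([uniform, uniform], states)
mixed = population_diversity([always_hint, always_play, uniform], states)
print(f"identical population: {identical:.3f}, mixed population: {mixed:.3f}")
```

Under this assumed score, a population of identical counterparts has zero diversity, while counterparts that behave differently in the same states score higher. A training loop could then prefer counterpart populations with higher scores, so that the learning agent is exposed to a broader range of teammate behavior.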
As next steps, we plan to integrate this new algorithm with a serious-gaming simulator and an RL framework called RLlib. This software engineering effort will enable a large suite of RL algorithms to be trained within a wide range of simulated scenarios.