Partially controlled Markov decision processes for collision avoidance systems
Summary
Deciding when and how to avoid collisions in stochastic environments requires accounting for the likelihood and relative costs of future sequences of outcomes in response to different sequences of actions. Prior work has investigated formulating the problem as a Markov decision process, discretizing the state space, and solving for the optimal strategy using dynamic programming. Experiments have shown that such an approach can be very effective, but scaling to higher-dimensional problems is challenging due to the exponential growth of the discrete state space. This paper presents an approach that can greatly reduce the complexity of computing the optimal strategy in problems where only some of the dimensions are controllable. The approach is demonstrated on an airborne collision avoidance problem in which the system must recommend maneuvers to an imperfect pilot.
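To make the dynamic programming step concrete, the following is a minimal sketch, not the implementation used in this paper, of finite-horizon value iteration over a discretized MDP. The function name value_iteration, the transition tensor T, the reward array R, and the toy two-state problem are all hypothetical placeholders; the decomposition of controlled and uncontrolled dimensions that this paper proposes is not shown here.

    import numpy as np

    def value_iteration(T, R, gamma=1.0, horizon=50):
        # T: (A, S, S) transition probabilities, T[a, s, t] = P(t | s, a)
        # R: (S, A) immediate rewards (collision and maneuver costs as negatives)
        A, S, _ = T.shape
        U = np.zeros(S)  # terminal value function
        for _ in range(horizon):
            # Bellman backup over every discrete (state, action) pair:
            # Q[s, a] = R[s, a] + gamma * sum_t T[a, s, t] * U[t]
            Q = R + gamma * np.tensordot(T, U, axes=1).T
            U = Q.max(axis=1)
        policy = Q.argmax(axis=1)  # greedy action in each discrete state
        return U, policy

    # Hypothetical 2-state, 2-action example: state 1 is an absorbing
    # "conflict" state, and maneuvering lowers the chance of entering it.
    T = np.array([
        [[0.90, 0.10],   # action 0 ("continue") from state 0
         [0.00, 1.00]],  # ... from state 1 (absorbing)
        [[0.99, 0.01],   # action 1 ("maneuver") from state 0
         [0.00, 1.00]],
    ])
    R = np.array([
        [0.0, -0.1],     # small cost for maneuvering while clear
        [-1.0, -1.0],    # large cost of being in conflict
    ])
    U, policy = value_iteration(T, R)

Because the lookup tables grow with the product of the grid sizes along every state dimension, each added dimension multiplies both the memory and the per-iteration cost of the backup above, which is the scaling problem the partially-controlled formulation is meant to mitigate.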