Human-like Decision Making for Autonomous Driving via Adversarial Inverse Reinforcement Learning

Making human-like decisions in complex driving environments is a challenging task for autonomous agents. Imitation Learning, or learning-from-demonstration, methods have shown great potential for achieving this goal. Some state-of-the-art studies apply Generative Adversarial Imitation Learning (GAIL) to learn sequential decision-making and control policies. While GAIL can directly learn a policy, it cannot recover a reward function, which is considered robust and adaptable to environment changes. Adversarial Inverse Reinforcement Learning (AIRL) is another learning-from-demonstration method that achieves similar benefits to GAIL while also learning the reward function and the policy simultaneously. The original AIRL work was demonstrated only in single-agent environments such as maze navigation and ant running tasks in OpenAI Gym. In this paper, we augment AIRL by concatenating semantic reward terms into the learning framework to improve and stabilize its performance, and then extend it to a more practical but challenging setting, i.e., decision-making in a highly interactive driving environment. Four performance evaluation metrics are proposed, and the method is compared with several Imitation Learning based and Reinforcement Learning based methods. Simulation results show that the augmented AIRL outperforms all the other methods, and the trained vehicle agent performs decision-making behaviors comparable to those of the experts.
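As a rough illustration of the idea (not the paper's exact formulation), the sketch below shows an AIRL-style discriminator in PyTorch whose learned reward is combined with hand-crafted semantic terms. The function names (`semantic_reward`, `policy_reward`), the feature indices, and the penalty weights are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class AIRLDiscriminator(nn.Module):
    """AIRL discriminator D = exp(f) / (exp(f) + pi(a|s)),
    with f(s, a, s') = g(s, a) + gamma * h(s') - h(s)."""
    def __init__(self, state_dim, action_dim, hidden=64, gamma=0.99):
        super().__init__()
        self.gamma = gamma
        # g(s, a): state-action reward approximator
        self.g = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))
        # h(s): state-only shaping potential
        self.h = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def f(self, s, a, s_next):
        # f(s, a, s') = g(s, a) + gamma * h(s') - h(s)
        return (self.g(torch.cat([s, a], -1))
                + self.gamma * self.h(s_next) - self.h(s))

    def forward(self, s, a, s_next, log_pi):
        # Compute logits such that sigmoid(logits) = D; subtracting
        # log pi(a|s) is numerically stabler than forming D directly.
        return self.f(s, a, s_next).squeeze(-1) - log_pi

def semantic_reward(s_next):
    # Hypothetical hand-crafted semantic terms (e.g., collision and
    # off-road penalties); the paper's actual terms are not shown here.
    collision = s_next[..., 0]  # assumed binary indicator feature
    off_road = s_next[..., 1]   # assumed binary indicator feature
    return -10.0 * collision - 5.0 * off_road

def policy_reward(disc, s, a, s_next, log_pi, w=1.0):
    # AIRL reward log D - log(1 - D) equals the logits above;
    # the semantic terms are concatenated (added with weight w).
    with torch.no_grad():
        r_learned = disc(s, a, s_next, log_pi)
    return r_learned + w * semantic_reward(s_next)
```

In standard AIRL the discriminator is trained with binary cross-entropy to separate expert transitions from policy transitions, and the resulting reward is fed to an on-policy RL algorithm such as PPO; under this sketch's assumptions, the fixed semantic terms simply shift that learned reward signal.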