Can You Explain That? Lucid Explanations Help Human-AI Collaborative Image Retrieval

While there have been many proposals for making AI algorithms transparent and explainable, few have attempted to evaluate the impact of AI-generated explanations on human performance in human-AI collaborative tasks. To bridge this gap, we propose a Twenty-Questions-style collaborative image retrieval game, Explanation-assisted Guess Which (ExAG), as a method of evaluating the efficacy of explanations (visual evidence or textual justification) in the context of Visual Question Answering (VQA). In ExAG, a human user must guess an image secretly picked by the VQA agent by asking it natural language questions. We show that when the AI explains its answers, users succeed more often in guessing the secret image correctly. Furthermore, we show that while good explanations improve human performance, incorrect explanations can degrade it relative to no-explanation games. Notably, even a few correct explanations can readily improve human performance in game rounds where the AI's answers are mostly incorrect, compared to no-explanation games. Our experiments therefore show that ExAG is an effective means of evaluating the efficacy of AI-generated explanations in human-AI collaborative tasks.
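The abstract only sketches the game protocol, so the following is a minimal, hypothetical Python sketch of one ExAG round: a toy stand-in agent (ToyVQAAgent, play_exag, and the canned answers are all illustrative names, not from the paper) holds a secret image, answers up to twenty user questions with an optional explanation, and the user then guesses the image.

```python
import random

# Hypothetical stand-in for a real VQA model; the paper does not specify
# the underlying agent, so answer() just returns canned strings here.
class ToyVQAAgent:
    def __init__(self, image_pool):
        self.image_pool = image_pool
        self.secret_image = random.choice(image_pool)

    def answer(self, question, explain=True):
        # A real agent would run VQA on self.secret_image; this fakes it.
        answer = f"answer to '{question}' about {self.secret_image}"
        explanation = f"evidence supporting '{answer}'" if explain else None
        return answer, explanation


def play_exag(agent, image_pool, max_questions=20):
    """One ExAG round: ask questions, then guess the secret image."""
    for turn in range(max_questions):
        question = input(f"Q{turn + 1} (leave blank to guess): ").strip()
        if not question:
            break
        answer, explanation = agent.answer(question)
        print("AI answer:", answer)
        if explanation:
            print("AI explanation:", explanation)

    guess = input(f"Your guess, one of {image_pool}: ").strip()
    won = guess == agent.secret_image
    print("Correct!" if won else f"Wrong; it was {agent.secret_image}")
    return won


if __name__ == "__main__":
    pool = ["img_01.jpg", "img_02.jpg", "img_03.jpg", "img_04.jpg"]
    play_exag(ToyVQAAgent(pool), pool)
```

Toggling the explain flag off reproduces the no-explanation condition described in the abstract, which is the comparison the study measures.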