On Completeness-aware Concept-Based Explanations in Deep Neural Networks
- FAtt

Human explanations of high-level decisions are often expressed in terms of the key concepts those decisions are based on. In this paper, we study such concept-based explainability for Deep Neural Networks (DNNs). First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is for explaining a model's prediction behavior. Next, we propose a concept discovery method that aims to infer a complete set of concepts that are additionally encouraged to be interpretable, addressing limitations of commonly used methods such as PCA and TCAV. To define an importance score for each discovered concept, we adapt game-theoretic notions to aggregate over sets and propose ConceptSHAP. On a synthetic dataset with ground-truth concept explanations, on a real-world dataset, and with a user study, we validate the effectiveness of our framework in finding concepts that are both complete in explaining the model's decisions and interpretable.
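The abstract describes ConceptSHAP as a game-theoretic aggregation over concept sets, which suggests a Shapley-value-style score with the completeness metric as the value function. Below is a minimal sketch of that idea under this assumption; `completeness` is a hypothetical helper standing in for the paper's completeness score, and the brute-force subset enumeration is for illustration only, not the authors' implementation.

```python
import math
from itertools import combinations

def concept_shap(concepts, completeness):
    """Shapley-style importance score for each concept.

    `completeness(subset)` is assumed to return a scalar measuring how well
    the given subset of concepts explains the model's predictions.
    """
    m = len(concepts)
    scores = []
    for i, c in enumerate(concepts):
        rest = concepts[:i] + concepts[i + 1:]
        s_i = 0.0
        # Weighted average of the marginal gain from adding concept c
        # to every subset S of the remaining concepts.
        for k in range(m):
            weight = math.factorial(k) * math.factorial(m - k - 1) / math.factorial(m)
            for subset in combinations(rest, k):
                s_i += weight * (
                    completeness(list(subset) + [c]) - completeness(list(subset))
                )
        scores.append(s_i)
    return scores
```

Because the number of subsets grows exponentially with the number of concepts, an exact computation like this is only practical for small concept sets; sampling-based Shapley approximations are the usual workaround.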