Counterfactual Explanations & Adversarial Examples -- Common Grounds, Essential Differences, and Potential Transfers

Minds and Machines (MM), 2020

11 September 2020

Abstract

The same optimization problem underlies counterfactual explanations (CEs) and adversarial examples (AEs). While this is well known, the relationship between the two at the conceptual level remains unclear. The present paper provides exactly the missing conceptual link. We compare CEs and AEs with respect to their philosophical basis, aims, and modeling techniques. We argue that CEs are a more general object-class than AEs. In particular, we introduce the conceptual distinction between feasible and contesting CEs and show that AEs correspond to the latter.
