A handful of recent works have argued for a connection between machine learning and causality. In a reverse thought process, starting from the grounding of mental models in causal models, we strengthen these initial works with results suggesting that explainable AI (XAI) essentially requires machine learning to learn models that are causally consistent with the task at hand. By recognizing how human mental models (HMMs) are naturally represented by the Pearlian Structural Causal Model (SCM), we make two key observations through the construction of an example metric space for linear SCMs: first, that the notion of a "true" data-underlying SCM is justified, and second, that an aggregation of human-derived SCMs might point to said "true" SCM. Motivated by the implications of these insights, we conclude with a third observation, arguing that interpretations derived from HMMs must imply interpretability in the SCM framework. Following this intuition, we present an original derivation that uses these previously established first principles to reveal a human-readable interpretation scheme consistent with the given SCM, justifying the name Structural Causal Interpretations (SCIs). Going further, we analyze these SCIs and their mathematical properties both theoretically and empirically. We prove that any existing graph induction method (GIM) is in fact interpretable in the SCI sense. Our first experiment (E1) assesses the quality of such GIM-based SCIs. In (E2) we observe evidence for our conjecture on the improved sample efficiency of SCI-based learning. For (E3) we conduct a user study (N=22) and observe that human-based SCIs are superior to GIM-based ones, corroborating our initial hypothesis.