
Delta-Audit: Explaining What Changes When Models Change

Main:5 Pages
1 Figures
Bibliography:2 Pages
1 Tables
Abstract

Model updates (new hyperparameters, kernels, depths, solvers, or data) change performance, but the \emph{reason} often remains opaque. We introduce \textbf{Delta-Attribution} (\mbox{$\Delta$-Attribution}), a model-agnostic framework that explains \emph{what changed} between versions $A$ and $B$ by differencing per-feature attributions: $\Delta\phi(x)=\phi_B(x)-\phi_A(x)$. We evaluate $\Delta\phi$ with a \emph{$\Delta$-Attribution Quality Suite} covering magnitude/sparsity (L1, Top-$k$, entropy), agreement/shift (rank-overlap@10, Jensen--Shannon divergence), behavioural alignment (Delta Conservation Error, DCE; Behaviour--Attribution Coupling, BAC; CO$\Delta$F), and robustness (noise, baseline sensitivity, grouped occlusion). Instantiated via fast occlusion/clamping in standardized space with a class-anchored margin and baseline averaging, we audit 45 settings: five classical families (Logistic Regression, SVC, Random Forests, Gradient Boosting, $k$NN), three datasets (Breast Cancer, Wine, Digits), and three A/B pairs per family. \textbf{Findings.} Inductive-bias changes yield large, behaviour-aligned deltas (e.g., SVC poly$\!\rightarrow$rbf on Breast Cancer: BAC$\approx$0.998, DCE$\approx$6.6; Random Forest feature-rule swap on Digits: BAC$\approx$0.997, DCE$\approx$7.5), while ``cosmetic'' tweaks (SVC \texttt{gamma=scale} vs.\ \texttt{auto}, $k$NN search) show rank-overlap@10$=1.0$ and DCE$\approx$0. The largest redistribution appears for deeper GB on Breast Cancer (JSD$\approx$0.357). $\Delta$-Attribution offers a lightweight update audit that complements accuracy by distinguishing benign changes from behaviourally meaningful or risky reliance shifts.
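To make the differencing step concrete, the following is a minimal sketch of $\Delta\phi(x)=\phi_B(x)-\phi_A(x)$ computed with single-feature occlusion, together with three of the simpler suite quantities (L1 magnitude, rank-overlap@$k$, Jensen--Shannon divergence). It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy linear "margins" standing in for models $A$ and $B$, the all-zeros baseline, the normalisation of $|\phi|$ into a distribution, and the base-2 logarithm in the JSD are all choices made here. The abstract's class-anchored margin, baseline averaging, DCE, BAC, and CO$\Delta$F are not reproduced.

```python
import math

def occlusion_attribution(f, x, baseline):
    """phi_i = f(x) - f(x with feature i clamped to baseline[i])."""
    fx = f(x)
    phi = []
    for i in range(len(x)):
        x_occluded = list(x)
        x_occluded[i] = baseline[i]
        phi.append(fx - f(x_occluded))
    return phi

def delta_attribution(f_a, f_b, x, baseline):
    """Return (delta_phi, phi_a, phi_b) with delta_phi = phi_b - phi_a."""
    phi_a = occlusion_attribution(f_a, x, baseline)
    phi_b = occlusion_attribution(f_b, x, baseline)
    delta = [b - a for a, b in zip(phi_a, phi_b)]
    return delta, phi_a, phi_b

def l1_magnitude(delta):
    """Total absolute attribution change."""
    return sum(abs(v) for v in delta)

def rank_overlap_at_k(phi_a, phi_b, k):
    """Fraction of shared features among the top-k by |phi| in each version."""
    def top_k(phi):
        order = sorted(range(len(phi)), key=lambda i: -abs(phi[i]))
        return set(order[:k])
    return len(top_k(phi_a) & top_k(phi_b)) / k

def to_distribution(phi):
    """Assumed normalisation: |phi| scaled to sum to 1 (uniform if all zero)."""
    total = sum(abs(v) for v in phi)
    if total == 0:
        return [1.0 / len(phi)] * len(phi)
    return [abs(v) / total for v in phi]

def jensen_shannon_divergence(p, q):
    """JSD with base-2 logs, so the value lies in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Toy stand-ins for versions A and B: linear margins with different weights.
w_a = [1.0, 2.0, 0.0]
w_b = [1.0, 0.0, 2.0]
f_a = lambda x: sum(w * v for w, v in zip(w_a, x))
f_b = lambda x: sum(w * v for w, v in zip(w_b, x))

x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]  # assumed baseline; the abstract averages over baselines

delta, phi_a, phi_b = delta_attribution(f_a, f_b, x, baseline)
print("delta_phi:", delta)
print("L1:", l1_magnitude(delta))
print("rank-overlap@2:", rank_overlap_at_k(phi_a, phi_b, 2))
print("JSD:", jensen_shannon_divergence(to_distribution(phi_a), to_distribution(phi_b)))
```

In this toy case both versions produce the same output at `x`, yet version B has moved its reliance from the second feature to the third. Accuracy alone would not reveal that shift, while the non-zero L1 magnitude, reduced rank overlap, and positive JSD do.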
