Attention Flows are Shapley Value Explanations

31 May 2021

Kawin Ethayarajh

Dan Jurafsky

Papers citing "Attention Flows are Shapley Value Explanations"

Title | Authors | Tag | Counts | Date
Utility-inspired Reward Transformations Improve Reinforcement Learning Training of Language Models | Roberto-Rafael Maura-Rivero, Chirag Nagpal, Roma Patel, Francesco Visin | | 51 1 0 | 08 Jan 2025
Attention Meets Post-hoc Interpretability: A Mathematical Perspective | Gianluigi Lopardo, F. Precioso, Damien Garreau | | 21 4 0 | 05 Feb 2024
Game Theory Solutions in Sensor-Based Human Activity Recognition: A Review | M. Shayesteh, Behrooz Sharokhzadeh, B. Masoumi | | 18 3 0 | 09 Nov 2023
Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models | Yifan Hou, Jiaoda Li, Yu Fei, Alessandro Stolfo, Wangchunshu Zhou, Guangtao Zeng, Antoine Bosselut, Mrinmaya Sachan | LRM | 30 40 0 | 23 Oct 2023
Morphosyntactic probing of multilingual BERT models | Judit Ács, Endre Hamerlik, Roy Schwartz, Noah A. Smith, András Kornai | | 35 9 0 | 09 Jun 2023
Weakly Supervised Learning Significantly Reduces the Number of Labels Required for Intracranial Hemorrhage Detection on Head CT | Jacopo Teneggi, P. Yi, Jeremias Sulam | | 32 3 0 | 29 Nov 2022
IsoVec: Controlling the Relative Isomorphism of Word Embedding Spaces | Kelly Marchisio, Neha Verma, Kevin Duh, Philipp Koehn | | 32 6 0 | 11 Oct 2022
Attention Flows for General Transformers | Niklas Metzger, Christopher Hahn, Julian Siber, Frederik Schmitt, Bernd Finkbeiner | | 42 0 0 | 30 May 2022
Diagnosing AI Explanation Methods with Folk Concepts of Behavior | Alon Jacovi, Jasmijn Bastings, Sebastian Gehrmann, Yoav Goldberg, Katja Filippova | | 36 15 0 | 27 Jan 2022