Post Hoc Explanations of Language Models Can Improve Language Models



19 May 2023

Satyapriya Krishna

Jiaqi Ma

Asma Ghandeharioun

Himabindu Lakkaraju


Papers citing "Post Hoc Explanations of Language Models Can Improve Language Models"


Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety
Seongmin Lee, Aeree Cho, Grace C. Kim, ShengYun Peng, Mansi Phute, Duen Horng Chau
LM&MA, AI4CE; 84 0 0; 05 Jun 2025

FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation
Qianli Wang, Nils Feldhus, Simon Ostermann, Luis Felipe Villa-Arenas, Sebastian Möller, Vera Schmitt
AAML; 128 1 0; 01 Jan 2025

Interplay between Federated Learning and Explainable Artificial Intelligence: a Scoping Review
Luis M. Lopez-Ramos, Florian Leiser, Aditya Rastogi, Steven Hicks, Inga Strümke, V. Madai, Tobias Budig, Ali Sunyaev, A. Hilbert
248 3 0; 07 Nov 2024

Simple Is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation
Mufei Li, Siqi Miao, Pan Li
RALM; 205 17 0; 28 Oct 2024

Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Personalization
Swarnadeep Saha, Peter Hase, Mohit Bansal
LRM; 80 11 0; 15 Jun 2023

The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective
Satyapriya Krishna, Tessa Han, Alex Gu, Steven Wu, S. Jabbari, Himabindu Lakkaraju
284 197 0; 03 Feb 2022

"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin
FAtt, FaML; 1.3K 17,237 0; 16 Feb 2016