arXiv: 2305.02160
Explaining Language Models' Predictions with High-Impact Concepts
3 May 2023
Ruochen Zhao, Shafiq R. Joty, Yongjie Wang, Tan Wang
Tags: LRM

Papers citing "Explaining Language Models' Predictions with High-Impact Concepts" (7 of 7 papers shown)

On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon, Roi Reichart
27 Jul 2024

Explaining black box text modules in natural language with language models
Chandan Singh, Aliyah R. Hsu, Richard Antonello, Shailee Jain, Alexander G. Huth, Bin Yu, Jianfeng Gao
Tags: MILM
17 May 2023

Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
24 Feb 2021

On Completeness-aware Concept-Based Explanations in Deep Neural Networks
Chih-Kuan Yeh, Been Kim, Sercan Ö. Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar
Tags: FAtt
17 Oct 2019

What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
3 May 2018

A causal framework for explaining the predictions of black-box sequence-to-sequence models
David Alvarez-Melis, Tommi Jaakkola
Tags: CML
6 Jul 2017

Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez, Been Kim
Tags: XAI, FaML
28 Feb 2017