2311.05014
Cited By
Interpreting Pretrained Language Models via Concept Bottlenecks
8 November 2023
Zhen Tan
Lu Cheng
Song Wang
Yuan Bo
Wenlin Yao
Huan Liu
LRM
Papers citing
"Interpreting Pretrained Language Models via Concept Bottlenecks"
17 / 17 papers shown
Title
Intrinsic Barriers to Explaining Deep Foundation Models
Zhen Tan
Huan Liu
AI4CE
22
0
0
21 Apr 2025
Concept Layers: Enhancing Interpretability and Intervenability via LLM Conceptualization
Or Raphael Bidusa
Shaul Markovitch
65
0
0
20 Feb 2025
VLG-CBM: Training Concept Bottleneck Models with Vision-Language Guidance
Divyansh Srivastava
Ge Yan
Tsui-Wei Weng
VLM
67
8
0
17 Jan 2025
Concept Bottleneck Language Models For protein design
Aya Abdelsalam Ismail
Tuomas Oikarinen
Amy Wang
Julius Adebayo
Samuel Stanton
...
J. Kleinhenz
Allen Goodman
H. C. Bravo
Kyunghyun Cho
Nathan C. Frey
45
4
0
09 Nov 2024
Enforcing Interpretability in Time Series Transformers: A Concept Bottleneck Framework
Angela van Sprang
Erman Acar
Willem Zuidema
AI4TS
51
1
0
08 Oct 2024
Model Attribution in LLM-Generated Disinformation: A Domain Generalization Approach with Supervised Contrastive Learning
Alimohammad Beigi
Zhen Tan
Nivedh Mudiam
Canyu Chen
Kai Shu
Huan Liu
DeLMO
41
2
0
31 Jul 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
42
10
0
27 Jul 2024
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
Song Wang
Peng Wang
Tong Zhou
Yushun Dong
Zhen Tan
Jundong Li
CoGe
56
7
0
02 Jul 2024
Conceptual Learning via Embedding Approximations for Reinforcing Interpretability and Transparency
Maor Dikter
Tsachi Blau
Chaim Baskin
43
0
0
13 Jun 2024
Facial Affective Behavior Analysis with Instruction Tuning
Yifan Li
Anh Dao
Wentao Bao
Zhen Tan
Tianlong Chen
Huan Liu
Yu Kong
CVBM
65
15
0
07 Apr 2024
Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach
Zhen Tan
Jie Peng
Tianlong Chen
Huan Liu
37
6
0
08 Mar 2024
Large Language Models for Data Annotation: A Survey
Zhen Tan
Dawei Li
Song Wang
Alimohammad Beigi
Bohan Jiang
Amrita Bhattacharjee
Mansooreh Karami
Wenlin Yao
Lu Cheng
Huan Liu
SyDa
56
50
0
21 Feb 2024
Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention
Zhen Tan
Tianlong Chen
Zhenyu Zhang
Huan Liu
52
14
0
22 Dec 2023
Zeroth-Order Optimization Meets Human Feedback: Provable Learning via Ranking Oracles
Zhiwei Tang
Dmitry Rybin
Tsung-Hui Chang
ALM
DiffM
39
26
0
07 Mar 2023
Causal Proxy Models for Concept-Based Model Explanations
Zhengxuan Wu
Karel D'Oosterlinck
Atticus Geiger
Amir Zur
Christopher Potts
MILM
83
35
0
28 Sep 2022
Text Summarization with Pretrained Encoders
Yang Liu
Mirella Lapata
MILM
258
1,433
0
22 Aug 2019
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
296
31,267
0
16 Jan 2013