Explainability for Large Language Models: A Survey


2 September 2023
Haiyan Zhao
Hanjie Chen
Fan Yang
Ninghao Liu
Huiqi Deng
Hengyi Cai
Shuaiqiang Wang
Dawei Yin
Mengnan Du
    LRM

Papers citing "Explainability for Large Language Models: A Survey"

50 / 67 papers shown
Retrieval Augmented Generation Evaluation for Health Documents
Mario Ceresa
Lorenzo Bertolini
Valentin Comte
Nicholas Spadaro
Barbara Raffael
...
Sergio Consoli
Amalia Muñoz Piñeiro
Alex Patak
Maddalena Querci
Tobias Wiesenthal
RALM
3DV
39
0
1
07 May 2025
Privacy Risks and Preservation Methods in Explainable Artificial Intelligence: A Scoping Review
Sonal Allana
Mohan Kankanhalli
Rozita Dara
32
0
0
05 May 2025
LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures
Francisco Aguilera-Martínez
Fernando Berzal
PILM
52
0
0
02 May 2025
Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods
Mahdi Dhaini
Ege Erdogan
Nils Feldhus
Gjergji Kasneci
46
0
0
02 May 2025
XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs
Marco Arazzi
Vignesh Kumar Kembu
Antonino Nocera
V. P.
82
0
0
30 Apr 2025
Bi-directional Model Cascading with Proxy Confidence
David Warren
Mark Dras
44
0
0
27 Apr 2025
Beyond Public Access in LLM Pre-Training Data
Sruly Rosenblat
Tim O'Reilly
Ilan Strauss
MLAU
57
0
0
24 Apr 2025
An evaluation of LLMs and Google Translate for translation of selected Indian languages via sentiment and semantic analyses
Rohitash Chandra
Aryan Chaudhary
Yeshwanth Rayavarapu
44
0
0
27 Mar 2025
TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention
Jinhao Duan
Fei Kong
Hao-Ran Cheng
James Diffenderfer
B. Kailkhura
Lichao Sun
Xiaofeng Zhu
Xiaoshuang Shi
Kaidi Xu
141
0
0
13 Mar 2025
I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data?
Yuhang Liu
Dong Gong
Erdun Gao
Zhen Zhang
Biwei Huang
Mingming Gong
Anton van den Hengel
Javen Qinfeng Shi
154
0
0
12 Mar 2025
Statistical Deficiency for Task Inclusion Estimation
Loïc Fosse
Frédéric Béchet
Benoit Favre
Géraldine Damnati
Gwénolé Lecorvé
Maxime Darrin
Philippe Formont
Pablo Piantanida
136
0
0
07 Mar 2025
Unveiling Scoring Processes: Dissecting the Differences between LLMs and Human Graders in Automatic Scoring
Xuansheng Wu
Padmaja Pravin Saraf
Gyeong-Geon Lee
Ehsan Latif
Ninghao Liu
Xiaoming Zhai
55
4
0
24 Feb 2025
Exploring Translation Mechanism of Large Language Models
Hongbin Zhang
Kehai Chen
Xuefeng Bai
Xiucheng Li
Yang Xiang
Min Zhang
59
1
0
17 Feb 2025
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM
Yueying Zou
Peipei Li
Zekun Li
Huaibo Huang
Xing Cui
Xuannan Liu
Chenghanyu Zhang
Ran He
DeLMO
120
2
0
07 Feb 2025
CueTip: An Interactive and Explainable Physics-aware Pool Assistant
Sean Memery
Kevin Denamganai
Jiaxin Zhang
Zehai Tu
Yiwen Guo
Kartic Subr
LRM
42
0
0
30 Jan 2025
Citations and Trust in LLM Generated Responses
Yifan Ding
Matthew Facciani
Amrit Poudel
Ellen Joyce
Salvador Aguiñaga
Balaji Veeramani
Sanmitra Bhattacharya
Tim Weninger
HILM
41
3
0
03 Jan 2025
Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models
Yanwen Huang
Yong Zhang
Ning Cheng
Zhitao Li
Shaojun Wang
Jing Xiao
86
0
0
02 Jan 2025
How Do Artificial Intelligences Think? The Three Mathematico-Cognitive Factors of Categorical Segmentation Operated by Synthetic Neurons
Michael Pichat
William Pogrund
Armanush Gasparian
Paloma Pichat
Samuel Demarchi
Michael Veillet-Guillem
42
3
0
26 Dec 2024
When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations
Huaizhi Ge
Yiming Li
Qifan Wang
Yongfeng Zhang
Ruixiang Tang
AAML
SILM
81
0
0
19 Nov 2024
Attention Tracker: Detecting Prompt Injection Attacks in LLMs
Kuo-Han Hung
Ching-Yun Ko
Ambrish Rawat
I-Hsin Chung
Winston H. Hsu
Pin-Yu Chen
49
7
0
01 Nov 2024
Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers
Lam Nguyen Tung
Steven Cho
Xiaoning Du
Neelofar Neelofar
Valerio Terragni
Stefano Ruberto
Aldeida Aleti
136
2
0
30 Oct 2024
Large Language Model-assisted Speech and Pointing Benefits Multiple 3D Object Selection in Virtual Reality
Junlong Chen
Jens Grubert
Per Ola Kristensson
21
0
0
28 Oct 2024
On the Role of Attention Heads in Large Language Model Safety
Z. Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Kun Wang
Yang Liu
Junfeng Fang
Yongbin Li
59
5
0
17 Oct 2024
Pixology: Probing the Linguistic and Visual Capabilities of Pixel-based Language Models
Kushal Tatariya
Vladimir Araujo
Thomas Bauwens
Miryam de Lhoneux
VLM
33
0
0
15 Oct 2024
Output Scouting: Auditing Large Language Models for Catastrophic Responses
Andrew Bell
João Fonseca
KELM
51
1
0
04 Oct 2024
F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI
Xu Zheng
Farhad Shirani
Zhuomin Chen
Chaohao Lin
Wei Cheng
Wenbo Guo
Dongsheng Luo
AAML
28
0
0
03 Oct 2024
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
Haiyan Zhao
Heng Zhao
Bo Shen
Ali Payani
Fan Yang
Mengnan Du
59
2
0
30 Sep 2024
SynSUM -- Synthetic Benchmark with Structured and Unstructured Medical Records
Paloma Rabaey
Henri Arno
Stefan Heytens
Thomas Demeester
31
1
0
13 Sep 2024
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
Wei Chen
Zhen Huang
Liang Xie
Binbin Lin
Houqiang Li
...
Deng Cai
Yonggang Zhang
Wenxiao Wang
Xu Shen
Jieping Ye
51
6
0
03 Sep 2024
Relation Also Knows: Rethinking the Recall and Editing of Factual Associations in Auto-Regressive Transformer Language Models
Xiyu Liu
Zhengxiao Liu
Naibin Gu
Zheng-Shen Lin
Wanli Ma
Ji Xiang
Weiping Wang
KELM
44
0
0
27 Aug 2024
Visual Agents as Fast and Slow Thinkers
Guangyan Sun
Mingyu Jin
Zhenting Wang
Cheng-Long Wang
Siqi Ma
Qifan Wang
Ying Nian Wu
Dongfang Liu
LLMAG
LRM
77
13
0
16 Aug 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
38
10
0
27 Jul 2024
MAVEN-Fact: A Large-scale Event Factuality Detection Dataset
Chunyang Li
Hao Peng
Xiaozhi Wang
Y. Qi
Lei Hou
Bin Xu
Juanzi Li
HILM
35
1
0
22 Jul 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
38
18
0
08 Jul 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
82
19
0
02 Jul 2024
When Search Engine Services meet Large Language Models: Visions and Challenges
Haoyi Xiong
Jiang Bian
Yuchen Li
Xuhong Li
Mengnan Du
Shuaiqiang Wang
Dawei Yin
Sumi Helal
53
28
0
28 Jun 2024
Applications of Generative AI in Healthcare: algorithmic, ethical, legal and societal considerations
Onyekachukwu R. Okonji
Kamol Yunusov
Bonnie Gordon
MedIm
41
3
0
15 Jun 2024
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
Zhenhong Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Yongbin Li
24
26
0
09 Jun 2024
Generative AI Voting: Fair Collective Choice is Resilient to LLM Biases and Inconsistencies
Srijoni Majumdar
Edith Elkind
Evangelos Pournaras
SyDa
49
1
0
31 May 2024
Large Language Models Cannot Explain Themselves
Advait Sarkar
LRM
37
7
0
07 May 2024
Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns
Constantinos Patsakis
Fran Casino
Nikolaos Lykousas
39
12
0
30 Apr 2024
Transformers for molecular property prediction: Lessons learned from the past five years
Afnan Sultan
Jochen Sieg
M. Mathea
Andrea Volkamer
AI4CE
29
10
0
05 Apr 2024
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara
F. Breitinger
Mark Scanlon
49
8
0
29 Feb 2024
The RL/LLM Taxonomy Tree: Reviewing Synergies Between Reinforcement Learning and Large Language Models
M. Pternea
Prerna Singh
Abir Chakraborty
Y. Oruganti
M. Milletarí
Sayli Bapat
Kebei Jiang
OffRL
18
7
0
02 Feb 2024
Black-Box Access is Insufficient for Rigorous AI Audits
Stephen Casper
Carson Ezell
Charlotte Siegmann
Noam Kolt
Taylor Lynn Curtis
...
Michael Gerovitch
David Bau
Max Tegmark
David M. Krueger
Dylan Hadfield-Menell
AAML
34
78
0
25 Jan 2024
Walking a Tightrope -- Evaluating Large Language Models in High-Risk Domains
Chia-Chien Hung
Wiem Ben-Rim
Lindsay Frost
Lars Bruckner
Carolin (Haas) Lawrence
AILaw
ALM
ELM
25
9
0
25 Nov 2023
A Survey of Graph Meets Large Language Model: Progress and Future Directions
Yuhan Li
Zhixun Li
Peisong Wang
Jia Li
Xiangguo Sun
Hongtao Cheng
Jeffrey Xu Yu
38
55
0
21 Nov 2023
Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations
Zilu Tang
Mayank Agarwal
Alex Shypula
Bailin Wang
Derry Wijaya
Jie Chen
Yoon Kim
LRM
37
15
0
13 Nov 2023
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
Samuel Marks
Max Tegmark
HILM
102
168
0
10 Oct 2023
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang
Hongye Jin
Ruixiang Tang
Xiaotian Han
Qizhang Feng
Haoming Jiang
Bing Yin
Xia Hu
LM&MA
131
619
0
26 Apr 2023