2202.05262
Cited By
v5 (latest)
Locating and Editing Factual Associations in GPT
10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
KELM
Papers citing "Locating and Editing Factual Associations in GPT"
50 / 1,056 papers shown
Title
Large Language Models as Psychological Simulators: A Methodological Guide
Zhicheng Lin
LLMAG
40
1
0
20 Jun 2025
From Concepts to Components: Concept-Agnostic Attention Module Discovery in Transformers
Jingtong Su
Julia Kempe
Karen Ullrich
21
0
0
20 Jun 2025
Latent Concept Disentanglement in Transformer-based Language Models
Guan Zhe Hong
Bhavya Vasudeva
Vatsal Sharan
Cyrus Rashtchian
Prabhakar Raghavan
Rina Panigrahy
ReLM
LRM
35
0
0
20 Jun 2025
Under the Shadow of Babel: How Language Shapes Reasoning in LLMs
Chenxi Wang
Y. Zhang
Lang Gao
Zixiang Xu
Zirui Song
Yanbo Wang
Xiuying Chen
21
0
0
19 Jun 2025
Mr. Snuffleupagus at SemEval-2025 Task 4: Unlearning Factual Knowledge from LLMs Using Adaptive RMU
Arjun Dosajh
Mihika Sanghi
MU
24
0
0
19 Jun 2025
Can structural correspondences ground real world representational content in Large Language Models?
Iwan Williams
29
0
0
19 Jun 2025
Visual symbolic mechanisms: Emergent symbol processing in vision language models
Rim Assouel
Declan Campbell
Taylor Webb
15
0
0
18 Jun 2025
The Compositional Architecture of Regret in Large Language Models
Xiangxiang Cui
Shu Yang
Tianjin Huang
Wanyu Lin
Lijie Hu
Di Wang
33
0
0
18 Jun 2025
Learning-Time Encoding Shapes Unlearning in LLMs
Ruihan Wu
Konstantin Garov
Kamalika Chaudhuri
MU
31
0
0
18 Jun 2025
Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models
Chenchen Yuan
Zheyu Zhang
Shuo Yang
Bardh Prenkaj
Gjergji Kasneci
43
0
0
17 Jun 2025
DualEdit: Dual Editing for Knowledge Updating in Vision-Language Models
Zhiyi Shi
Binjie Wang
Chongjie Si
Yichen Wu
Junsik Kim
Hanspeter Pfister
KELM
VLM
34
0
0
16 Jun 2025
Position: Pause Recycling LoRAs and Prioritize Mechanisms to Uncover Limits and Effectiveness
Mei-Yen Chen
Thi Thu Uyen Hoang
Michael Hahn
M. Sarfraz
MoMe
33
0
0
16 Jun 2025
Mitigating Safety Fallback in Editing-based Backdoor Injection on LLMs
Houcheng Jiang
Zetong Zhao
Junfeng Fang
Haokai Ma
Ruipeng Wang
Yang Deng
Xiang Wang
Xiangnan He
KELM
AAML
35
0
0
16 Jun 2025
TrojanTO: Action-Level Backdoor Attacks against Trajectory Optimization Models
Yang Dai
Oubo Ma
Longfei Zhang
Xingxing Liang
Xiaochun Cao
Shouling Ji
J. Zhang
Jincai Huang
Li Shen
39
0
0
15 Jun 2025
Model Merging for Knowledge Editing
Zichuan Fu
Xian Wu
Guojing Li
Yingying Zhang
Yefeng Zheng
Tianshi Ming
Y. X. R. Wang
Wanyu Wang
Xiangyu Zhao
KELM
MoMe
CLL
32
0
0
14 Jun 2025
Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback
Dongwei Jiang
Alvin Zhang
Andrew Wang
Nicholas Andrews
Daniel Khashabi
LRM
31
0
0
13 Jun 2025
Mitigating Negative Interference in Multilingual Sequential Knowledge Editing through Null-Space Constraints
Wei Sun
Tingyu Qu
Mingxiao Li
Jesse Davis
Marie-Francine Moens
KELM
130
0
0
12 Jun 2025
Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization
Or Shafran
Atticus Geiger
Mor Geva
MILM
109
0
0
12 Jun 2025
Self-Adapting Language Models
Adam Zweiger
Jyothish Pari
Han Guo
Ekin Akyürek
Yoon Kim
Pulkit Agrawal
KELM
LRM
155
0
0
12 Jun 2025
Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
Wuwei Zhang
Fangcong Yin
Howard Yen
Danqi Chen
Xi Ye
LRM
86
0
0
11 Jun 2025
Is Fine-Tuning an Effective Solution? Reassessing Knowledge Editing for Unstructured Data
Hao Xiong
Chuanyuan Tan
Wenliang Chen
KELM
54
0
0
11 Jun 2025
Know-MRI: A Knowledge Mechanisms Revealer&Interpreter for Large Language Models
Jiaxiang Liu
Boxuan Xing
Chenhao Yuan
Chenxiang Zhang
Di Wu
...
Haida Yu
Chuhan Lang
Pengfei Cao
Jun Zhao
Kang Liu
25
0
0
10 Jun 2025
PropMEND: Hypernetworks for Knowledge Propagation in LLMs
Zeyu Leo Liu
Greg Durrett
Eunsol Choi
KELM
39
0
0
10 Jun 2025
Did I Faithfully Say What I Thought? Bridging the Gap Between Neural Activity and Self-Explanations in Large Language Models
Milan Bhan
Jean-Noel Vittaut
Nicolas Chesneau
Sarath Chandar
Marie-Jeanne Lesot
LRM
35
0
0
10 Jun 2025
Private Memorization Editing: Turning Memorization into a Defense to Strengthen Data Privacy in Large Language Models
Elena Sofia Ruzzetti
Giancarlo A. Xompero
Davide Venditti
Fabio Massimo Zanzotto
KELM
PILM
58
0
0
09 Jun 2025
LLM Unlearning Should Be Form-Independent
Xiaotian Ye
Mengqi Zhang
Shu Wu
MU
29
0
0
09 Jun 2025
Beyond Benchmarks: A Novel Framework for Domain-Specific LLM Evaluation and Knowledge Mapping
Nitin Sharma
Thomas Wolfers
Çağatay Yıldız
ALM
32
0
0
09 Jun 2025
Learning Distribution-Wise Control in Representation Space for Language Models
Chunyuan Deng
Ruidi Chang
Hanjie Chen
24
0
0
07 Jun 2025
On the Adaptive Psychological Persuasion of Large Language Models
Tianjie Ju
Yujia Chen
Hao Fei
Mong Li Lee
Wynne Hsu
Pengzhou Cheng
Zongru Wu
Zhuosheng Zhang
Gongshen Liu
23
0
0
07 Jun 2025
Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness
Rongzhe Wei
Peizhi Niu
Hans Hao-Hsun Hsu
Ruihan Wu
Haoteng Yin
...
Vamsi K. Potluru
Eli Chien
Kamalika Chaudhuri
Olgica Milenković
P. Li
MU
KELM
71
0
0
06 Jun 2025
What Is Seen Cannot Be Unseen: The Disruptive Effect of Knowledge Conflict on Large Language Models
Kaiser Sun
Fan Bai
Mark Dredze
21
0
0
06 Jun 2025
Dissecting Bias in LLMs: A Mechanistic Interpretability Perspective
Bhavik Chandna
Zubair Bashir
Procheta Sen
100
0
0
05 Jun 2025
AudioLens: A Closer Look at Auditory Attribute Perception of Large Audio-Language Models
Chih-Kai Yang
Neo Ho
Yi-Jyun Lee
Hung-yi Lee
AuLLM
111
0
0
05 Jun 2025
Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety
Seongmin Lee
Aeree Cho
Grace C. Kim
ShengYun Peng
Mansi Phute
Duen Horng Chau
LM&MA
AI4CE
84
0
0
05 Jun 2025
LLMs Can Compensate for Deficiencies in Visual Representations
Sho Takishita
Jay Gala
Abdelrahman Mohamed
Kentaro Inui
Yova Kementchedjhieva
VLM
57
0
0
05 Jun 2025
MobiEdit: Resource-efficient Knowledge Editing for Personalized On-device LLMs
Zhenyan Lu
Daliang Xu
Dongqi Cai
Zexi Li
Wei Liu
Fangming Liu
Shangguang Wang
Mengwei Xu
KELM
22
0
0
05 Jun 2025
RedDebate: Safer Responses through Multi-Agent Red Teaming Debates
Ali Asad
Stephen Obadinma
Radin Shayanfar
Xiaodan Zhu
AAML
LLMAG
29
0
0
04 Jun 2025
AD-EE: Early Exiting for Fast and Reliable Vision-Language Models in Autonomous Driving
Lianming Huang
Haibo Hu
Yufei Cui
Jiacheng Zuo
Shangyu Wu
Nan Guan
Chun Jason Xue
VLM
25
0
0
04 Jun 2025
Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis
Kejian Zhu
Shangqing Tu
Zhuoran Jin
Lei Hou
Juanzi Li
Jun Zhao
KELM
92
0
0
04 Jun 2025
Misalignment or misuse? The AGI alignment tradeoff
Max Hellrigel-Holderbaum
Leonard Dung
81
0
0
04 Jun 2025
Beyond Memorization: A Rigorous Evaluation Framework for Medical Knowledge Editing
Shigeng Chen
Linhao Luo
Zhangchi Qiu
Yanan Cao
Carl Yang
Shirui Pan
KELM
113
0
0
04 Jun 2025
Efficient Knowledge Editing via Minimal Precomputation
Akshat Gupta
Maochuan Lu
Thomas Hartvigsen
Gopala Anumanchipalli
KELM
84
0
0
04 Jun 2025
Bridging Neural ODE and ResNet: A Formal Error Bound for Safety Verification
Abdelrahman Sayed Sayed
Pierre-Jean Meyer
Mohamed Ghazel
33
0
0
03 Jun 2025
On Entity Identification in Language Models
Masaki Sakata
Sho Yokoi
Benjamin Heinzerling
Takumi Ito
Kentaro Inui
92
0
0
03 Jun 2025
Shaking to Reveal: Perturbation-Based Detection of LLM Hallucinations
Jinyuan Luo
Zhen Fang
Yixuan Li
Seongheon Park
Ling Chen
AAML
HILM
67
0
0
03 Jun 2025
Expanding before Inferring: Enhancing Factuality in Large Language Models through Premature Layers Interpolation
Dingwei Chen
Ziqiang Liu
Feiteng Fang
Chak Tou Leong
Shiwen Ni
A. Argha
Hamid Alinejad-Rokny
Min Yang
Chengming Li
KELM
HILM
61
0
0
03 Jun 2025
Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning
Changsheng Wang
Yihua Zhang
Jinghan Jia
Parikshit Ram
Dennis L. Wei
Yuguang Yao
Soumyadeep Pal
Nathalie Baracaldo
Sijia Liu
MU
84
0
0
02 Jun 2025
Detoxification of Large Language Models through Output-layer Fusion with a Calibration Model
Yuanhe Tian
Mingjie Deng
Guoqing Jin
Yan Song
MU
KELM
63
0
0
02 Jun 2025
ThinkEval: Practical Evaluation of Knowledge Preservation and Consistency in LLM Editing with Thought-based Knowledge Graphs
Manit Baser
D. Divakaran
M. Gurusamy
KELM
85
0
0
02 Jun 2025
Tug-of-war between idiom's figurative and literal meanings in LLMs
Soyoung Oh
Xinting Huang
Mathis Pink
Michael Hahn
Vera Demberg
66
0
0
02 Jun 2025