2202.05262
Cited By
v1
v2
v3
v4
v5 (latest)
Locating and Editing Factual Associations in GPT
10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
KELM
Papers citing "Locating and Editing Factual Associations in GPT"
50 / 1,056 papers shown
Title
Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation
Xunjian Yin
Xu Zhang
Jie Ruan
Xiaojun Wan
ELM
116
24
0
18 Feb 2024
MIKE: A New Benchmark for Fine-grained Multimodal Entity Knowledge Editing
Jiaqi Li
Miaozeng Du
Chuanyi Zhang
Yongrui Chen
Nan Hu
Guilin Qi
Haiyun Jiang
Siyuan Cheng
Bo Tian
81
16
0
18 Feb 2024
InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration
Fali Wang
Runxue Bao
Suhang Wang
Wenchao Yu
Yanchi Liu
Wei Cheng
Haifeng Chen
KELM
77
13
0
18 Feb 2024
EVEDIT: Event-based Knowledge Editing with Deductive Editing Boundaries
Jiateng Liu
Pengfei Yu
Yuji Zhang
Sha Li
Zixuan Zhang
Heng Ji
KELM
84
17
0
17 Feb 2024
Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models
Zihao Lin
Mohammad Beigi
Hongxuan Li
Yufan Zhou
Yuxiang Zhang
Qifan Wang
Wenpeng Yin
Lifu Huang
KELM
70
9
0
16 Feb 2024
Model Editing by Standard Fine-Tuning
G. Gangadhar
Karl Stratos
KELM
112
11
0
16 Feb 2024
Towards Uncovering How Large Language Model Works: An Explainability Perspective
Haiyan Zhao
Fan Yang
Bo Shen
Himabindu Lakkaraju
Jundong Li
91
13
0
16 Feb 2024
Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning
Tuc Nguyen
Thai Le
MoMe
95
3
0
16 Feb 2024
Do Llamas Work in English? On the Latent Language of Multilingual Transformers
Chris Wendler
V. Veselovsky
Giovanni Monea
Robert West
153
133
0
16 Feb 2024
Where is the answer? Investigating Positional Bias in Language Model Knowledge Extraction
Kuniaki Saito
Kihyuk Sohn
Chen-Yu Lee
Yoshitaka Ushiku
152
3
0
16 Feb 2024
Representation Surgery: Theory and Practice of Affine Steering
Shashwat Singh
Shauli Ravfogel
Jonathan Herzig
Roee Aharoni
Ryan Cotterell
Ponnurangam Kumaraguru
LLMSV
77
16
0
15 Feb 2024
Long-form evaluation of model editing
Domenic Rosati
Robie Gonzales
Jinkun Chen
Xuemin Yu
Melis Erkan
Yahya Kayani
Satya Deepika Chavatapalli
Frank Rudzicz
Hassan Sajjad
KELM
68
15
0
14 Feb 2024
Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models
Goutham Rajendran
Simon Buchholz
Bryon Aragam
Bernhard Schölkopf
Pradeep Ravikumar
AI4CE
177
23
0
14 Feb 2024
Spectral Filters, Dark Signals, and Attention Sinks
Nicola Cancedda
115
18
0
14 Feb 2024
Rethinking Machine Unlearning for Large Language Models
Sijia Liu
Yuanshun Yao
Jinghan Jia
Stephen Casper
Nathalie Baracaldo
...
Hang Li
Kush R. Varshney
Mohit Bansal
Sanmi Koyejo
Yang Liu
AILaw
MU
191
120
0
13 Feb 2024
Knowledge Editing on Black-box Large Language Models
Xiaoshuai Song
Zhengyang Wang
Keqing He
Guanting Dong
Yutao Mou
Jinxu Zhao
Weiran Xu
KELM
78
4
0
13 Feb 2024
Suppressing Pink Elephants with Direct Principle Feedback
Louis Castricato
Nathan Lile
Suraj Anand
Hailey Schoelkopf
Siddharth Verma
Stella Biderman
106
12
0
12 Feb 2024
Summing Up the Facts: Additive Mechanisms Behind Factual Recall in LLMs
Bilal Chughtai
Alan Cooney
Neel Nanda
HILM
KELM
72
20
0
11 Feb 2024
Prompt Perturbation in Retrieval-Augmented Generation based Large Language Models
Zhibo Hu
Chen Wang
Yanfeng Shu
Helen Paik
Liming Zhu
SILM
RALM
77
10
0
11 Feb 2024
Discriminative Adversarial Unlearning
Rohan Sharma
Shijie Zhou
Kaiyi Ji
Changyou Chen
MU
76
1
0
10 Feb 2024
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
Reduan Achtibat
Sayed Mohammad Vakilzadeh Hatefi
Maximilian Dreyer
Aakriti Jain
Thomas Wiegand
Sebastian Lapuschkin
Wojciech Samek
100
37
0
08 Feb 2024
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Boyi Wei
Kaixuan Huang
Yangsibo Huang
Tinghao Xie
Xiangyu Qi
Mengzhou Xia
Prateek Mittal
Mengdi Wang
Peter Henderson
AAML
162
118
0
07 Feb 2024
MEMORYLLM: Towards Self-Updatable Large Language Models
Yu Wang
Yifan Gao
Xiusi Chen
Haoming Jiang
Shiyang Li
...
Zheng Li
Xian Li
Bing Yin
Jingbo Shang
Julian McAuley
KELM
108
19
0
07 Feb 2024
Exploring higher-order neural network node interactions with total correlation
Thomas Kerby
Teresa White
Kevin Moon
40
0
0
06 Feb 2024
Trillion Parameter AI Serving Infrastructure for Scientific Discovery: A Survey and Vision
Nathaniel Hudson
J. G. Pauloski
Matt Baughman
Alok V. Kamatar
Mansi Sakarvadia
...
Owen Price Skelly
Ben Blaiszik
Rick L. Stevens
Kyle Chard
Ian Foster
MedIm
88
8
0
05 Feb 2024
How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning
Zeping Yu
Sophia Ananiadou
102
12
0
05 Feb 2024
KS-Lottery: Finding Certified Lottery Tickets for Multilingual Language Models
Fei Yuan
Chang Ma
Shuai Yuan
Qiushi Sun
Lei Li
74
3
0
05 Feb 2024
Self-attention Networks Localize When QK-eigenspectrum Concentrates
Han Bao
Ryuichiro Hataya
Ryo Karakida
58
5
0
03 Feb 2024
What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement
Xisen Jin
Xiang Ren
KELM
CLL
93
7
0
02 Feb 2024
Building Guardrails for Large Language Models
Yizhen Dong
Ronghui Mu
Gao Jin
Yi Qi
Jinwei Hu
Xingyu Zhao
Jie Meng
Wenjie Ruan
Xiaowei Huang
OffRL
139
32
0
02 Feb 2024
The Queen of England is not England's Queen: On the Lack of Factual Coherency in PLMs
Paul Youssef
Jorg Schlotterer
Christin Seifert
KELM
119
2
0
02 Feb 2024
An introduction to graphical tensor notation for mechanistic interpretability
Jordan K. Taylor
67
3
0
02 Feb 2024
Desiderata for the Context Use of Question Answering Systems
Sagi Shaier
Lawrence E Hunter
Katharina von der Wense
125
5
0
31 Jan 2024
Neighboring Perturbations of Knowledge Editing on Large Language Models
Jun-Yu Ma
Zhen-Hua Ling
Ningyu Zhang
Jia-Chen Gu
KELM
82
6
0
31 Jan 2024
Propagation and Pitfalls: Reasoning-based Assessment of Knowledge Editing through Counterfactual Tasks
Wenyue Hua
Jiang Guo
Mingwen Dong
He Zhu
Patrick Ng
Zhiguo Wang
KELM
130
21
0
31 Jan 2024
SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering
Xiaopeng Li
Huijun Liu
Shangwen Wang
Bin Ji
Bing Ji
...
Jun Ma
Jie Yu
Xiaodong Liu
Jing Wang
Weimin Zhang
KELM
177
5
0
31 Jan 2024
Black-Box Access is Insufficient for Rigorous AI Audits
Stephen Casper
Carson Ezell
Charlotte Siegmann
Noam Kolt
Taylor Lynn Curtis
...
Michael Gerovitch
David Bau
Max Tegmark
David M. Krueger
Dylan Hadfield-Menell
AAML
159
96
0
25 Jan 2024
SLANG: New Concept Comprehension of Large Language Models
Lingrui Mei
Shenghua Liu
Yiwei Wang
Baolong Bi
Xueqi Chen
KELM
77
8
0
23 Jan 2024
Sowing the Wind, Reaping the Whirlwind: The Impact of Editing Language Models
Rima Hazra
Sayan Layek
Somnath Banerjee
Soujanya Poria
KELM
108
20
0
19 Jan 2024
DeepEdit: Knowledge Editing as Decoding with Constraints
Yiwei Wang
Muhao Chen
Nanyun Peng
Kai-Wei Chang
KELM
99
28
0
19 Jan 2024
AI-as-exploration: Navigating intelligence space
Dimitri Coelho Mollo
88
1
0
15 Jan 2024
See the Unseen: Better Context-Consistent Knowledge-Editing by Noises
Youcheng Huang
Wenqiang Lei
Zheng Zhang
Jiancheng Lv
Shuicheng Yan
KELM
83
6
0
15 Jan 2024
Editing Arbitrary Propositions in LLMs without Subject Labels
Itai Feigenbaum
Devansh Arpit
Huan Wang
Shelby Heinecke
Juan Carlos Niebles
Weiran Yao
Caiming Xiong
Silvio Savarese
KELM
62
2
0
15 Jan 2024
Model Editing at Scale leads to Gradual and Catastrophic Forgetting
Akshat Gupta
Anurag Rao
Gopala Anumanchipalli
KELM
CLL
82
55
0
15 Jan 2024
TOFU: A Task of Fictitious Unlearning for LLMs
Pratyush Maini
Zhili Feng
Avi Schwarzschild
Zachary Chase Lipton
J. Zico Kolter
MU
CLL
148
193
0
11 Jan 2024
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
Asma Ghandeharioun
Avi Caciularu
Adam Pearce
Lucas Dixon
Mor Geva
146
114
0
11 Jan 2024
Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue
Jia-Chen Gu
Haoyang Xu
Jun-Yu Ma
Pan Lu
Zhen-Hua Ling
Kai-Wei Chang
Nanyun Peng
KELM
121
55
0
09 Jan 2024
MPN: Leveraging Multilingual Patch Neuron for Cross-lingual Model Editing
Nianwen Si
Hao Zhang
Weiqiang Zhang
KELM
70
8
0
06 Jan 2024
Large Language Models for Social Networks: Applications, Challenges, and Solutions
Jingying Zeng
Richard Huang
Waleed Malik
Langxuan Yin
Bojan Babic
Danny Shacham
Xiao Yan
Jaewon Yang
Qi He
72
9
0
04 Jan 2024
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
Andrew Lee
Xiaoyan Bai
Itamar Pres
Martin Wattenberg
Jonathan K. Kummerfeld
Rada Mihalcea
150
121
0
03 Jan 2024