2202.05262
Locating and Editing Factual Associations in GPT
10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
KELM
Papers citing
"Locating and Editing Factual Associations in GPT"
50 / 1,056 papers shown
Title
Tug-of-war between idiom's figurative and literal meanings in LLMs
Soyoung Oh
Xinting Huang
Mathis Pink
Michael Hahn
Vera Demberg
66
0
0
02 Jun 2025
CODEMENV: Benchmarking Large Language Models on Code Migration
Keyuan Cheng
Xudong Shen
Yihao Yang
Tengyue Wang
Yang Cao
Muhammad Asif Ali
Hanbin Wang
Lijie Hu
Di Wang
47
3
0
01 Jun 2025
COMPKE: Complex Question Answering under Knowledge Editing
Keyuan Cheng
Zijian Kan
Zhixian He
Zhuoran Zhang
Muhammad Asif Ali
Ke Xu
Lijie Hu
Di Wang
KELM
39
3
0
01 Jun 2025
One for All: Update Parameterized Knowledge Across Multiple Models
Weitao Ma
Xiyuan Du
Xiaocheng Feng
L. Huang
Yichong Huang
...
Xiaoliang Yang
Baohang Li
Xiachong Feng
Ting Liu
Bing Qin
KELM
65
0
0
01 Jun 2025
Spectral Insights into Data-Oblivious Critical Layers in Large Language Models
Xuyuan Liu
Lei Hsiung
Yaoqing Yang
Yujun Yan
AAML
59
0
0
31 May 2025
Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis
Junzhuo Li
Bo Wang
Xiuze Zhou
Peijie Jiang
Jia Liu
Xuming Hu
MoE
67
0
0
30 May 2025
Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration
Qinglin Zhu
Runcong Zhao
Hanqi Yan
Yulan He
Yudong Chen
Lin Gui
LRM
40
0
0
30 May 2025
Drop Dropout on Single-Epoch Language Model Pretraining
Houjun Liu
John Bauer
Christopher D. Manning
LRM
40
0
0
30 May 2025
Circuit Stability Characterizes Language Model Generalization
Alan Sun
LRM
35
0
0
30 May 2025
Mamba Knockout for Unraveling Factual Information Flow
Nir Endy
Idan Daniel Grosbard
Yuval Ran-Milo
Yonatan Slutzky
Itay Tshuva
Raja Giryes
36
0
0
30 May 2025
Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs
Haokun Chen
Y. Zhang
Yuan Bi
Yao Zhang
Tong Liu
...
Jindong Gu
Claudia Grosser
Denis Krompass
Nassir Navab
Volker Tresp
MU
63
2
0
29 May 2025
ScEdit: Script-based Assessment of Knowledge Editing
Xinye Li
Zunwen Zheng
Qian Zhang
Dekai Zhuang
Jiabao Kang
...
Qingbin Liu
Xi Chen
Zhiying Tu
Dianhui Chu
Dianbo Sui
KELM
73
1
0
29 May 2025
From Parameters to Prompts: Understanding and Mitigating the Factuality Gap between Fine-Tuned LLMs
Xuan Gong
Hanbo Huang
Shiyu Liang
48
0
0
29 May 2025
The End Of Universal Lifelong Identifiers: Identity Systems For The AI Era
Shriphani Palakodety
38
0
0
29 May 2025
Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation
Ziling Cheng
Meng Cao
Leila Pishdad
Yanshuai Cao
Jackie Chi Kit Cheung
LRM
105
1
0
29 May 2025
Understanding Refusal in Language Models with Sparse Autoencoders
Wei Jie Yeo
Nirmalendu Prakash
Clement Neo
Roy Ka-wei Lee
Erik Cambria
Ranjan Satapathy
18
0
0
29 May 2025
Sentinel: Attention Probing of Proxy Models for LLM Context Compression with an Understanding Perspective
Yong Zhang
Yanwen Huang
Ning Cheng
Yang Guo
Yun Zhu
Yanmeng Wang
Shaojun Wang
Jing Xiao
RALM
48
0
0
29 May 2025
MemOS: An Operating System for Memory-Augmented Generation (MAG) in Large Language Models
Zhiyu Li
Shichao Song
Hanyu Wang
Simin Niu
Ding Chen
...
Qingchen Yu
Bo Tang
Hongkang Yang
Zhi-hai Xu
Feiyu Xiong
RALM
43
0
0
28 May 2025
InComeS: Integrating Compression and Selection Mechanisms into LLMs for Efficient Model Editing
Shuaiyi Li
Zhisong Zhang
Yang Deng
Chenlong Deng
Tianqing Fang
Hongming Zhang
Haitao Mi
Dong Yu
Wai Lam
KELM
63
0
0
28 May 2025
Adaptive Detoxification: Safeguarding General Capabilities of LLMs through Toxicity-Aware Knowledge Editing
Yifan Lu
Jing Li
Yigeng Zhou
Yihui Zhang
Wenya Wang
Xiucheng Li
Meishan Zhang
Fangming Liu
Jun-chen Yu
Min Zhang
KELM
CLL
61
1
0
28 May 2025
Precise In-Parameter Concept Erasure in Large Language Models
Yoav Gur-Arieh
Clara Suslik
Yihuai Hong
Fazl Barez
Mor Geva
KELM
MU
105
0
0
28 May 2025
Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs
Ziling Cheng
Meng Cao
Marc-Antoine Rondeau
Jackie Chi Kit Cheung
LRM
80
1
0
28 May 2025
Factual Self-Awareness in Language Models: Representation, Robustness, and Scaling
Hovhannes Tamoyan
Subhabrata Dutta
Iryna Gurevych
HILM
KELM
62
0
0
27 May 2025
Tracing and Reversing Rank-One Model Edits
Paul Youssef
Zhixue Zhao
C. Seifert
Jorg Schlotterer
KELM
27
0
0
27 May 2025
Pretrained LLMs Learn Multiple Types of Uncertainty
Roi Cohen
Omri Fahn
Gerard de Melo
43
0
0
27 May 2025
How Do Transformers Learn Variable Binding in Symbolic Programs?
Yiwei Wu
Atticus Geiger
Raphaël Millière
NAI
41
1
0
27 May 2025
SAEs Are Good for Steering -- If You Select the Right Features
Dana Arad
Aaron Mueller
Yonatan Belinkov
LLMSV
70
0
0
26 May 2025
CaseEdit: Enhancing Localized Commonsense Reasoning via Null-Space Constrained Knowledge Editing in Small Parameter Language Models
Varun Reddy
Yen-Ling Kuo
KELM
51
0
0
26 May 2025
Editing as Unlearning: Are Knowledge Editing Methods Strong Baselines for Large Language Model Unlearning?
Zexi Li
Xiangzhu Wang
William F. Shen
Meghdad Kurmanji
Xinchi Qiu
Dongqi Cai
Chao Wu
Nicholas D. Lane
KELM
MU
60
0
0
26 May 2025
DocMEdit: Towards Document-Level Model Editing
Li Zeng
Zeming Liu
Chong Feng
Heyan Huang
Yuhang Guo
KELM
53
0
0
26 May 2025
The Coverage Principle: A Framework for Understanding Compositional Generalization
Hoyeon Chang
Jinho Park
Hanseul Cho
Sohee Yang
Miyoung Ko
Hyeonbin Hwang
Seungpil Won
Dohaeng Lee
Youbin Ahn
Minjoon Seo
65
0
0
26 May 2025
Paths Not Taken: Understanding and Mending the Multilingual Factual Recall Pipeline
Meng Lu
Ruochen Zhang
Carsten Eickhoff
Ellie Pavlick
HILM
KELM
LRM
84
0
0
26 May 2025
Deriving Strategic Market Insights with Large Language Models: A Benchmark for Forward Counterfactual Generation
Keane Ong
Rui Mao
Deeksha Varshney
Paul Pu Liang
Erik Cambria
G. Mengaldo
AIFin
OffRL
23
0
0
26 May 2025
SCAR: Shapley Credit Assignment for More Efficient RLHF
Meng Cao
Shuyuan Zhang
Xiao-Wen Chang
Doina Precup
121
0
0
26 May 2025
Regularized Personalization of Text-to-Image Diffusion Models without Distributional Drift
Gihoon Kim
Hyungjin Park
Taesup Kim
DiffM
VLM
199
0
0
26 May 2025
A Graph Perspective to Probe Structural Patterns of Knowledge in Large Language Models
Utkarsh Sahu
Zhisheng Qi
Y. Lei
Ryan Rossi
Franck Dernoncourt
Nesreen K. Ahmed
M. Halappanavar
Yao Ma
Yu Wang
75
0
0
25 May 2025
Concept Reachability in Diffusion Models: Beyond Dataset Constraints
Marta Aparicio Rodriguez
Xenia Miscouridou
Anastasia Borovykh
49
0
0
25 May 2025
REACT: Representation Extraction And Controllable Tuning to Overcome Overfitting in LLM Knowledge Editing
Haitian Zhong
Yuhuan Liu
Ziyang Xu
Guofan Liu
Qiang Liu
Shu Wu
Zhe Zhao
Liang Wang
Tieniu Tan
KELM
48
0
0
25 May 2025
Benchmarking and Rethinking Knowledge Editing for Large Language Models
Guoxiu He
Xin Song
Futing Wang
Aixin Sun
KELM
51
0
0
24 May 2025
Multi-Scale Manifold Alignment: A Unified Framework for Enhanced Explainability of Large Language Models
Yukun Zhang
Qi Dong
36
0
0
24 May 2025
Disentangling Knowledge Representations for Large Language Model Editing
Mengqi Zhang
Zisheng Zhou
Xiaotian Ye
Qiang Liu
Zhaochun Ren
Zhumin Chen
Fajie Yuan
KELM
39
1
0
24 May 2025
Why Do Some Inputs Break Low-Bit LLM Quantization?
Ting-Yun Chang
Muru Zhang
Jesse Thomason
Robin Jia
MQ
34
0
0
24 May 2025
Does Representation Intervention Really Identify Desired Concepts and Elicit Alignment?
Hongzheng Yang
Yongqiang Chen
Zeyu Qin
Tongliang Liu
Chaowei Xiao
Kun Zhang
Bo Han
LLMSV
44
0
0
24 May 2025
GIM: Improved Interpretability for Large Language Models
Joakim Edin
Róbert Csordás
Tuukka Ruotsalo
Zhengxuan Wu
Maria Maistro
Jing-ling Huang
Lars Maaløe
126
0
0
23 May 2025
Conversations: Love Them, Hate Them, Steer Them
Niranjan Chebrolu
Gerard Christopher Yeo
Kokil Jaidka
30
0
0
23 May 2025
Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms
Mengru Wang
Ziwen Xu
Shengyu Mao
Shumin Deng
Zhaopeng Tu
Ningyu Zhang
N. Zhang
LLMSV
135
0
0
23 May 2025
Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models
Patrick Leask
Neel Nanda
Noura Al Moubayed
91
1
0
23 May 2025
Reverse-Speech-Finder: A Neural Network Backtracking Architecture for Generating Alzheimer's Disease Speech Samples and Improving Diagnosis Performance
Victor O.K. Li
Yang Han
Jacqueline C. K. Lam
Lawrence Y. L. Cheung
198
0
0
23 May 2025
Multi-Scale Probabilistic Generation Theory: A Hierarchical Framework for Interpreting Large Language Models
Yukin Zhang
Qi Dong
117
0
0
23 May 2025
Model Editing with Graph-Based External Memory
Yash Kumar Atri
Ahmed Alaa
Thomas Hartvigsen
KELM
41
0
0
23 May 2025