ResearchTrend.AI
Locating and Editing Factual Associations in GPT

10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
    KELM

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 1,056 papers shown
Title
Knowledge Conflicts for LLMs: A Survey
Rongwu Xu
Zehan Qi
Zhijiang Guo
Cunxiang Wang
Hongru Wang
Yue Zhang
Wei Xu
324
122
0
13 Mar 2024
The Missing Piece in Model Editing: A Deep Dive into the Hidden Damage Brought By Model Editing
Jianchen Wang
Zhouhong Gu
Xiaoxuan Zhu
Lin Zhang
Haoning Ye
Zhuozhi Xiong
Hongwei Feng
Yanghua Xiao
KELM
106
2
0
12 Mar 2024
pyvene: A Library for Understanding and Improving PyTorch Models via Interventions
Zhengxuan Wu
Atticus Geiger
Aryaman Arora
Jing-ling Huang
Zheng Wang
Noah D. Goodman
Christopher D. Manning
Christopher Potts
MU
114
32
0
12 Mar 2024
Beyond Memorization: The Challenge of Random Memory Access in Language Models
Tongyao Zhu
Qian Liu
Liang Pang
Zhengbao Jiang
Min-Yen Kan
Min Lin
KELM
93
6
0
12 Mar 2024
VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark
Han Huang
Haitian Zhong
Tao Yu
Qiang Liu
Shu Wu
Liang Wang
Tien-Ping Tan
VLM, KELM
61
11
0
12 Mar 2024
Rebuilding ROME : Resolving Model Collapse during Sequential Model Editing
Akshat Gupta
Sidharth Baskaran
Gopala Anumanchipalli
KELM
132
34
0
11 Mar 2024
The pitfalls of next-token prediction
Gregor Bachmann
Vaishnavh Nagarajan
117
81
0
11 Mar 2024
Transformers Learn Low Sensitivity Functions: Investigations and Implications
Bhavya Vasudeva
Deqing Fu
Tianyi Zhou
Elliott Kau
Youqi Huang
Vatsal Sharan
116
2
0
11 Mar 2024
Editing Conceptual Knowledge for Large Language Models
Xiaohan Wang
Shengyu Mao
Ningyu Zhang
Shumin Deng
Yunzhi Yao
Yue Shen
Lei Liang
Jinjie Gu
Huajun Chen
KELM
94
15
0
10 Mar 2024
MACE: Mass Concept Erasure in Diffusion Models
Shilin Lu
Zilan Wang
Leyang Li
Yanzhu Liu
A. Kong
DiffM
94
93
0
10 Mar 2024
Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines
Michael Toker
Hadas Orgad
Mor Ventura
Dana Arad
Yonatan Belinkov
DiffM
92
13
0
09 Mar 2024
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning
Nathaniel Li
Alexander Pan
Anjali Gopal
Summer Yue
Daniel Berrios
...
Yan Shoshitaishvili
Jimmy Ba
K. Esvelt
Alexandr Wang
Dan Hendrycks
ELM
139
196
0
05 Mar 2024
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation
Shiqi Chen
Miao Xiong
Junteng Liu
Zhengxuan Wu
Teng Xiao
Siyang Gao
Junxian He
HILM
140
26
0
03 Mar 2024
"Flex Tape Can't Fix That": Bias and Misinformation in Edited Language Models
Karina Halevy
Anna Sotnikova
Badr AlKhamissi
Syrielle Montariol
Antoine Bosselut
KELM
92
4
0
29 Feb 2024
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models
Frederik Kunstner
Robin Yadav
Alan Milligan
Mark Schmidt
Alberto Bietti
106
34
0
29 Feb 2024
Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models
Hongbang Yuan
Pengfei Cao
Zhuoran Jin
Yubo Chen
Daojian Zeng
Kang Liu
Jun Zhao
HILM
90
4
0
29 Feb 2024
How do Large Language Models Handle Multilingualism?
Yiran Zhao
Wenxuan Zhang
Guizhen Chen
Kenji Kawaguchi
Lidong Bing
LRM
108
81
0
29 Feb 2024
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning
Jiachun Li
Pengfei Cao
Chenhao Wang
Zhuoran Jin
Yubo Chen
Daojian Zeng
Kang Liu
Jun Zhao
LRM
108
10
0
28 Feb 2024
How to think step-by-step: A mechanistic understanding of chain-of-thought reasoning
Subhabrata Dutta
Joykirat Singh
Soumen Chakrabarti
Tanmoy Chakraborty
LRM
105
26
0
28 Feb 2024
Cutting Off the Head Ends the Conflict: A Mechanism for Interpreting and Mitigating Knowledge Conflicts in Language Models
Zhuoran Jin
Pengfei Cao
Hongbang Yuan
Yubo Chen
Jiexin Xu
Huaijun Li
Xiaojian Jiang
Kang Liu
Jun Zhao
262
48
0
28 Feb 2024
Editing Factual Knowledge and Explanatory Ability of Medical Large Language Models
Derong Xu
Ziheng Zhang
Zhihong Zhu
Zhenxi Lin
Qidong Liu
...
Wanyu Wang
Yuyang Ye
Xiangyu Zhao
Yefeng Zheng
Enhong Chen
KELM
86
10
0
28 Feb 2024
Twists, Humps, and Pebbles: Multilingual Speech Recognition Models Exhibit Gender Performance Gaps
Giuseppe Attanasio
Beatrice Savoldi
Dennis Fucci
Dirk Hovy
92
9
0
28 Feb 2024
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations
Jing-ling Huang
Zhengxuan Wu
Christopher Potts
Mor Geva
Atticus Geiger
135
35
0
27 Feb 2024
TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space
Shaolei Zhang
Tian Yu
Yang Feng
HILM, KELM
108
52
0
27 Feb 2024
Information Flow Routes: Automatically Interpreting Language Models at Scale
Javier Ferrando
Elena Voita
121
41
0
27 Feb 2024
LLM Inference Unveiled: Survey and Roofline Model Insights
Zhihang Yuan
Yuzhang Shang
Yang Zhou
Zhen Dong
Zhe Zhou
...
Yong Jae Lee
Yan Yan
Beidi Chen
Guangyu Sun
Kurt Keutzer
244
91
0
26 Feb 2024
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Sohee Yang
E. Gribovskaya
Nora Kassner
Mor Geva
Sebastian Riedel
ReLM, LRM
131
113
0
26 Feb 2024
How Large Language Models Encode Context Knowledge? A Layer-Wise Probing Study
Tianjie Ju
Weiwei Sun
Wei Du
Xinwei Yuan
Zhaochun Ren
Gongshen Liu
KELM
68
33
0
25 Feb 2024
Foot In The Door: Understanding Large Language Model Jailbreaking via Cognitive Psychology
Zhenhua Wang
Wei Xie
Baosheng Wang
Enze Wang
Zhiwen Gui
Shuoyoucheng Ma
Kai Chen
91
15
0
24 Feb 2024
How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries
Somnath Banerjee
Sayan Layek
Rima Hazra
Animesh Mukherjee
92
18
0
23 Feb 2024
PEMT: Multi-Task Correlation Guided Mixture-of-Experts Enables Parameter-Efficient Transfer Learning
Zhisheng Lin
Han Fu
Chenghao Liu
Zhuo Li
Jianling Sun
MoE, MoMe
67
6
0
23 Feb 2024
Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions
Clement Neo
Shay B. Cohen
Fazl Barez
78
5
0
23 Feb 2024
In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization
Ruiqi Zhang
Jingfeng Wu
Peter L. Bartlett
115
16
0
22 Feb 2024
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking
Nikhil Prakash
Tamar Rott Shaham
Tal Haklay
Yonatan Belinkov
David Bau
101
67
0
22 Feb 2024
Understanding and Patching Compositional Reasoning in LLMs
Zhaoyi Li
Gangwei Jiang
Hong Xie
Linqi Song
Defu Lian
Ying Wei
LRM
137
24
0
22 Feb 2024
Position: Explain to Question not to Justify
Przemysław Biecek
Wojciech Samek
137
17
0
21 Feb 2024
Cracking Factual Knowledge: A Comprehensive Analysis of Degenerate Knowledge Neurons in Large Language Models
Yuheng Chen
Pengfei Cao
Yubo Chen
Yining Wang
Shengping Liu
Kang Liu
Jun Zhao
KELM
102
1
0
21 Feb 2024
Knowledge Graph Enhanced Large Language Model Editing
Mengqi Zhang
Xiaotian Ye
Qiang Liu
Fajie Yuan
Shu Wu
Zhumin Chen
KELM
69
23
0
21 Feb 2024
RefuteBench: Evaluating Refuting Instruction-Following for Large Language Models
Jianhao Yan
Yun Luo
Yue Zhang
ALM, LRM
105
10
0
21 Feb 2024
Potential and Challenges of Model Editing for Social Debiasing
Jianhao Yan
Futing Wang
Yafu Li
Yue Zhang
KELM
129
9
0
21 Feb 2024
Event-level Knowledge Editing
Hao Peng
Xiaozhi Wang
Chunyang Li
Kaisheng Zeng
Jiangshan Duo
Yixin Cao
Lei Hou
Juanzi Li
KELM
95
7
0
20 Feb 2024
Stable Knowledge Editing in Large Language Models
Zihao Wei
Liang Pang
Hanxing Ding
Jingcheng Deng
Huawei Shen
Xueqi Cheng
KELM
119
10
0
20 Feb 2024
Backward Lens: Projecting Language Model Gradients into the Vocabulary Space
Shahar Katz
Yonatan Belinkov
Mor Geva
Lior Wolf
120
17
1
20 Feb 2024
CausalGym: Benchmarking causal interpretability methods on linguistic tasks
Aryaman Arora
Daniel Jurafsky
Christopher Potts
70
24
0
19 Feb 2024
Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
James Oldfield
Markos Georgopoulos
Grigorios G. Chrysos
Christos Tzelepis
Yannis Panagakis
M. Nicolaou
Jiankang Deng
Ioannis Patras
MoE
128
10
0
19 Feb 2024
Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT
Zhengfu He
Xuyang Ge
Qiong Tang
Tianxiang Sun
Qinyuan Cheng
Xipeng Qiu
96
22
0
19 Feb 2024
Transformer-based Causal Language Models Perform Clustering
Xinbo Wu
Lav Varshney
78
6
0
19 Feb 2024
A Mechanistic Analysis of a Transformer Trained on a Symbolic Multi-Step Reasoning Task
Jannik Brinkmann
Abhay Sheshadri
Victor Levoso
Paul Swoboda
Christian Bartelt
LRM
77
28
0
19 Feb 2024
Learning to Edit: Aligning LLMs with Knowledge Editing
Yuxin Jiang
Yufei Wang
Chuhan Wu
Wanjun Zhong
Xingshan Zeng
...
Xin Jiang
Lifeng Shang
Ruiming Tang
Qun Liu
Wei Wang
KELM
96
30
0
19 Feb 2024
Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models
Tianjie Ju
Yijin Chen
Xinwei Yuan
Zhuosheng Zhang
Wei Du
Yubin Zheng
Gongshen Liu
KELM
98
9
0
19 Feb 2024