Locating and Editing Factual Associations in GPT

10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
    KELM

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 1,056 papers shown
Identifying and Manipulating Personality Traits in LLMs Through Activation Engineering
Rumi A. Allbert
James K. Wiles
Vlad Grankovsky
LLMSV, AI4CE
164
1
0
10 Dec 2024
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice
A. Feder Cooper
Christopher A. Choquette-Choo
Miranda Bogen
Matthew Jagielski
Katja Filippova
...
Abigail Z. Jacobs
Andreas Terzis
Hanna M. Wallach
Nicolas Papernot
Katherine Lee
AILaw, MU
184
20
0
09 Dec 2024
Implicit Priors Editing in Stable Diffusion via Targeted Token Adjustment
Feng He
Chao Zhang
Zhixue Zhao
183
0
0
04 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
Shijie Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
170
22
0
03 Dec 2024
Detecting Memorization in Large Language Models
Eduardo Slonski
90
0
0
02 Dec 2024
Think-to-Talk or Talk-to-Think? When LLMs Come Up with an Answer in Multi-Step Arithmetic Reasoning
Keito Kudo
Yoichi Aoki
Tatsuki Kuribayashi
Shusaku Sone
Masaya Taniguchi
Ana Brassard
Keisuke Sakaguchi
Kentaro Inui
ReLM, LRM
144
1
0
02 Dec 2024
DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models
Shwetha Ram
T. Neiman
Qianli Feng
Andrew Stuart
S. D. Tran
Trishul Chilimbi
141
2
0
28 Nov 2024
Neutralizing Backdoors through Information Conflicts for Large Language Models
Chen Chen
Yuchen Sun
Xueluan Gong
Jiaxin Gao
K. Lam
KELM, AAML
167
0
0
27 Nov 2024
One Mind, Many Tongues: A Deep Dive into Language-Agnostic Knowledge Neurons in Large Language Models
Pengfei Cao
Yuheng Chen
Zhuoran Jin
Yubo Chen
Kang Liu
Jun Zhao
KELM
122
0
0
26 Nov 2024
The Two-Hop Curse: LLMs trained on A→B, B→C fail to learn A→C
Mikita Balesni
Tomek Korbak
Owain Evans
ReLM, LRM
145
0
0
25 Nov 2024
Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?
Sohee Yang
Nora Kassner
E. Gribovskaya
Sebastian Riedel
Mor Geva
LRM, KELM, ReLM
190
9
0
25 Nov 2024
Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts
Qizhou Chen
Chengyu Wang
Dakan Wang
Taolin Zhang
Wangyue Li
Xiaofeng He
KELM
161
1
0
23 Nov 2024
Towards Knowledge Checking in Retrieval-augmented Generation: A Representation Perspective
Shenglai Zeng
Jiankun Zhang
Bingheng Li
Yuping Lin
Tianqi Zheng
...
Hui Liu
Hui Liu
Yue Xing
Monica Xiao Cheng
Jiliang Tang
RALM
131
5
0
21 Nov 2024
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Javier Ferrando
Oscar Obeso
Senthooran Rajamanoharan
Neel Nanda
187
33
0
21 Nov 2024
Visual-Oriented Fine-Grained Knowledge Editing for MultiModal Large Language Models
Zhen Zeng
Leijiang Gu
Xun Yang
Zhangling Duan
Zenglin Shi
Meng Wang
KELM
134
2
0
19 Nov 2024
Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering
Zeping Yu
Sophia Ananiadou
495
2
0
17 Nov 2024
AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment
Y. Fu
Zhongzhi Yu
Junwei Li
Jiayi Qian
Yongan Zhang
Xiangchi Yuan
Dachuan Shi
Roman Yakunin
Y. Lin
98
4
0
15 Nov 2024
Probing LLM Hallucination from Within: Perturbation-Driven Approach via Internal Knowledge
Seongmin Lee
Hsiang Hsu
Chun-Fu Chen
Duen Horng Chau
LRM
101
2
0
14 Nov 2024
Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions
Moran Yanuka
Assaf Ben-Kish
Yonatan Bitton
Idan Szpektor
Raja Giryes
VLM
163
3
0
13 Nov 2024
Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks
Madeline Brumley
Joe Kwon
David M. Krueger
Dmitrii Krasheninnikov
Usman Anwar
LLMSV
98
9
0
11 Nov 2024
Model Editing for LLMs4Code: How Far are We?
Xiaopeng Li
Shasha Li
Huijun Liu
Jun Ma
Jie Yu
Xiaodong Liu
Jing Wang
Shezheng Song
Weimin Zhang
KELM
104
4
0
11 Nov 2024
Controllable Context Sensitivity and the Knob Behind It
Julian Minder
Kevin Du
Niklas Stoehr
Giovanni Monea
Chris Wendler
Robert West
Ryan Cotterell
KELM
154
6
0
11 Nov 2024
Gumbel Counterfactual Generation From Language Models
Shauli Ravfogel
Anej Svete
Vésteinn Snæbjarnarson
Ryan Cotterell
LRM, CML
108
5
0
11 Nov 2024
Continual Memorization of Factoids in Language Models
Howard Chen
Jiayi Geng
Adithya Bhaskar
Dan Friedman
Danqi Chen
KELM
132
1
0
11 Nov 2024
Gradient Localization Improves Lifelong Pretraining of Language Models
Jared Fernandez
Yonatan Bisk
Emma Strubell
KELM
99
2
0
07 Nov 2024
Unlearning in- vs. out-of-distribution data in LLMs under gradient-based method
Teodora Baluta
Pascal Lamblin
Daniel Tarlow
Fabian Pedregosa
Gintare Karolina Dziugaite
MU
73
2
0
07 Nov 2024
A Implies B: Circuit Analysis in LLMs for Propositional Logical Reasoning
Guan Zhe Hong
Nishanth Dikkala
Enming Luo
Cyrus Rashtchian
Xin Wang
Rina Panigrahy
OffRL, LRM, NAI
96
0
0
06 Nov 2024
Extracting Unlearned Information from LLMs with Activation Steering
Atakan Seyitoğlu
A. Kuvshinov
Leo Schwinn
Stephan Günnemann
MU, LLMSV
100
8
0
04 Nov 2024
Learning Where to Edit Vision Transformers
Yunqiao Yang
Long-Kai Huang
Shengzhuang Chen
Kede Ma
Ying Wei
KELM
96
1
0
04 Nov 2024
Enhancing Multiple Dimensions of Trustworthiness in LLMs via Sparse Activation Control
Yuxin Xiao
Chaoqun Wan
Yonggang Zhang
Wenxiao Wang
Binbin Lin
Xiaofei He
Xu Shen
Jieping Ye
53
0
0
04 Nov 2024
The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units
Badr AlKhamissi
Greta Tuckute
Antoine Bosselut
Martin Schrimpf
MILM
121
12
0
04 Nov 2024
SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation
Dennis Fucci
Marco Gaido
Beatrice Savoldi
Matteo Negri
Mauro Cettolo
L. Bentivogli
276
3
0
03 Nov 2024
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models
Aashiq Muhamed
Mona Diab
Virginia Smith
107
3
0
01 Nov 2024
Commonsense Knowledge Editing Based on Free-Text in LLMs
Xiusheng Huang
Yequan Wang
Jun Zhao
Kang Liu
KELM
79
7
0
31 Oct 2024
Reasons and Solutions for the Decline in Model Performance after Editing
Xiusheng Huang
Jiaxiang Liu
Yequan Wang
Kang Liu
KELM
103
7
0
31 Oct 2024
Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models
Rishabh Adiga
Besmira Nushi
Varun Chandrasekaran
97
1
0
29 Oct 2024
Learning and Unlearning of Fabricated Knowledge in Language Models
Chen Sun
Nolan Miller
A. Zhmoginov
Max Vladymyrov
Mark Sandler
KELM, MU
60
1
0
29 Oct 2024
Survey of User Interface Design and Interaction Techniques in Generative AI Applications
Reuben Luera
Ryan Rossi
Alexa F. Siu
Franck Dernoncourt
Tong Yu
...
Hanieh Salehy
Jian Zhao
Samyadeep Basu
Puneet Mathur
Nedim Lipka
AI4TS
139
1
0
28 Oct 2024
Causal Interventions on Causal Paths: Mapping GPT-2's Reasoning From Syntax to Semantics
Isabelle Lee
Joshua Lum
Ziyi Liu
Dani Yogatama
LRM
66
0
0
28 Oct 2024
NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates
Hexuan Deng
Wenxiang Jiao
Xuebo Liu
Min Zhang
Zhaopeng Tu
106
4
0
28 Oct 2024
Applying sparse autoencoders to unlearn knowledge in language models
Eoin Farrell
Yeu-Tong Lau
Arthur Conmy
MU
109
24
0
25 Oct 2024
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations
Aryo Pradipta Gema
Chen Jin
Ahmed Abdulaal
Tom Diethe
Philip Teare
Beatrice Alex
Pasquale Minervini
Amrutha Saseendran
102
6
0
24 Oct 2024
Delving into the Reversal Curse: How Far Can Large Language Models Generalize?
Zhengkai Lin
Z. Fu
Kai Liu
Liang Xie
Binbin Lin
Wenxiao Wang
D. Cai
Yue Wu
Jieping Ye
LRM
128
3
0
24 Oct 2024
On Explaining with Attention Matrices
Omar Naim
Nicholas Asher
74
1
0
24 Oct 2024
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi
Clara Mohri
David Brandfonbrener
Alex Gu
Nikhil Vyas
Nikhil Anand
David Alvarez-Melis
Yuanzhi Li
Sham Kakade
Eran Malach
MoE
120
5
0
24 Oct 2024
Backdoor in Seconds: Unlocking Vulnerabilities in Large Pre-trained Models via Model Editing
Dongliang Guo
Mengxuan Hu
Zihan Guan
Junfeng Guo
Thomas Hartvigsen
Sheng Li
AAML
140
2
0
23 Oct 2024
WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models
Jinghan Jia
Jiancheng Liu
Yihua Zhang
Parikshit Ram
Nathalie Baracaldo
Sijia Liu
MU
169
8
0
23 Oct 2024
The Tug of War Within: Mitigating the Fairness-Privacy Conflicts in Large Language Models
Chen Qian
Dongrui Liu
Jie Zhang
Yong Liu
Jing Shao
95
1
0
22 Oct 2024
LLMScan: Causal Scan for LLM Misbehavior Detection
Mengdi Zhang
Kai Kiat Goh
Peixin Zhang
Jun Sun
Rose Lin Xin
Hongyu Zhang
163
0
0
22 Oct 2024
A Psycholinguistic Evaluation of Language Models' Sensitivity to Argument Roles
Eun-Kyoung Rosa Lee
Sathvik Nair
Naomi Feldman
112
4
0
21 Oct 2024