ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.16252
  4. Cited By
Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models

Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models

22 May 2025
Hwiyeong Lee
Uiji Hwang
Hyelim Lim
Taeuk Kim
    MU
ArXiv (abs)PDFHTML

Papers citing "Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models"

17 / 17 papers shown
Title
Negative Preference Optimization: From Catastrophic Collapse to
  Effective Unlearning
Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning
Ruiqi Zhang
Licong Lin
Yu Bai
Song Mei
MU
125
174
0
08 Apr 2024
Digital Forgetting in Large Language Models: A Survey of Unlearning
  Methods
Digital Forgetting in Large Language Models: A Survey of Unlearning Methods
Alberto Blanco-Justicia
N. Jebreel
Benet Manzanares-Salor
David Sánchez
Josep Domingo-Ferrer
Guillem Collell
Kuan Eeik Tan
KELMMU
98
21
0
02 Apr 2024
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning
Nathaniel Li
Alexander Pan
Anjali Gopal
Summer Yue
Daniel Berrios
...
Yan Shoshitaishvili
Jimmy Ba
K. Esvelt
Alexandr Wang
Dan Hendrycks
ELM
98
185
0
05 Mar 2024
Eight Methods to Evaluate Robust Unlearning in LLMs
Eight Methods to Evaluate Robust Unlearning in LLMs
Aengus Lynch
Phillip Guo
Aidan Ewart
Stephen Casper
Dylan Hadfield-Menell
ELMMU
98
74
0
26 Feb 2024
TOFU: A Task of Fictitious Unlearning for LLMs
TOFU: A Task of Fictitious Unlearning for LLMs
Pratyush Maini
Zhili Feng
Avi Schwarzschild
Zachary Chase Lipton
J. Zico Kolter
MUCLL
121
181
0
11 Jan 2024
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale
  of Two Benchmarks
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks
Ting-Yun Chang
Jesse Thomason
Robin Jia
71
18
0
15 Nov 2023
Who's Harry Potter? Approximate Unlearning in LLMs
Who's Harry Potter? Approximate Unlearning in LLMs
Ronen Eldan
M. Russinovich
MUMoMe
154
206
0
03 Oct 2023
Can Sensitive Information Be Deleted From LLMs? Objectives for Defending
  Against Extraction Attacks
Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks
Vaidehi Patil
Peter Hase
Joey Tianyi Zhou
KELMAAML
119
106
0
29 Sep 2023
Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and
  Vulnerabilities
Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities
Maximilian Mozes
Xuanli He
Bennett Kleinberg
Lewis D. Griffin
79
86
0
24 Aug 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward
  Model
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
387
3,981
0
29 May 2023
Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4
Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4
Kent K. Chang
Mackenzie Cramer
Sandeep Soni
David Bamman
RALM
214
120
0
28 Apr 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language
  Models
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
252
318
0
28 Apr 2023
Does Localization Inform Editing? Surprising Differences in
  Causality-Based Localization vs. Knowledge Editing in Language Models
Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models
Peter Hase
Joey Tianyi Zhou
Been Kim
Asma Ghandeharioun
MILM
98
186
0
10 Jan 2023
Knowledge Unlearning for Mitigating Privacy Risks in Language Models
Knowledge Unlearning for Mitigating Privacy Risks in Language Models
Joel Jang
Dongkeun Yoon
Sohee Yang
Sungmin Cha
Moontae Lee
Lajanugen Logeswaran
Minjoon Seo
KELMPILMMU
197
234
0
04 Oct 2022
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts
  in the Vocabulary Space
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
Mor Geva
Avi Caciularu
Ke Wang
Yoav Goldberg
KELM
118
384
0
28 Mar 2022
Locating and Editing Factual Associations in GPT
Locating and Editing Factual Associations in GPT
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
KELM
248
1,357
0
10 Feb 2022
Transformer Feed-Forward Layers Are Key-Value Memories
Transformer Feed-Forward Layers Are Key-Value Memories
Mor Geva
R. Schuster
Jonathan Berant
Omer Levy
KELM
161
840
0
29 Dec 2020
1