Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2404.05868
Cited By
Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning
8 April 2024
Ruiqi Zhang
Licong Lin
Yu Bai
Song Mei
MU
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning"
48 / 48 papers shown
Title
Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models
Fu-Yun Wang
Yunhao Shui
Jingtan Piao
Keqiang Sun
Hongsheng Li
22
0
0
16 May 2025
Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment
Xueyao Zhang
Y. Wang
Chaoren Wang
Zehan Li
Zhuo Chen
Zhizheng Wu
135
0
0
07 May 2025
OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models
Xiaoyu Xu
Minxin Du
Qingqing Ye
Haibo Hu
MU
57
0
0
07 May 2025
Teaching Large Language Models to Reason through Learning and Forgetting
Tianwei Ni
Allen Nie
Sapana Chaudhary
Yao Liu
Huzefa Rangwala
Rasool Fakoor
ReLM
CLL
LRM
142
0
0
15 Apr 2025
Sharpness-Aware Parameter Selection for Machine Unlearning
Saber Malekmohammadi
Hong kyu Lee
Li Xiong
MU
160
0
0
08 Apr 2025
Not All Data Are Unlearned Equally
Aravind Krishnan
Siva Reddy
Marius Mosbach
MU
148
1
0
07 Apr 2025
SUV: Scalable Large Language Model Copyright Compliance with Regularized Selective Unlearning
Tianyang Xu
Xiaoze Liu
Feijie Wu
Xiaoqian Wang
Jing Gao
MU
58
0
0
29 Mar 2025
ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging
Haoming Xu
Shuxun Wang
Yanqiu Zhao
Yi Zhong
Ziyan Jiang
Ningyuan Zhao
Shumin Deng
H. Chen
N. Zhang
MoMe
MU
69
0
0
27 Mar 2025
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs
Nicolas Le Roux
Marc G. Bellemare
Jonathan Lebensold
Arnaud Bergeron
Joshua Greaves
Alex Fréchette
Carolyne Pelletier
Eric Thibodeau-Laufer
Sándor Toth
Sam Work
OffRL
89
2
0
18 Mar 2025
UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets
Wenyu Wang
M. Zhang
Xiaotian Ye
Z. Z. Ren
Z. Chen
Pengjie Ren
MU
KELM
174
0
0
06 Mar 2025
CE-U: Cross Entropy Unlearning
Bo Yang
MU
53
0
0
03 Mar 2025
AMUN: Adversarial Machine UNlearning
A. Boroojeny
Hari Sundaram
Varun Chandrasekaran
MU
AAML
48
0
0
02 Mar 2025
A General Framework to Enhance Fine-tuning-based LLM Unlearning
J. Ren
Zhenwei Dai
X. Tang
Hui Liu
Jingying Zeng
...
R. Goutam
Suhang Wang
Yue Xing
Qi He
Hui Liu
MU
163
1
0
25 Feb 2025
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning
Vaidehi Patil
Elias Stengel-Eskin
Joey Tianyi Zhou
MU
CLL
78
2
0
20 Feb 2025
LUME: LLM Unlearning with Multitask Evaluations
Anil Ramakrishna
Yixin Wan
Xiaomeng Jin
Kai-Wei Chang
Zhiqi Bu
Bhanukiran Vinzamuri
V. Cevher
Mingyi Hong
Rahul Gupta
CLL
MU
109
7
0
20 Feb 2025
Which Retain Set Matters for LLM Unlearning? A Case Study on Entity Unlearning
Hwan Chang
Hwanhee Lee
MU
42
0
0
17 Feb 2025
ReLearn: Unlearning via Learning for Large Language Models
Haoming Xu
Ningyuan Zhao
Liming Yang
Sendong Zhao
Shumin Deng
Mengru Wang
Bryan Hooi
Nay Oo
H. Chen
N. Zhang
KELM
CLL
MU
165
0
0
16 Feb 2025
Direct Unlearning Optimization for Robust and Safe Text-to-Image Models
Yong-Hyun Park
Sangdoo Yun
Jin-Hwa Kim
Junho Kim
Geonhui Jang
Yonghyun Jeong
Junghyo Jo
Gayoung Lee
76
13
0
17 Jan 2025
Reinforcement Learning Enhanced LLMs: A Survey
Shuhe Wang
Shengyu Zhang
Jingyang Zhang
Runyi Hu
Xiaoya Li
Tianwei Zhang
Jiwei Li
Fei Wu
G. Wang
Eduard H. Hovy
OffRL
134
7
0
05 Dec 2024
A Review on Machine Unlearning
Haibo Zhang
Toru Nakamura
Takamasa Isohara
Kouichi Sakurai
AILaw
PILM
MU
88
47
0
18 Nov 2024
Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
Yingzi Ma
Jiongxiao Wang
Fei-Yue Wang
Siyuan Ma
Jiazhao Li
...
B. Li
Yejin Choi
M. Chen
Chaowei Xiao
Chaowei Xiao
MU
58
6
0
05 Nov 2024
RESTOR: Knowledge Recovery through Machine Unlearning
Keivan Rezaei
Khyathi Raghavi Chandu
S. Feizi
Yejin Choi
Faeze Brahman
Abhilasha Ravichander
KELM
CLL
MU
58
0
0
31 Oct 2024
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate
Zhiqi Bu
Xiaomeng Jin
Bhanukiran Vinzamuri
Anil Ramakrishna
Kai-Wei Chang
V. Cevher
Mingyi Hong
MU
85
6
0
29 Oct 2024
WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models
Jinghan Jia
Jiancheng Liu
Yihua Zhang
Parikshit Ram
Nathalie Baracaldo
Sijia Liu
MU
35
2
0
23 Oct 2024
CLEAR: Character Unlearning in Textual and Visual Modalities
Alexey Dontsov
Dmitrii Korzh
Alexey Zhavoronkin
Boris Mikheev
Denis Bobkov
Aibek Alanov
Oleg Y. Rogov
Ivan V. Oseledets
Elena Tutubalina
AILaw
VLM
MU
66
5
0
23 Oct 2024
A Closer Look at Machine Unlearning for Large Language Models
Xiaojian Yuan
Tianyu Pang
Chao Du
Kejiang Chen
Weiming Zhang
Min-Bin Lin
MU
41
5
0
10 Oct 2024
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning
Chongyu Fan
Jiancheng Liu
Licong Lin
Jinghan Jia
Ruiqi Zhang
Song Mei
Sijia Liu
MU
43
16
0
09 Oct 2024
A Probabilistic Perspective on Unlearning and Alignment for Large Language Models
Yan Scholten
Stephan Günnemann
Leo Schwinn
MU
55
6
0
04 Oct 2024
An Adversarial Perspective on Machine Unlearning for AI Safety
Jakub Łucki
Boyi Wei
Yangsibo Huang
Peter Henderson
F. Tramèr
Javier Rando
MU
AAML
73
32
0
26 Sep 2024
Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models
Anmol Mekala
Vineeth Dorna
Shreya Dubey
Abhishek Lalwani
David Koleczek
Mukund Rungta
Sadid Hasan
Elita Lobo
KELM
MU
38
2
0
20 Sep 2024
Recent Advances in Attack and Defense Approaches of Large Language Models
Jing Cui
Yishi Xu
Zhewei Huang
Shuchang Zhou
Jianbin Jiao
Junge Zhang
PILM
AAML
54
1
0
05 Sep 2024
Strong Copyright Protection for Language Models via Adaptive Model Fusion
Javier Abad
Konstantin Donhauser
Francesco Pinto
Fanny Yang
45
4
0
29 Jul 2024
Learning to Refuse: Towards Mitigating Privacy Risks in LLMs
Zhenhua Liu
Tong Zhu
Chuanyuan Tan
Wenliang Chen
PILM
MU
53
8
0
14 Jul 2024
Composable Interventions for Language Models
Arinbjorn Kolbeinsson
Kyle O'Brien
Tianjin Huang
Shanghua Gao
Shiwei Liu
...
Anurag J. Vaidya
Faisal Mahmood
Marinka Zitnik
Tianlong Chen
Thomas Hartvigsen
KELM
MU
89
5
0
09 Jul 2024
To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models
Bozhong Tian
Xiaozhuan Liang
Siyuan Cheng
Qingbin Liu
Mengru Wang
Dianbo Sui
Xi Chen
Huajun Chen
Ningyu Zhang
MU
29
6
0
02 Jul 2024
Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning
Somnath Basu Roy Chowdhury
Krzysztof Choromanski
Arijit Sehanobish
Avinava Dubey
Snigdha Chaturvedi
MU
61
7
0
24 Jun 2024
Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs
S. Kadhe
Farhan Ahmed
Dennis Wei
Nathalie Baracaldo
Inkit Padhi
MoMe
MU
28
7
0
17 Jun 2024
Aligning Transformers with Continuous Feedback via Energy Rank Alignment
Shriram Chennakesavalu
Frank Hu
Sebastian Ibarraran
Grant M. Rotskoff
35
3
0
21 May 2024
Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process
Ermo Hua
Biqing Qi
Kaiyan Zhang
Yue Yu
Ning Ding
Xingtai Lv
Kai Tian
Bowen Zhou
37
3
0
20 May 2024
SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning
Jinghan Jia
Yihua Zhang
Yimeng Zhang
Jiancheng Liu
Bharat Runwal
James Diffenderfer
B. Kailkhura
Sijia Liu
MU
43
35
0
28 Apr 2024
Rethinking Machine Unlearning for Large Language Models
Sijia Liu
Yuanshun Yao
Jinghan Jia
Stephen Casper
Nathalie Baracaldo
...
Hang Li
Kush R. Varshney
Mohit Bansal
Sanmi Koyejo
Yang Liu
AILaw
MU
72
83
0
13 Feb 2024
KTO: Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh
Winnie Xu
Niklas Muennighoff
Dan Jurafsky
Douwe Kiela
176
449
0
02 Feb 2024
Knowledge Unlearning for LLMs: Tasks, Methods, and Challenges
Nianwen Si
Hao Zhang
Heyu Chang
Wenlin Zhang
Dan Qu
Weiqiang Zhang
KELM
MU
80
26
0
27 Nov 2023
Who's Harry Potter? Approximate Unlearning in LLMs
Ronen Eldan
M. Russinovich
MU
MoMe
101
175
0
03 Oct 2023
Knowledge Unlearning for Mitigating Privacy Risks in Language Models
Joel Jang
Dongkeun Yoon
Sohee Yang
Sungmin Cha
Moontae Lee
Lajanugen Logeswaran
Minjoon Seo
KELM
PILM
MU
147
191
0
04 Oct 2022
A Survey of Machine Unlearning
Thanh Tam Nguyen
T. T. Huynh
Phi Le Nguyen
Alan Wee-Chung Liew
Hongzhi Yin
Quoc Viet Hung Nguyen
MU
77
221
0
06 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
290
1,815
0
14 Dec 2020
1