2305.18290
Cited By
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
29 May 2023
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
Papers citing
"Direct Preference Optimization: Your Language Model is Secretly a Reward Model"
50 / 2,684 papers shown
Title
SEE-DPO: Self Entropy Enhanced Direct Preference Optimization
Shivanshu Shekhar
Shreyas Singh
Tong Zhang
56
4
0
06 Nov 2024
Mitigating Metric Bias in Minimum Bayes Risk Decoding
Geza Kovacs
Daniel Deutsch
Markus Freitag
47
6
0
05 Nov 2024
Stochastic Monkeys at Play: Random Augmentations Cheaply Break LLM Safety Alignment
Jason Vega
Junsheng Huang
Gaokai Zhang
Hangoo Kang
Minjia Zhang
Gagandeep Singh
44
1
0
05 Nov 2024
V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization
Yuxi Xie
Guanzhen Li
Xiao Xu
Min-Yen Kan
MLLM
VLM
65
17
0
05 Nov 2024
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
Xingwu Sun
Yanfeng Chen
Yanwen Huang
Ruobing Xie
Jiaqi Zhu
...
Zhanhui Kang
Yong Yang
Yuhong Liu
Di Wang
Jie Jiang
MoE
ALM
ELM
81
27
0
04 Nov 2024
Culinary Class Wars: Evaluating LLMs using ASH in Cuisine Transfer Task
Hoonick Lee
Mogan Gim
Donghyeon Park
Donghee Choi
Jaewoo Kang
41
0
0
04 Nov 2024
Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback
Guan-Ting Lin
Prashanth Gurunath Shivakumar
Aditya Gourav
Yile Gu
Ankur Gandhe
Hung-yi Lee
I. Bulyko
61
9
0
04 Nov 2024
Sample-Efficient Alignment for LLMs
Zichen Liu
Changyu Chen
Chao Du
Wee Sun Lee
Min Lin
41
4
0
03 Nov 2024
PMoL: Parameter Efficient MoE for Preference Mixing of LLM Alignment
Dongxu Liu
Bing Xu
Yinzhuo Chen
Bufan Xu
Wenpeng Lu
Muyun Yang
Tiejun Zhao
MoE
49
1
0
02 Nov 2024
TODO: Enhancing LLM Alignment with Ternary Preferences
Yuxiang Guo
Lu Yin
Bo Jiang
Jiaqi Zhang
67
1
0
02 Nov 2024
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models
Jianyi Zhang
Da-Cheng Juan
Cyrus Rashtchian
Chun-Sung Ferng
Heinrich Jiang
Yiran Chen
50
4
0
01 Nov 2024
Token-level Proximal Policy Optimization for Query Generation
Yichen Ouyang
Lu Wang
Fangkai Yang
Pu Zhao
Chenghua Huang
...
Saravan Rajmohan
Weiwei Deng
Dongmei Zhang
Feng Sun
Qi Zhang
OffRL
303
4
0
01 Nov 2024
Active Preference-based Learning for Multi-dimensional Personalization
Minhyeon Oh
Seungjoon Lee
Jungseul Ok
36
1
0
01 Nov 2024
MoD: A Distribution-Based Approach for Merging Large Language Models
Quy-Anh Dang
Chris Ngo
MoMe
VLM
59
0
0
01 Nov 2024
Hierarchical Preference Optimization: Learning to achieve goals via feasible subgoals prediction
Utsav Singh
Souradip Chakraborty
Wesley A Suttle
Brian M. Sadler
Anit Kumar Sahu
Mubarak Shah
Vinay P. Namboodiri
Amrit Singh Bedi
76
1
0
01 Nov 2024
Enhancing the Traditional Chinese Medicine Capabilities of Large Language Model through Reinforcement Learning from AI Feedback
Song Yu
Xiaofei Xu
Fangfei Xu
Li Li
LM&MA
48
1
0
01 Nov 2024
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
Bohan Lyu
Yadi Cao
Duncan Watson-Parris
Leon Bergen
Taylor Berg-Kirkpatrick
Rose Yu
67
3
0
01 Nov 2024
LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation
Bowen Li
Zhaoyu Li
Qiwei Du
Jinqi Luo
Wenshan Wang
...
Katia Sycara
Pradeep Kumar Ravikumar
Alexander G. Gray
X. Si
Sebastian A. Scherer
AI4CE
LRM
91
3
0
01 Nov 2024
Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs
Muhammed Saeed
Elgizouli Mohamed
Mukhtar Mohamed
Shaina Raza
Muhammad Abdul-Mageed
Shady Shehata
58
0
0
31 Oct 2024
The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge
Dake Guo
Jixun Yao
Xinfa Zhu
Kangxiang Xia
Zhao Guo
Ziyu Zhang
Yun Wang
Jie Liu
Lei Xie
41
1
0
31 Oct 2024
OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models
Junda Wu
Xintong Li
Ruoyu Wang
Yu Xia
Yuxin Xiong
...
Xiang Chen
Branislav Kveton
Lina Yao
Jingbo Shang
Julian McAuley
OffRL
LRM
34
1
0
31 Oct 2024
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
Yunjia Qi
Hao Peng
Xinyu Wang
Bin Xu
Lei Hou
Juanzi Li
64
3
0
31 Oct 2024
DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios
Junchao Wu
Runzhe Zhan
Derek F. Wong
Shu Yang
Xinyi Yang
Yulin Yuan
Lidia S. Chao
DeLMO
77
2
0
31 Oct 2024
COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences
Yongxu Liu
Argyris Oikonomou
Weiqiang Zheng
Yang Cai
Arman Cohan
52
1
0
30 Oct 2024
Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
Sheryl Hsu
Omar Khattab
Chelsea Finn
Archit Sharma
KELM
RALM
54
6
0
30 Oct 2024
Vision-Language Models Can Self-Improve Reasoning via Reflection
Kanzhi Cheng
Yantao Li
Fangzhi Xu
Jianbing Zhang
Hao Zhou
Yang Liu
ReLM
LRM
67
19
0
30 Oct 2024
VPO: Leveraging the Number of Votes in Preference Optimization
Jae Hyeon Cho
Minkyung Park
Byung-Jun Lee
27
1
0
30 Oct 2024
Effective and Efficient Adversarial Detection for Vision-Language Models via A Single Vector
Youcheng Huang
Fengbin Zhu
Jingkun Tang
Pan Zhou
Wenqiang Lei
Jiancheng Lv
Tat-Seng Chua
AAML
39
4
0
30 Oct 2024
Smaller Large Language Models Can Do Moral Self-Correction
Guangliang Liu
Zhiyu Xue
Rongrong Wang
K. Johnson
Kristen Marie Johnson
LRM
42
0
0
30 Oct 2024
MDCure: A Scalable Pipeline for Multi-Document Instruction-Following
Gabrielle Kaili-May Liu
Bowen Shi
Avi Caciularu
Idan Szpektor
Arman Cohan
72
4
0
30 Oct 2024
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning
Yihe Deng
Paul Mineiro
LRM
31
3
0
29 Oct 2024
DISCERN: Decoding Systematic Errors in Natural Language for Text Classifiers
Rakesh R Menon
Shashank Srivastava
31
2
0
29 Oct 2024
AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts
Vishal Kumar
Zeyi Liao
Jaylen Jones
Huan Sun
AAML
46
2
0
29 Oct 2024
Sing it, Narrate it: Quality Musical Lyrics Translation
Zhuorui Ye
Jiajun Li
Rongwu Xu
50
1
0
29 Oct 2024
PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference
Kendong Liu
Zhiyu Zhu
Chuanhao Li
Hui Liu
H. Zeng
Junhui Hou
EGVM
51
2
0
29 Oct 2024
SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types
Yutao Mou
Shikun Zhang
Wei Ye
ELM
57
12
0
29 Oct 2024
A Hierarchical Language Model For Interpretable Graph Reasoning
Sambhav Khurana
Xiner Li
Shurui Gui
Shuiwang Ji
LRM
52
0
0
29 Oct 2024
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate
Zhiqi Bu
Xiaomeng Jin
Bhanukiran Vinzamuri
Anil Ramakrishna
Kai-Wei Chang
Volkan Cevher
Mingyi Hong
MU
91
7
0
29 Oct 2024
f-PO: Generalizing Preference Optimization with f-divergence Minimization
Jiaqi Han
Mingjian Jiang
Yuxuan Song
J. Leskovec
Stefano Ermon
64
4
0
29 Oct 2024
Transferable Post-training via Inverse Value Learning
Xinyu Lu
Xueru Wen
Yaojie Lu
Bowen Yu
Hongyu Lin
Haiyang Yu
Le Sun
Xianpei Han
Yongbin Li
28
1
0
28 Oct 2024
Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models
Weijian Luo
C. Zhang
Debing Zhang
Zhengyang Geng
35
4
0
28 Oct 2024
Matryoshka: Learning to Drive Black-Box LLMs with LLMs
Changhao Li
Yuchen Zhuang
Rushi Qiang
Haotian Sun
H. Dai
Chao Zhang
Bo Dai
LRM
33
4
0
28 Oct 2024
Bridging the Gap between Expert and Language Models: Concept-guided Chess Commentary Generation and Evaluation
Jaechang Kim
Jinmin Goh
Inseok Hwang
Jaewoong Cho
Jungseul Ok
ELM
38
1
0
28 Oct 2024
UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA through a Generalized Implicit Reward Function
Zhichao Wang
Bin Bi
Z. Zhu
Xiangbo Mao
Jun Wang
Shiyu Wang
CLL
33
1
0
28 Oct 2024
L3Ms -- Lagrange Large Language Models
Guneet S. Dhillon
Xingjian Shi
Yee Whye Teh
Alex Smola
306
0
0
28 Oct 2024
Stealthy Jailbreak Attacks on Large Language Models via Benign Data Mirroring
Honglin Mu
Han He
Yuxin Zhou
Yunlong Feng
Yang Xu
...
Zeming Liu
Xudong Han
Qi Shi
Qingfu Zhu
Wanxiang Che
AAML
52
1
0
28 Oct 2024
Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation
Yifang Chen
David Zhu
SyDa
46
0
0
27 Oct 2024
Accelerating Direct Preference Optimization with Prefix Sharing
Franklin Wang
Sumanth Hegde
41
0
0
27 Oct 2024
Learning from Response not Preference: A Stackelberg Approach for LLM Detoxification using Non-parallel Data
Xinhong Xie
Tao Li
Quanyan Zhu
32
3
0
27 Oct 2024
Fine-Tuning and Evaluating Open-Source Large Language Models for the Army Domain
Daniel C. Ruiz
John Sell
25
1
0
27 Oct 2024