Direct Preference Optimization: Your Language Model is Secretly a Reward Model

29 May 2023
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
    ALM

Papers citing "Direct Preference Optimization: Your Language Model is Secretly a Reward Model"

50 / 2,707 papers shown
LIONs: An Empirically Optimized Approach to Align Language Models
Xiao Yu
Qingyang Wu
Yu Li
Zhou Yu
ALM
60
4
0
09 Jul 2024
Efficient and Accurate Memorable Conversation Model using DPO based on sLLM
Youngkyung Seo
Yoonseok Heo
Jun-Seok Koh
Du-Seong Chang
76
0
0
09 Jul 2024
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
Weijia Shi
Jaechan Lee
Yangsibo Huang
Sadhika Malladi
Jieyu Zhao
Ari Holtzman
Daogao Liu
Luke Zettlemoyer
Noah A. Smith
Chiyuan Zhang
MU
ELM
52
51
0
08 Jul 2024
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System
Miao Zheng
H. Liang
Fan Yang
Haoze Sun
Tianpeng Li
...
Kun Fang
Weipeng Chen
Bin Cui
Wentao Zhang
Guosheng Dong
RALM
49
3
0
08 Jul 2024
On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker
74
16
0
08 Jul 2024
LLMBox: A Comprehensive Library for Large Language Models
Tianyi Tang
Yiwen Hu
Bingqian Li
Wenyang Luo
Zijing Qin
...
Chunxuan Xia
Junyi Li
Kun Zhou
Wayne Xin Zhao
Ji-Rong Wen
56
1
0
08 Jul 2024
$R^2$-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning
Mintong Kang
Yue Liu
LRM
71
13
0
08 Jul 2024
Variational Best-of-N Alignment
Afra Amini
Tim Vieira
Ryan Cotterell
BDL
48
20
0
08 Jul 2024
Exposing Privacy Gaps: Membership Inference Attack on Preference Data for LLM Alignment
Qizhang Feng
Siva Rajesh Kasa
Santhosh Kumar Kasa
Hyokun Yun
C. Teo
S. Bodapati
92
8
0
08 Jul 2024
SmurfCat at PAN 2024 TextDetox: Alignment of Multilingual Transformers for Text Detoxification
Elisei Rykov
Konstantin Zaytsev
Ivan Anisimov
Alexandr Voronin
19
0
0
07 Jul 2024
Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course
Cheng-Han Chiang
Wei-Chih Chen
Chun-Yi Kuan
Chienchou Yang
Hung-yi Lee
ELM
AI4Ed
54
5
0
07 Jul 2024
RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models
Peng Xia
Kangyu Zhu
Haoran Li
Hongtu Zhu
Yun Li
Gang Li
Linjun Zhang
Huaxiu Yao
MedIm
56
32
0
06 Jul 2024
Progress or Regress? Self-Improvement Reversal in Post-training
Ting Wu
Xuefeng Li
Pengfei Liu
LRM
38
11
0
06 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
61
12
0
06 Jul 2024
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
Zhaorun Chen
Yichao Du
Zichen Wen
Yiyang Zhou
Chenhang Cui
...
Jiawei Zhou
Zhuokai Zhao
Rafael Rafailov
Chelsea Finn
Huaxiu Yao
EGVM
MLLM
76
31
0
05 Jul 2024
Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive Tuning
Eric Pasewark
Kyle Montgomery
Kefei Duan
Dawn Song
Chenguang Wang
LRM
CLL
ReLM
55
1
0
05 Jul 2024
Hindsight Preference Learning for Offline Preference-based Reinforcement Learning
Chen-Xiao Gao
Shengjun Fang
Chenjun Xiao
Yang Yu
Zongzhang Zhang
OffRL
40
1
0
05 Jul 2024
Jailbreak Attacks and Defenses Against Large Language Models: A Survey
Sibo Yi
Yule Liu
Zhen Sun
Tianshuo Cong
Xinlei He
Jiaxing Song
Ke Xu
Qi Li
AAML
64
96
0
05 Jul 2024
HAF-RM: A Hybrid Alignment Framework for Reward Model Training
Shujun Liu
Xiaoyu Shen
Yuhang Lai
Siyuan Wang
Shengbin Yue
Zengfeng Huang
Xuanjing Huang
Zhongyu Wei
36
1
0
04 Jul 2024
Orchestrating LLMs with Different Personalizations
Jin Peng Zhou
Katie Z Luo
Jingwen Gu
Jason Yuan
Kilian Q. Weinberger
Wen Sun
57
2
0
04 Jul 2024
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Pan Zhang
Xiaoyi Dong
Yuhang Zang
Yuhang Cao
Rui Qian
...
Kai Chen
Jifeng Dai
Yu Qiao
Dahua Lin
Jiaqi Wang
61
106
0
03 Jul 2024
Reinforcement Learning for Sequence Design Leveraging Protein Language Models
Jithendaraa Subramanian
Shivakanth Sujit
Niloy Irtisam
Umong Sain
Derek Nowrouzezahrai
Samira Ebrahimi Kahou
Riashat Islam
53
0
0
03 Jul 2024
Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment
Janghwan Lee
Seongmin Park
S. Hong
Minsoo Kim
Du-Seong Chang
Jungwook Choi
37
5
0
03 Jul 2024
From Theft to Bomb-Making: The Ripple Effect of Unlearning in Defending Against Jailbreak Attacks
Zhexin Zhang
Junxiao Yang
Yida Lu
Pei Ke
Shiyao Cui
Chujie Zheng
Hongning Wang
Minlie Huang
MU
AAML
74
27
0
03 Jul 2024
Understanding Alignment in Multimodal LLMs: A Comprehensive Study
Elmira Amirloo
J. Fauconnier
Christoph Roesmann
Christian Kerl
Rinu Boney
...
Zirui Wang
Afshin Dehghan
Yinfei Yang
Zhe Gan
Peter Grasch
48
7
0
02 Jul 2024
RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
John Dang
Arash Ahmadian
Kelly Marchisio
Julia Kreutzer
Ahmet Üstün
Sara Hooker
61
25
0
02 Jul 2024
Towards Human Understanding of Paraphrase Types in ChatGPT
Dominik Meier
Jan Philip Wahle
Terry Ruas
Bela Gipp
51
0
0
02 Jul 2024
Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization
Yuchen Hu
Chen Chen
Siyin Wang
Eng Siong Chng
C. Zhang
55
3
0
02 Jul 2024
Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning
Yifang Chen
Shuohang Wang
Ziyi Yang
Hiteshi Sharma
Nikos Karampatziakis
Donghan Yu
Kevin Jamieson
Simon Shaolei Du
Yelong Shen
OffRL
56
4
0
02 Jul 2024
The Art of Saying No: Contextual Noncompliance in Language Models
Faeze Brahman
Sachin Kumar
Vidhisha Balachandran
Pradeep Dasigi
Valentina Pyatkin
...
Jack Hessel
Yulia Tsvetkov
Noah A. Smith
Yejin Choi
Hannaneh Hajishirzi
79
22
0
02 Jul 2024
AdaCQR: Enhancing Query Reformulation for Conversational Search via Sparse and Dense Retrieval Alignment
Yilong Lai
Jialong Wu
Congzhi Zhang
Haowen Sun
Deyu Zhou
66
3
0
02 Jul 2024
Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness
Khyathi Chandu
Linjie Li
Anas Awadalla
Ximing Lu
Jae Sung Park
Jack Hessel
Lijuan Wang
Yejin Choi
72
3
0
02 Jul 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
85
28
0
02 Jul 2024
Purple-teaming LLMs with Adversarial Defender Training
Jingyan Zhou
Kun Li
Junan Li
Jiawen Kang
Minda Hu
Xixin Wu
Helen Meng
AAML
41
1
0
01 Jul 2024
Searching for Best Practices in Retrieval-Augmented Generation
Xiaohua Wang
Zhenghua Wang
Xuan Gao
Feiran Zhang
Yixin Wu
...
Qi Qian
Ruicheng Yin
Changze Lv
Xiaoqing Zheng
Xuanjing Huang
67
47
0
01 Jul 2024
$\text{Memory}^3$: Language Modeling with Explicit Memory
Hongkang Yang
Zehao Lin
Wenjin Wang
Hao Wu
Zhiyu Li
...
Yu Yu
Kai Chen
Feiyu Xiong
Linpeng Tang
Weinan E
50
12
0
01 Jul 2024
FRoG: Evaluating Fuzzy Reasoning of Generalized Quantifiers in Large Language Models
Yiyuan Li
Shichao Sun
Pengfei Liu
LRM
87
0
0
01 Jul 2024
Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization
Siyi Gu
Minkai Xu
Alexander Powers
Weili Nie
Tomas Geffner
Karsten Kreis
J. Leskovec
Arash Vahdat
Stefano Ermon
57
7
0
01 Jul 2024
Roleplay-doh: Enabling Domain-Experts to Create LLM-simulated Patients via Eliciting and Adhering to Principles
Ryan Louie
Ananjan Nandi
William Fang
Cheng Chang
Emma Brunskill
Diyi Yang
59
39
0
01 Jul 2024
MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
Yusu Qian
Hanrong Ye
J. Fauconnier
Peter Grasch
Yinfei Yang
Zhe Gan
117
14
0
01 Jul 2024
Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning
Zimu Lu
Aojun Zhou
Ke Wang
Houxing Ren
Weikang Shi
Junting Pan
Mingjie Zhan
Hongsheng Li
LRM
56
24
0
30 Jun 2024
BAPO: Base-Anchored Preference Optimization for Personalized Alignment in Large Language Models
Gihun Lee
Minchan Jeong
Yujin Kim
Hojung Jung
Jaehoon Oh
Sangmook Kim
Se-Young Yun
48
0
0
30 Jun 2024
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Yuheng Zhang
Dian Yu
Baolin Peng
Linfeng Song
Ye Tian
Mingyue Huo
Nan Jiang
Haitao Mi
Dong Yu
46
16
0
30 Jun 2024
LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement
Jiahao Ying
Mingbao Lin
Yixin Cao
Wei Tang
Bo Wang
Qianru Sun
Xuanjing Huang
Shuicheng Yan
LRM
38
8
0
29 Jun 2024
A Bayesian Solution To The Imitation Gap
Risto Vuorio
Mattie Fellows
Cong Lu
Clémence Grislain
Shimon Whiteson
42
1
0
29 Jun 2024
GraphArena: Evaluating and Exploring Large Language Models on Graph Computation
Jianheng Tang
Qifan Zhang
Yuhan Li
Nuo Chen
Jia Li
52
1
0
29 Jun 2024
ProgressGym: Alignment with a Millennium of Moral Progress
Tianyi Qiu
Yang Zhang
Xuchuan Huang
Jasmine Xinze Li
Yalan Qin
Yaodong Yang
AI4TS
52
4
0
28 Jun 2024
STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical
Guohao Sun
Can Qin
Huazhu Fu
Linwei Wang
Zhiqiang Tao
LM&MA
40
3
0
28 Jun 2024
Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring
Jiazheng Li
Hainiu Xu
Zhaoyue Sun
Yuxiang Zhou
David West
Cesare Aloisi
Yulan He
LRM
40
4
0
28 Jun 2024
PopAlign: Population-Level Alignment for Fair Text-to-Image Generation
Shufan Li
Harkanwar Singh
Aditya Grover
EGVM
70
2
0
28 Jun 2024