Direct Preference Optimization: Your Language Model is Secretly a Reward Model

29 May 2023 · arXiv:2305.18290
Rafael Rafailov, Archit Sharma, E. Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn
ALM
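
For reference, the objective introduced in this paper (and adapted by several of the citing works below, e.g. KTO, Counterfactual DPO, and PHOENIX) fits the policy directly to preference data without a separate reward model: given a prompt x with preferred and dispreferred responses y_w and y_l, a frozen reference policy \pi_{\mathrm{ref}}, and a scale parameter \beta that controls deviation from the reference, DPO minimizes

  \mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}} \Big[ \log \sigma \Big( \beta \log \tfrac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \tfrac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)} \Big) \Big]

where \sigma is the logistic function.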

Papers citing "Direct Preference Optimization: Your Language Model is Secretly a Reward Model"

50 of 2,637 citing papers shown.

DeAL: Decoding-time Alignment for Large Language Models
James Y. Huang, Sailik Sengupta, Daniele Bonadiman, Yi-An Lai, Arshit Gupta, Nikolaos Pappas, Saab Mansour, Katrin Kirchhoff, Dan Roth
05 Feb 2024

BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback
Gaurav Pandey, Yatin Nandwani, Tahira Naseem, Mayank Mishra, Guangxuan Xu, Dinesh Raghu, Sachindra Joshi, Asim Munawar, Ramón Fernández Astudillo
BDL · 04 Feb 2024

Factuality of Large Language Models in the Year 2024
Yuxia Wang, Minghan Wang, Muhammad Arslan Manzoor, Fei Liu, Georgi Georgiev, Rocktim Jyoti Das, Preslav Nakov
LRM, HILM · 04 Feb 2024

Aligner: Efficient Alignment by Learning to Correct
Jiaming Ji, Boyuan Chen, Hantao Lou, Chongye Guo, Borong Zhang, Xuehai Pan, Juntao Dai, Tianyi Qiu, Yaodong Yang
04 Feb 2024

Selecting Large Language Model to Fine-tune via Rectified Scaling Law
Haowei Lin, Baizhou Huang, Haotian Ye, Qinyu Chen, Zihao Wang, Sujian Li, Jianzhu Ma, Xiaojun Wan, James Zou, Yitao Liang
04 Feb 2024

Large Language Model for Table Processing: A Survey
Weizheng Lu, Jiaming Zhang, Jing Zhang, Yueguo Chen
LMTD · 04 Feb 2024

Panacea: Pareto Alignment via Preference Adaptation for LLMs
Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Haojun Chen, Qingfu Zhang, Siyuan Qi, Yaodong Yang
03 Feb 2024

A Closer Look at the Limitations of Instruction Tuning
Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Reddy Evuru, Deepali Aneja, Zeyu Jin, R. Duraiswami, Dinesh Manocha
ALM · 03 Feb 2024

TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution
Wenyue Hua, Xianjun Yang, Zelong Li, Cheng Wei, Yongfeng Zhang
LLMAG · 02 Feb 2024

AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback
Jian Guan, Wei Wu, Zujie Wen, Peng Xu, Hongning Wang, Minlie Huang
LRM · 02 Feb 2024

Continual Learning for Large Language Models: A Survey
Tongtong Wu, Linhao Luo, Yuan-Fang Li, Shirui Pan, Thuy-Trang Vu, Gholamreza Haffari
CLL, LRM, KELM · 02 Feb 2024

Rethinking the Role of Proxy Rewards in Language Model Alignment
Sungdong Kim, Minjoon Seo
SyDa, ALM · 02 Feb 2024

KTO: Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela
02 Feb 2024

Vaccine: Perturbation-aware Alignment for Large Language Model
Tiansheng Huang, Sihao Hu, Ling Liu
02 Feb 2024

A Survey for Foundation Models in Autonomous Driving
Haoxiang Gao, Yaqian Li, Kaiwen Long, Ming Yang, Yiqing Shen
VLM, LRM · 02 Feb 2024

Plan-Grounded Large Language Models for Dual Goal Conversational Settings
Diogo Glória-Silva, Rafael Ferreira, Diogo Tavares, David Semedo, João Magalhães
LLMAG · 01 Feb 2024

Towards Efficient Exact Optimization of Language Model Alignment
Haozhe Ji, Cheng Lu, Yilin Niu, Pei Ke, Hongning Wang, Jun Zhu, Jie Tang, Minlie Huang
01 Feb 2024

SymbolicAI: A framework for logic-based approaches combining generative models and solvers
Marius-Constantin Dinu, Claudiu Leoveanu-Condrei, Markus Holzleitner, Werner Zellinger, Sepp Hochreiter
01 Feb 2024

OLMo: Accelerating the Science of Language Models
Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Michael Kinney, ..., Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hanna Hajishirzi
OSLM · 01 Feb 2024

Dense Reward for Free in Reinforcement Learning from Human Feedback
Alex J. Chan, Hao Sun, Samuel Holt, M. van der Schaar
01 Feb 2024

Transforming and Combining Rewards for Aligning Large Language Models
Zihao Wang, Chirag Nagpal, Jonathan Berant, Jacob Eisenstein, Alex D'Amour, Oluwasanmi Koyejo, Victor Veitch
01 Feb 2024

Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing
Fangkai Jiao, Chengwei Qin, Zhengyuan Liu, Nancy F. Chen, Shafiq Joty
LRM · 01 Feb 2024

Safety of Multimodal Large Language Models on Images and Texts
Xin Liu, Yichen Zhu, Yunshi Lan, Chao Yang, Yu Qiao
01 Feb 2024

A Survey on Hallucination in Large Vision-Language Models
Hanchao Liu, Wenyuan Xue, Yifei Chen, Dapeng Chen, Xiutian Zhao, Ke Wang, Liping Hou, Rong-Zhi Li, Wei Peng
LRM, MLLM · 01 Feb 2024

Institutional Platform for Secure Self-Service Large Language Model Exploration
V. Bumgardner, Mitchell A. Klusty, W. V. Logan, Samuel E. Armstrong, Caylin D. Hickey, Jeff Talbert
01 Feb 2024

EvoMerge: Neuroevolution for Large Language Models
Yushu Jiang
VLM · 30 Jan 2024

Weaver: Foundation Models for Creative Writing
Tiannan Wang, Jiamin Chen, Qingrui Jia, Shuai Wang, Ruoyu Fang, ..., Xiaohua Xu, Ningyu Zhang, Huajun Chen, Yuchen Eleanor Jiang, Wangchunshu Zhou
30 Jan 2024

Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks
Andy Zhou, Bo Li, Haohan Wang
AAML · 30 Jan 2024

QACP: An Annotated Question Answering Dataset for Assisting Chinese Python Programming Learners
Rui Xiao, Lu Han, Xiaoying Zhou, Jiong Wang, Na Zong, Pengyu Zhang
AI4Ed · 30 Jan 2024

H2O-Danube-1.8B Technical Report
Philipp Singer, Pascal Pfeiffer, Yauhen Babakhin, Maximilian Jeblick, Nischay Dhankhar, Gabor Fodor, SriSatish Ambati
VLM · 30 Jan 2024

Gradient-Based Language Model Red Teaming
Nevan Wichers, Carson E. Denison, Ahmad Beirami
30 Jan 2024

Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Banghua Zhu, Michael I. Jordan, Jiantao Jiao
29 Jan 2024

YODA: Teacher-Student Progressive Learning for Language Models
Jianqiao Lu, Wanjun Zhong, Yufei Wang, Zhijiang Guo, Qi Zhu, ..., Baojun Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu
LRM · 28 Jan 2024

Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for Hallucination Mitigation
Yuxin Liang, Zhuoyang Song, Hao Wang, Jiaxing Zhang
HILM · 27 Jan 2024

An Empirical Study on Large Language Models in Accuracy and Robustness under Chinese Industrial Scenarios
Zongjie Li, Wenying Qiu, Pingchuan Ma, Yichen Li, You Li, Sijia He, Baozheng Jiang, Shuai Wang, Weixi Gu
27 Jan 2024

MULTIVERSE: Exposing Large Language Model Alignment Problems in Diverse Worlds
Xiaolong Jin, Zhuo Zhang, Xiangyu Zhang
25 Jan 2024

Can AI Assistants Know What They Don't Know?
Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu, Wenwei Zhang, Zhangyue Yin, Shimin Li, Linyang Li, Zhengfu He, Kai Chen, Xipeng Qiu
24 Jan 2024

ARGS: Alignment as Reward-Guided Search
Maxim Khanov, Jirayu Burapacheep, Yixuan Li
23 Jan 2024

Red Teaming Visual Language Models
Mukai Li, Lei Li, Yuwei Yin, Masood Ahmed, Zhenguang Liu, Qi Liu
VLM · 23 Jan 2024

Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model
Zhiwei He, Xing Wang, Wenxiang Jiao, Zhuosheng Zhang, Rui Wang, Shuming Shi, Zhaopeng Tu
ALM · 23 Jan 2024

GRATH: Gradual Self-Truthifying for Large Language Models
Weixin Chen, D. Song, Bo Li
HILM, SyDa · 22 Jan 2024

WARM: On the Benefits of Weight Averaged Reward Models
Alexandre Ramé, Nino Vieillard, Léonard Hussenot, Robert Dadashi, Geoffrey Cideron, Olivier Bachem, Johan Ferret
22 Jan 2024

Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
Songyang Gao, Qiming Ge, Wei Shen, Shihan Dou, Junjie Ye, ..., Yicheng Zou, Zhi Chen, Hang Yan, Qi Zhang, Dahua Lin
21 Jan 2024

Orion-14B: Open-source Multilingual Large Language Models
Du Chen, Yi Huang, Xiaopu Li, Yongqiang Li, Yongqiang Liu, Haihui Pan, Leichao Xu, Dacheng Zhang, Zhipeng Zhang, Kun Han
20 Jan 2024

Knowledge Verification to Nip Hallucination in the Bud
Fanqi Wan, Xinting Huang, Leyang Cui, Xiaojun Quan, Wei Bi, Shuming Shi
HILM · 19 Jan 2024

PHOENIX: Open-Source Language Adaption for Direct Preference Optimization
Matthias Uhlig, Sigurd Schacht, Sudarshan Kamath Barkur
ALM · 19 Jan 2024

Self-Rewarding Language Models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston
ReLM, SyDa, ALM, LRM · 18 Jan 2024

Aligning Large Language Models with Counterfactual DPO
Bradley Butcher
ALM · 17 Jan 2024

Canvil: Designerly Adaptation for LLM-Powered User Experiences
K. J. Kevin Feng, Q. V. Liao, Ziang Xiao, Jennifer Wortman Vaughan, Amy X. Zhang, David W. McDonald
17 Jan 2024

ReFT: Reasoning with Reinforced Fine-Tuning
Trung Quoc Luong, Xinbo Zhang, Zhanming Jie, Peng Sun, Xiaoran Jin, Hang Li
OffRL, LRM, ReLM · 17 Jan 2024