Direct Preference Optimization: Your Language Model is Secretly a Reward Model (arXiv:2305.18290)

29 May 2023
Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn
ALM
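The paper's central result is that the reward-modeling and RL steps of RLHF collapse into a single classification-style loss over preference pairs, so the language model itself implicitly acts as the reward model. Below is a minimal sketch of that objective, not the authors' reference code: the function and tensor names are illustrative, per-sequence log-probabilities are assumed precomputed for both the trained policy and a frozen reference model, and beta = 0.1 is only an illustrative value.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO objective: -log sigmoid(beta * (chosen log-ratio - rejected log-ratio)).

    Each argument holds summed per-token log-probabilities of the preferred
    ("chosen") or dispreferred ("rejected") completions, under either the
    policy being trained or the frozen reference model.
    """
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # beta scales the implicit reward and controls how far the policy may
    # drift from the reference model (the KL-penalty strength in RLHF).
    margin = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(margin).mean()
```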

Papers citing "Direct Preference Optimization: Your Language Model is Secretly a Reward Model"

Showing 50 of 2,637 citing papers, newest first. Each entry lists the title, the authors, the site's topic tags where present, and the page's three numeric counters followed by the paper's date.

Token-Efficient Leverage Learning in Large Language Models
Yuanhao Zeng, Min Wang, Yihang Wang, Yingxia Shao
42 · 0 · 0 · 01 Apr 2024

Bailong: Bilingual Transfer Learning based on QLoRA and Zip-tie Embedding
Lung-Chuan Chen, Zong-Ru Li
ALM
45 · 0 · 0 · 01 Apr 2024

Extensive Self-Contrast Enables Feedback-Free Language Model Alignment
Xiao Liu, Xixuan Song, Yuxiao Dong, Jie Tang
SyDa
36 · 5 · 0 · 31 Mar 2024

Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization
Hritik Bansal, Ashima Suvarna, Gantavya Bhatt, Nanyun Peng, Kai-Wei Chang, Aditya Grover
ALM
64 · 9 · 0 · 31 Mar 2024

Configurable Safety Tuning of Language Models with Synthetic Preference Data
Víctor Gallego
40 · 6 · 0 · 30 Mar 2024

Dialectical Alignment: Resolving the Tension of 3H and Security Threats of LLMs
Shu Yang, Jiayuan Su, Han Jiang, Mengdi Li, Keyuan Cheng, Muhammad Asif Ali, Lijie Hu, Di Wang
53 · 5 · 0 · 30 Mar 2024

Instruction-Driven Game Engines on Large Language Models
Hongqiu Wu, Xing-Chen Liu, Haizhen Zhao, Min Zhang
44 · 1 · 0 · 30 Mar 2024

Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning
Nick Mecklenburg, Yiyou Lin, Xiaoxiao Li, Daniel Holstein, Leonardo Nunes, ..., Ranveer Chandra, Vijay Aski, Pavan Kumar Reddy Yannam, Tolga Aktas, Todd Hendry
16 · 24 · 0 · 30 Mar 2024

Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model
Qi Gou, Cam-Tu Nguyen
35 · 8 · 0 · 28 Mar 2024

Fine-Tuning Language Models with Reward Learning on Policy
Hao Lang, Fei Huang, Yongbin Li
ALM
45 · 5 · 0 · 28 Mar 2024

sDPO: Don't Use Your Data All at Once
Dahyun Kim, Yungi Kim, Wonho Song, Hyeonwoo Kim, Yunsu Kim, Sanghoon Kim, Chanjun Park
36 · 31 · 0 · 28 Mar 2024

Disentangling Length from Quality in Direct Preference Optimization
Ryan Park, Rafael Rafailov, Stefano Ermon, Chelsea Finn
ALM
56 · 112 · 0 · 28 Mar 2024

STaR-GATE: Teaching Language Models to Ask Clarifying Questions
Chinmaya Andukuri, Jan-Philipp Fränken, Tobias Gerstenberg, Noah D. Goodman
SyDa, LRM
50 · 32 · 0 · 28 Mar 2024

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models
Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, ..., Nicolas Flammarion, George J. Pappas, F. Tramèr, Hamed Hassani, Eric Wong
ALM, ELM, AAML
57 · 101 · 0 · 28 Mar 2024

What are human values, and how do we align AI to them?
Oliver Klingefjord, Ryan Lowe, Joe Edelman
38 · 22 · 0 · 27 Mar 2024

Understanding the Learning Dynamics of Alignment with Human Feedback
Shawn Im, Yixuan Li
ALM
37 · 11 · 0 · 27 Mar 2024

Safe and Robust Reinforcement Learning: Principles and Practice
Taku Yamagata, Raúl Santos-Rodríguez
OffRL
48 · 2 · 0 · 27 Mar 2024

Improving Attributed Text Generation of Large Language Models via Preference Learning
Dongfang Li, Zetian Sun, Baotian Hu, Zhenyu Liu, Xinshuo Hu, Xuebo Liu, Min Zhang
53 · 13 · 0 · 27 Mar 2024

Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback
Hongshen Xu, Zichen Zhu, Situo Zhang, Da Ma, Shuai Fan, Lu Chen, Kai Yu
HILM
41 · 35 · 0 · 27 Mar 2024

Assessment of Multimodal Large Language Models in Alignment with Human Values
Zhelun Shi, Zhipin Wang, Hongxing Fan, Zaibin Zhang, Lijun Li, Yongting Zhang, Zhen-fei Yin, Lu Sheng, Yu Qiao, Jing Shao
47 · 16 · 0 · 26 Mar 2024

MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models
Kailai Yang, Zhiwei Liu, Qianqian Xie, Jimin Huang, Tianlin Zhang, Sophia Ananiadou
37 · 15 · 0 · 25 Mar 2024

CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment
Feiteng Fang, Liang Zhu, Min Yang, Xi Feng, Jinchang Hou, Qixuan Zhao, Chengming Li, Xiping Hu, Ruifeng Xu
32 · 0 · 0 · 25 Mar 2024

Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization
Xiangxin Zhou, Dongyu Xue, Ruizhe Chen, Zaixiang Zheng, Liang Wang, Quanquan Gu
DiffM
65 · 20 · 0 · 25 Mar 2024

Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models
Minchan Kim, Minyeong Kim, Junik Bae, Suhwan Choi, Sungkyung Kim, Buru Chang
VLM
32 · 4 · 0 · 24 Mar 2024

WangchanLion and WangchanX MRC Eval
Wannaphong Phatthiyaphaibun, Surapon Nonesung, Patomporn Payoungkhamdee, Peerat Limkonchotiwat, Can Udomcharoenchaikit, Jitkapat Sawatphol, Chompakorn Chaksangchaichot, Ekapol Chuangsuwanich, Sarana Nutanong
62 · 0 · 0 · 24 Mar 2024

The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization
Shengyi Huang, Michael Noukhovitch, Arian Hosseini, Kashif Rasul, Weixun Wang, Lewis Tunstall
VLM
30 · 31 · 0 · 24 Mar 2024

FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn J Lawrie, Luca Soldaini
63 · 30 · 0 · 22 Mar 2024

DreamReward: Text-to-3D Generation with Human Preference
Junliang Ye, Fangfu Liu, Qixiu Li, Zhengyi Wang, Yikai Wang, Xinzhou Wang, Yueqi Duan, Jun Zhu
74 · 21 · 0 · 21 Mar 2024

ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy
Zonghan Yang, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu
LLMAG, LRM
57 · 9 · 0 · 21 Mar 2024

Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference
Han Zhao, Min Zhang, Wei Zhao, Pengxiang Ding, Siteng Huang, Donglin Wang
Mamba
57 · 69 · 0 · 21 Mar 2024

Detoxifying Large Language Models via Knowledge Editing
Meng Wang, Ningyu Zhang, Ziwen Xu, Zekun Xi, Shumin Deng, Yunzhi Yao, Qishen Zhang, Linyi Yang, Jindong Wang, Huajun Chen
KELM
48 · 57 · 0 · 21 Mar 2024

Reinforcement Learning from Reflective Feedback (RLRF): Aligning and Improving LLMs via Fine-Grained Self-Reflection
Kyungjae Lee, Dasol Hwang, Sunghyun Park, Youngsoo Jang, Moontae Lee
48 · 8 · 0 · 21 Mar 2024

Improving the Robustness of Large Language Models via Consistency Alignment
Zhao Yukun, Lingyong Yan, Weiwei Sun, Guoliang Xing, Shuaiqiang Wang, Meng Chong, Zhicong Cheng, Zhaochun Ren, Yin Dawei
35 · 19 · 0 · 21 Mar 2024

Multi-Level Feedback Generation with Large Language Models for Empowering Novice Peer Counselors
Alicja Chaszczewicz, Raj Sanjay Shah, Ryan Louie, B. Arnow, Robert E. Kraut, Diyi Yang
OffRL
36 · 9 · 0 · 21 Mar 2024

Multi-Modal Hallucination Control by Visual Information Grounding
Alessandro Favero, L. Zancato, Matthew Trager, Siddharth Choudhary, Pramuditha Perera, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto
MLLM
90 · 63 · 0 · 20 Mar 2024

Testing the Limits of Jailbreaking Defenses with the Purple Problem
Taeyoun Kim, Suhas Kotha, Aditi Raghunathan
AAML
49 · 6 · 0 · 20 Mar 2024

RewardBench: Evaluating Reward Models for Language Modeling
Nathan Lambert, Valentina Pyatkin, Jacob Morrison, Lester James V. Miranda, Bill Yuchen Lin, ..., Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, Hanna Hajishirzi
ALM
85 · 220 · 0 · 20 Mar 2024

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Yaowei Zheng, Richong Zhang, Junhao Zhang, Yanhan Ye, Zheyan Luo, Zhangchi Feng, Yongqiang Ma
55 · 401 · 0 · 20 Mar 2024

AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation
Jingkun An, Yinghao Zhu, Zongjian Li, Haoran Feng, Bohua Chen, Yemin Shi, Chengwei Pan
43 · 2 · 0 · 20 Mar 2024

Hyacinth6B: A large language model for Traditional Chinese
Chih-Wei Song, Yin-Te Tsai
37 · 0 · 0 · 20 Mar 2024

Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard, Shamane Siriwardhana, Malikeh Ehghaghi, Luke Meyers, Vladimir Karpukhin, Brian Benedict, Mark McQuade, Jacob Solawetz
MoMe, KELM
90 · 86 · 0 · 20 Mar 2024

Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model
Peng Zhou, Jianmin Wang, Chunyan Li, Zixu Wang, Yiping Liu, ..., Xibao Cai, Houtim Lai, Wei Liu, Longyue Wang, Xiangxiang Zeng
26 · 0 · 0 · 20 Mar 2024

RouterBench: A Benchmark for Multi-LLM Routing System
Qitian Jason Hu, Jacob Bieker, Xiuyu Li, Nan Jiang, Benjamin Keigwin, Gaurav Ranganath, Kurt Keutzer, Shriyash Kaustubh Upadhyay
54 · 38 · 0 · 18 Mar 2024

Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
Axel Sauer, Frederic Boesel, Tim Dockhorn, A. Blattmann, Patrick Esser, Robin Rombach
DiffM
55 · 109 · 0 · 18 Mar 2024

Ensuring Safe and High-Quality Outputs: A Guideline Library Approach for Language Models
Yi Luo, Zheng-Wen Lin, Yuhao Zhang, Jiashuo Sun, Chen Lin, Chengjin Xu, Xiangdong Su, Yelong Shen, Jian Guo, Yeyun Gong
LM&MA, ELM, ALM, AI4TS
35 · 1 · 0 · 18 Mar 2024

Beyond Static Evaluation: A Dynamic Approach to Assessing AI Assistants' API Invocation Capabilities
Honglin Mu, Yang Xu, Yunlong Feng, Xiaofeng Han, Yitong Li, Yutai Hou, Wanxiang Che
ELM
28 · 2 · 0 · 17 Mar 2024

Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment
Feifan Song, Bowen Yu, Hao Lang, Haiyang Yu, Fei Huang, Houfeng Wang, Yongbin Li
ALM
45 · 11 · 0 · 17 Mar 2024

Reward Guided Latent Consistency Distillation
Jiachen Li, Weixi Feng, Wenhu Chen, William Y. Wang
EGVM
36 · 11 · 0 · 16 Mar 2024

PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Hakim Sidahmed, Samrat Phatale, Alex Hutcheson, Zhuonan Lin, Zhan Chen, ..., Jessica Hoffmann, Hassan Mansoor, Wei Li, Abhinav Rastogi, Lucas Dixon
38 · 2 · 0 · 15 Mar 2024

RAFT: Adapting Language Model to Domain Specific RAG
Tianjun Zhang, Shishir G. Patil, Naman Jain, Sheng Shen, Matei A. Zaharia, Ion Stoica, Joseph E. Gonzalez
RALM
39 · 182 · 0 · 15 Mar 2024