ResearchTrend.AI

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

29 May 2023
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
    ALM

Papers citing "Direct Preference Optimization: Your Language Model is Secretly a Reward Model"

50 / 2,637 papers shown
LLM360: Towards Fully Transparent Open-Source LLMs
Zhengzhong Liu
Aurick Qiao
Willie Neiswanger
Hongyi Wang
Bowen Tan
...
Zhiting Hu
Mark Schulze
Preslav Nakov
Timothy Baldwin
Eric Xing
51
70
0
11 Dec 2023
Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
O. Ovadia
Menachem Brief
Moshik Mishaeli
Oren Elisha
RALM
40
134
0
10 Dec 2023
NLLG Quarterly arXiv Report 09/23: What are the most influential current AI Papers?
Ran Zhang
Aida Kostikova
Christoph Leiter
Jonas Belouadi
Daniil Larionov
Yanran Chen
Vivian Fresen
Steffen Eger
53
0
0
09 Dec 2023
Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use
Yuhan Chen
Ang Lv
Ting-En Lin
C. Chen
Yuchuan Wu
Fei Huang
Yongbin Li
Rui Yan
34
24
0
07 Dec 2023
Large Language Models on Graphs: A Comprehensive Survey
Bowen Jin
Gang Liu
Chi Han
Meng Jiang
Heng Ji
Jiawei Han
AI4CE
44
141
0
05 Dec 2023
ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference
Tianchi Cai
Xierui Song
Jiyan Jiang
Fei Teng
Jinjie Gu
Guannan Zhang
ALM
21
4
0
05 Dec 2023
RLHF and IIA: Perverse Incentives
Wanqiao Xu
Shi Dong
Xiuyuan Lu
Grace Lam
Zheng Wen
Benjamin Van Roy
37
2
0
02 Dec 2023
Eliciting Latent Knowledge from Quirky Language Models
Alex Troy Mallen
Madeline Brumley
Julia Kharchenko
Nora Belrose
HILM
RALM
KELM
24
25
0
02 Dec 2023
Nash Learning from Human Feedback
Rémi Munos
Michal Valko
Daniele Calandriello
M. G. Azar
Mark Rowland
...
Nikola Momchev
Olivier Bachem
D. Mankowitz
Doina Precup
Bilal Piot
42
126
0
01 Dec 2023
SeaLLMs -- Large Language Models for Southeast Asia
Xuan-Phi Nguyen
Wenxuan Zhang
Xin Li
Mahani Aljunied
Zhiqiang Hu
...
Yue Deng
Sen Yang
Chaoqun Liu
Hang Zhang
Li Bing
LRM
34
74
0
01 Dec 2023
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
M. Steyvers
Yuan Yao
Haoye Zhang
Taiwen He
Yifeng Han
...
Xinyue Hu
Zhiyuan Liu
Hai-Tao Zheng
Maosong Sun
Tat-Seng Chua
MLLM
VLM
150
182
0
01 Dec 2023
Sample Efficient Preference Alignment in LLMs via Active Exploration
Viraj Mehta
Vikramjeet Das
Ojash Neopane
Yijia Dai
Ilija Bogunovic
Willie Neiswanger
Stefano Ermon
Jeff Schneider
OffRL
38
12
0
01 Dec 2023
AlignBench: Benchmarking Chinese Alignment of Large Language Models
Xiao Liu
Xuanyu Lei
Sheng-Ping Wang
Yue Huang
Zhuoer Feng
...
Hongning Wang
Jing Zhang
Minlie Huang
Yuxiao Dong
Jie Tang
ELM
LM&MA
ALM
125
43
0
30 Nov 2023
TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis
Ali Najafi
Onur Varol
VLM
29
12
0
29 Nov 2023
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
Hailin Chen
Fangkai Jiao
Xingxuan Li
Chengwei Qin
Mathieu Ravaut
Ruochen Zhao
Caiming Xiong
Chenyu You
ELM
CLL
AI4MH
LRM
ALM
85
27
0
28 Nov 2023
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
Zhiyuan Zhao
Bin Wang
Linke Ouyang
Xiao-wen Dong
Jiaqi Wang
Conghui He
MLLM
VLM
48
106
0
28 Nov 2023
Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld
Yijun Yang
Tianyi Zhou
Kanxue Li
Dapeng Tao
Lusong Li
Li Shen
Xiaodong He
Jing Jiang
Yuhui Shi
LLMAG
LM&Ro
35
35
0
28 Nov 2023
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models
Samuele Poppi
Tobia Poppi
Federico Cocchi
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
27
9
0
27 Nov 2023
Data-Efficient Alignment of Large Language Models with Human Feedback Through Natural Language
Di Jin
Shikib Mehri
Devamanyu Hazarika
Aishwarya Padmakumar
Sungjin Lee
Yang Liu
Mahdi Namazifar
ALM
26
16
0
24 Nov 2023
A density estimation perspective on learning from pairwise human preferences
Vincent Dumoulin
Daniel D. Johnson
Pablo Samuel Castro
Hugo Larochelle
Yann Dauphin
37
12
0
23 Nov 2023
Direct Preference-Based Evolutionary Multi-Objective Optimization with Dueling Bandit
Tian Huang
Ke Li
31
1
0
23 Nov 2023
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
Kai Yang
Jian Tao
Jiafei Lyu
Chunjiang Ge
Jiaxin Chen
Qimai Li
Weihan Shen
Xiaolong Zhu
Xiu Li
EGVM
23
89
0
22 Nov 2023
Diffusion Model Alignment Using Direct Preference Optimization
Bram Wallace
Meihua Dang
Rafael Rafailov
Linqi Zhou
Aaron Lou
Senthil Purushwalkam
Stefano Ermon
Caiming Xiong
Chenyu You
Nikhil Naik
EGVM
50
229
0
21 Nov 2023
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
Hamish Ivison
Yizhong Wang
Valentina Pyatkin
Nathan Lambert
Matthew E. Peters
...
Joel Jang
David Wadden
Noah A. Smith
Iz Beltagy
Hanna Hajishirzi
ALM
ELM
32
181
0
17 Nov 2023
Characterizing Tradeoffs in Language Model Decoding with Informational Interpretations
Chung-Ching Chang
William W. Cohen
Yun-hsuan Sung
21
0
0
16 Nov 2023
FollowEval: A Multi-Dimensional Benchmark for Assessing the Instruction-Following Capability of Large Language Models
Yimin Jing
Renren Jin
Jiahao Hu
Huishi Qiu
Xiaohua Wang
Peng Wang
Deyi Xiong
LRM
ELM
30
1
0
16 Nov 2023
HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM
Zhilin Wang
Yi Dong
Jiaqi Zeng
Virginia Adams
Makesh Narsimhan Sreedhar
...
Olivier Delalleau
Jane Polak Scowcroft
Neel Kant
Aidan Swope
Oleksii Kuchaiev
3DV
22
67
0
16 Nov 2023
Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections
Yuanpu Cao
Bochuan Cao
Jinghui Chen
34
24
0
15 Nov 2023
When Large Language Models contradict humans? Large Language Models' Sycophantic Behaviour
Leonardo Ranaldi
Giulia Pucci
27
33
0
15 Nov 2023
Grounding Gaps in Language Model Generations
Omar Shaikh
Kristina Gligorić
Ashna Khetan
Matthias Gerstgrasser
Diyi Yang
Dan Jurafsky
29
21
0
15 Nov 2023
Rescue: Ranking LLM Responses with Partial Ordering to Improve Response Generation
Yikun Wang
Rui Zheng
Haoming Li
Qi Zhang
Tao Gui
Fei Liu
OffRL
25
3
0
15 Nov 2023
Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification
Haoqiang Kang
Juntong Ni
Huaxiu Yao
HILM
LRM
32
34
0
15 Nov 2023
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
Zhexin Zhang
Junxiao Yang
Pei Ke
Fei Mi
Hongning Wang
Minlie Huang
AAML
28
116
0
15 Nov 2023
An Empathetic User-Centric Chatbot for Emotional Support
Yanting Pan
Yixuan Tang
Yuchen Niu
26
3
0
15 Nov 2023
Value FULCRA: Mapping Large Language Models to the Multidimensional Spectrum of Basic Human Values
Jing Yao
Xiaoyuan Yi
Xiting Wang
Yifan Gong
Xing Xie
46
23
0
15 Nov 2023
Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models
Keming Lu
Hongyi Yuan
Runji Lin
Junyang Lin
Zheng Yuan
Chang Zhou
Jingren Zhou
MoE
LRM
48
52
0
15 Nov 2023
Safer-Instruct: Aligning Language Models with Automated Preference Data
Taiwei Shi
Kai Chen
Jieyu Zhao
ALM
SyDa
35
21
0
15 Nov 2023
Are You Sure? Challenging LLMs Leads to Performance Drops in The FlipFlop Experiment
Philippe Laban
Lidiya Murakhovs'ka
Caiming Xiong
Chien-Sheng Wu
LRM
26
19
0
14 Nov 2023
Functionality learning through specification instructions
Pedro Henrique Luz de Araujo
Benjamin Roth
ELM
41
0
0
14 Nov 2023
Fine-tuning Language Models for Factuality
Katherine Tian
Eric Mitchell
Huaxiu Yao
Christopher D. Manning
Chelsea Finn
KELM
HILM
SyDa
19
167
0
14 Nov 2023
Predicting Text Preference Via Structured Comparative Reasoning
Jing Nathan Yan
Tianqi Liu
Justin T Chiu
Jiaming Shen
Zhen Qin
...
Charumathi Lakshmanan
Y. Kurzion
Alexander M. Rush
Jialu Liu
Michael Bendersky
LRM
48
7
0
14 Nov 2023
Direct Preference Optimization for Neural Machine Translation with Minimum Bayes Risk Decoding
Guangyu Yang
Jinghong Chen
Weizhe Lin
Bill Byrne
24
20
0
14 Nov 2023
Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game
Pengyu Cheng
Yifan Yang
Jian Li
Yong Dai
Tianhao Hu
Peixin Cao
Nan Du
Xiaolong Li
28
29
0
14 Nov 2023
Coffee: Boost Your Code LLMs by Fixing Bugs with Feedback
Seungjun Moon
Hyungjoo Chae
Yongho Song
Taeyoon Kwon
Dongjin Kang
Kai Tzu-iunn Ong
Seung-won Hwang
Jinyoung Yeo
KELM
23
11
0
13 Nov 2023
Towards the Law of Capacity Gap in Distilling Language Models
Chen Zhang
Dawei Song
Zheyu Ye
Yan Gao
ELM
38
20
0
13 Nov 2023
Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
Yichi Zhang
Zhuo Chen
Yin Fang
Yanxi Lu
Fangming Li
Wen Zhang
Hua-zeng Chen
66
30
0
11 Nov 2023
Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations
Joey Hong
Sergey Levine
Anca Dragan
OffRL
LLMAG
50
24
0
09 Nov 2023
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training
Jiale Cheng
Xiao Liu
Kehan Zheng
Pei Ke
Hongning Wang
Yuxiao Dong
Jie Tang
Minlie Huang
31
79
0
07 Nov 2023
Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment
Geyang Guo
Ranchi Zhao
Tianyi Tang
Wayne Xin Zhao
Ji-Rong Wen
ALM
45
27
0
07 Nov 2023
AI-TA: Towards an Intelligent Question-Answer Teaching Assistant using Open-Source LLMs
Yann Hicke
Anmol Agarwal
Qianou Ma
Paul Denny
AI4Ed
42
24
0
05 Nov 2023