Direct Preference Optimization: Your Language Model is Secretly a Reward Model

29 May 2023
Rafael Rafailov, Archit Sharma, E. Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn
Tags: ALM
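
For context, the core objective from the cited paper, which many of the works listed below build on or modify: DPO fine-tunes the policy directly on preference pairs, with no separately trained reward model,

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log\sigma\!\left(\beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)} - \beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}\right)\right],
$$

where $y_w$ and $y_l$ are the preferred and dispreferred responses to prompt $x$, $\pi_{\mathrm{ref}}$ is a frozen reference policy, and $\beta$ controls the implicit KL penalty.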

Papers citing "Direct Preference Optimization: Your Language Model is Secretly a Reward Model"

50 of 2,637 citing papers shown:

BiMediX: Bilingual Medical Mixture of Experts LLM
Sara Pieri, Sahal Shaji Mullappilly, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan, Timothy Baldwin, Hisham Cholakkal
Tags: LM&MA
43 · 12 · 0 · 20 Feb 2024

Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive
Arka Pal, Deep Karkhanis, Samuel Dooley, Manley Roberts, Siddartha Naidu, Colin White
Tags: OSLM
46 · 129 · 0 · 20 Feb 2024

Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation
Dongjin Kang, Sunghwan Kim, Taeyoon Kwon, Seungjun Moon, Hyunsouk Cho, Youngjae Yu, Dongha Lee, Jinyoung Yeo
42 · 17 · 0 · 20 Feb 2024

A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, Jinyang Li, Can Xu, Dacheng Tao, Dinesh Manocha
Tags: KELM, VLM
46 · 104 · 0 · 20 Feb 2024

TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification
Martin Gubri, Dennis Ulmer, Hwaran Lee, Sangdoo Yun, Seong Joon Oh
Tags: SILM
396 · 5 · 1 · 20 Feb 2024

GlórIA -- A Generative and Open Large Language Model for Portuguese
Ricardo Lopes, João Magalhães, David Semedo
37 · 8 · 0 · 20 Feb 2024

Incentive Compatibility for AI Alignment in Sociotechnical Systems: Positions and Prospects
Zhaowei Zhang, Fengshuo Bai, Mingzhi Wang, Haoyang Ye, Chengdong Ma, Yaodong Yang
35 · 4 · 0 · 20 Feb 2024

Instruction-tuned Language Models are Better Knowledge Learners
Zhengbao Jiang, Zhiqing Sun, Weijia Shi, Pedro Rodriguez, Chunting Zhou, Graham Neubig, Xi Lin, Wen-tau Yih, Srinivasan Iyer
Tags: KELM
46 · 34 · 0 · 20 Feb 2024

FormulaReasoning: A Dataset for Formula-Based Numerical Reasoning
Xiao Li, Bolin Zhu, Kaiwen Shi, Sichen Liu, Yin Zhu, Yiwei Liu, Gong Cheng
Tags: AIMat
42 · 0 · 0 · 20 Feb 2024

GenAudit: Fixing Factual Errors in Language Model Outputs with Evidence
Kundan Krishna, S. Ramprasad, Prakhar Gupta, Byron C. Wallace, Zachary Chase Lipton, Jeffrey P. Bigham
Tags: HILM, KELM, SyDa
57 · 9 · 0 · 19 Feb 2024

A Critical Evaluation of AI Feedback for Aligning Large Language Models
Archit Sharma, Sedrick Scott Keh, Eric Mitchell, Chelsea Finn, Kushal Arora, Thomas Kollar
Tags: ALM, LLMAG
29 · 23 · 0 · 19 Feb 2024

Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
Zhanhui Zhou, Jie Liu, Zhichen Dong, Jiaheng Liu, Chao Yang, Wanli Ouyang, Yu Qiao
15 · 17 · 0 · 19 Feb 2024

Stick to Your Role! Context-dependence and Stability of Personal Value Expression in Large Language Models
Grgur Kovač, Rémy Portelas, Masataka Sawayama, Peter Ford Dominey, Pierre-Yves Oudeyer
Tags: LLMAG
29 · 5 · 0 · 19 Feb 2024

Amplifying Training Data Exposure through Fine-Tuning with Pseudo-Labeled Memberships
Myung Gyo Oh, Hong Eun Ahn, L. Park, T.-H. Kwon
Tags: MIALM, AAML
37 · 0 · 0 · 19 Feb 2024

Transformer-based Causal Language Models Perform Clustering
Xinbo Wu, Lav Varshney
29 · 6 · 0 · 19 Feb 2024

Your Large Language Model is Secretly a Fairness Proponent and You Should Prompt it Like One
Tianlin Li, Xiaoyu Zhang, Chao Du, Tianyu Pang, Qian Liu, Qing Guo, Chao Shen, Yang Liu
Tags: ALM
45 · 11 · 0 · 19 Feb 2024

Enabling Weak LLMs to Judge Response Reliability via Meta Ranking
Zijun Liu, Boqun Kou, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu
37 · 2 · 0 · 19 Feb 2024

Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation
Aiwei Liu, Haoping Bai, Zhiyun Lu, Xiang Kong, Simon Wang, Jiulong Shan, Mengsi Cao, Lijie Wen
Tags: ALM
42 · 12 · 0 · 19 Feb 2024

Learning to Edit: Aligning LLMs with Knowledge Editing
Yuxin Jiang, Yufei Wang, Chuhan Wu, Wanjun Zhong, Xingshan Zeng, ..., Xin Jiang, Lifeng Shang, Ruiming Tang, Qun Liu, Wei Wang
Tags: KELM
35 · 23 · 0 · 19 Feb 2024

FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation
Shuai Wang, Ekaterina Khramtsova, Shengyao Zhuang, Guido Zuccon
34 · 11 · 0 · 19 Feb 2024

NOTE: Notable generation Of patient Text summaries through Efficient approach based on direct preference optimization
Imjin Ahn, Hansle Gwon, Young-Hak Kim, Tae Joon Jun, Sanghyun Park
48 · 3 · 0 · 19 Feb 2024

Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversational Search
Chanwoong Yoon, Gangwoo Kim, Byeongguk Jeon, Sungdong Kim, Yohan Jo, Jaewoo Kang
Tags: RALM, KELM
44 · 13 · 0 · 19 Feb 2024

FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema
Junru Lu, Siyu An, Min Zhang, Yulan He, Di Yin, Xing Sun
58 · 2 · 0 · 19 Feb 2024

Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic
Rishabh Bhardwaj, Do Duc Anh, Soujanya Poria
Tags: MoMe
50 · 38 · 0 · 19 Feb 2024

One Prompt To Rule Them All: LLMs for Opinion Summary Evaluation
Tejpalsingh Siledar, Swaroop Nath, Sankara Sri Raghava Ravindra Muddu, Rupasai Rangaraju, Swaprava Nath, ..., Suman Banerjee, Amey Patil, Sudhanshu Singh, M. Chelliah, Nikesh Garera
Tags: ALM, LRM
41 · 7 · 0 · 18 Feb 2024

Knowledge-to-SQL: Enhancing SQL Generation with Data Expert LLM
Zijin Hong, Zheng Yuan, Hao Chen, Qinggang Zhang, Feiran Huang, Xiao Huang
46 · 24 · 0 · 18 Feb 2024

Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
Yiyang Zhou, Chenhang Cui, Rafael Rafailov, Chelsea Finn, Huaxiu Yao
Tags: VLM, MLLM
43 · 89 · 0 · 18 Feb 2024

Self-seeding and Multi-intent Self-instructing LLMs for Generating Intent-aware Information-Seeking dialogs
Arian Askari, Roxana Petcu, Chuan Meng, Mohammad Aliannejadi, Amin Abolghasemi, Evangelos Kanoulas, Suzan Verberne
28 · 9 · 0 · 18 Feb 2024

Dissecting Human and LLM Preferences
Junlong Li, Fan Zhou, Shichao Sun, Yikai Zhang, Hai Zhao, Pengfei Liu
Tags: ALM
36 · 5 · 0 · 17 Feb 2024

Aligning Large Language Models by On-Policy Self-Judgment
Sangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu
Tags: OSLM
33 · 11 · 0 · 17 Feb 2024

KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
Yougang Lyu, Lingyong Yan, Shuaiqiang Wang, Haibo Shi, Dawei Yin, Pengjie Ren, Zhumin Chen, Maarten de Rijke, Zhaochun Ren
29 · 5 · 0 · 17 Feb 2024

Orca-Math: Unlocking the potential of SLMs in Grade School Math
Arindam Mitra, Hamed Khanpour, Corby Rosset, Ahmed Hassan Awadallah
Tags: ALM, MoE, LRM
40 · 65 · 0 · 16 Feb 2024

Model Editing by Standard Fine-Tuning
G. Gangadhar, Karl Stratos
Tags: KELM
48 · 11 · 0 · 16 Feb 2024

RLVF: Learning from Verbal Feedback without Overgeneralization
Moritz Stephan, Alexander Khazatsky, Eric Mitchell, Annie S. Chen, Sheryl Hsu, Archit Sharma, Chelsea Finn
47 · 12 · 0 · 16 Feb 2024

Multi-modal preference alignment remedies regression of visual instruction tuning on language model
Shengzhi Li, Rongyu Lin, Shichao Pei
45 · 22 · 0 · 16 Feb 2024

Direct Preference Optimization with an Offset
Afra Amini, Tim Vieira, Ryan Cotterell
73 · 55 · 0 · 16 Feb 2024

Strong hallucinations from negation and how to fix them
Nicholas Asher, Swarnadeep Bhar
Tags: ReLM, LRM
45 · 4 · 0 · 16 Feb 2024

Active Preference Optimization for Sample Efficient RLHF
Nirjhar Das, Souradip Chakraborty, Aldo Pacchiano, Sayak Ray Chowdhury
36 · 14 · 0 · 16 Feb 2024

FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models
Gagan Bhatia, El Moatez Billah Nagoudi, Hasan Cavusoglu, Muhammad Abdul-Mageed
Tags: AIFin
37 · 18 · 0 · 16 Feb 2024

DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
Ajay Patel, Colin Raffel, Chris Callison-Burch
Tags: SyDa, AI4CE
38 · 25 · 0 · 16 Feb 2024

BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains
Yanis Labrak, Adrien Bazoge, Emmanuel Morin, P. Gourraud, Mickael Rouvier, Richard Dufour
111 · 197 · 0 · 15 Feb 2024

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation
Huizhuo Yuan, Zixiang Chen, Kaixuan Ji, Quanquan Gu
65 · 24 · 0 · 15 Feb 2024

Recovering the Pre-Fine-Tuning Weights of Generative Models
Eliahu Horwitz, Jonathan Kahana, Yedid Hoshen
50 · 10 · 0 · 15 Feb 2024

Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen
103 · 69 · 0 · 15 Feb 2024

Reward Generalization in RLHF: A Topological Perspective
Tianyi Qiu, Fanzhi Zeng, Jiaming Ji, Dong Yan, Kaile Wang, Jiayi Zhou, Yang Han, Josef Dai, Xuehai Pan, Yaodong Yang
Tags: AI4CE
37 · 4 · 0 · 15 Feb 2024

RS-DPO: A Hybrid Rejection Sampling and Direct Preference Optimization Method for Alignment of Large Language Models
Saeed Khaki, JinJin Li, Lan Ma, Liu Yang, Prathap Ramachandra
31 · 19 · 0 · 15 Feb 2024

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent
Quentin Gallouedec, E. Beeching, Clément Romac, Emmanuel Dellandrea
37 · 11 · 0 · 15 Feb 2024

Aligning Crowd Feedback via Distributional Preference Reward Modeling
Dexun Li, Cong Zhang, Kuicai Dong, Derrick-Goh-Xin Deik, Ruiming Tang, Yong Liu
21 · 15 · 0 · 15 Feb 2024

ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
Feifan Song, Yuxuan Fan, Xin Zhang, Peiyi Wang, Houfeng Wang
32 · 8 · 0 · 14 Feb 2024

Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
Zhichen Dong, Zhanhui Zhou, Chao Yang, Jing Shao, Yu Qiao
Tags: ELM
52 · 58 · 0 · 14 Feb 2024