Fine-Tuning Language Models from Human Preferences

18 September 2019
Daniel M. Ziegler, Nisan Stiennon, Jeff Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, G. Irving
ALM

Papers citing "Fine-Tuning Language Models from Human Preferences"

50 / 1,265 papers shown

BSharedRAG: Backbone Shared Retrieval-Augmented Generation for the E-commerce Domain
Kaisi Guan, Qian Cao, Yuchong Sun, Xiting Wang, Ruihua Song
30 Sep 2024

PersonalLLM: Tailoring LLMs to Individual Preferences
Thomas P. Zollo, Andrew Siah, Naimeng Ye, Ang Li, Hongseok Namkoong
30 Sep 2024

The Crucial Role of Samplers in Online Direct Preference Optimization
Ruizhe Shi, Runlong Zhou, Simon S. Du
29 Sep 2024

Model-based Preference Optimization in Abstractive Summarization without Human Feedback
Jaepill Choi, Kyubyung Chae, Jiwoo Song, Yohan Jo, Taesup Kim
27 Sep 2024

Evaluation of Large Language Models for Summarization Tasks in the Medical Domain: A Narrative Review
Emma Croxford, Yanjun Gao, Nicholas Pellegrino, Karen K. Wong, Graham Wills, Elliot First, Frank J. Liao, Cherodeep Goswami, Brian Patterson, Majid Afshar
HILM, ELM, LM&MA
26 Sep 2024

Inference-Time Language Model Alignment via Integrated Value Guidance
Zhixuan Liu, Zhanhui Zhou, Yuanfu Wang, Chao Yang, Yu Qiao
26 Sep 2024

Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness
Jian Li, Haojing Huang, Yujia Zhang, Pengfei Xu, Xi Chen, Rui Song, Lida Shi, Jingwen Wang, Hao Xu
26 Sep 2024

Just Say What You Want: Only-prompting Self-rewarding Online Preference Optimization
Ruijie Xu, Zhihan Liu, Yongfei Liu, Shipeng Yan, Zhaoran Wang, Zhi-Li Zhang, Xuming He
ALM
26 Sep 2024

Cross-lingual Human-Preference Alignment for Neural Machine Translation with Direct Quality Optimization
Kaden Uhlig, Joern Wuebker, Raphael Reinauer, John DeNero
26 Sep 2024

SECURE: Semantics-aware Embodied Conversation under Unawareness for Lifelong Robot Learning
Rimvydas Rubavicius, Peter David Fagan, A. Lascarides, Subramanian Ramamoorthy
LM&Ro
26 Sep 2024

Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference
Qining Zhang, Lei Ying
OffRL
25 Sep 2024

Uncovering Latent Chain of Thought Vectors in Language Models
Jason Zhang, Scott Viteri
LLMSV, LRM
21 Sep 2024

RRM: Robust Reward Model Training Mitigates Reward Hacking
Tianqi Liu, Wei Xiong, Jie Jessie Ren, Lichang Chen, Junru Wu, ..., Yuan Liu, Bilal Piot, Abe Ittycheriah, Aviral Kumar, Mohammad Saleh
AAML
20 Sep 2024

Aligning Language Models Using Follow-up Likelihood as Reward Signal
Chen Zhang, Dading Chong, Feng Jiang, Chengguang Tang, Anningzhe Gao, Guohua Tang, Haizhou Li
ALM
20 Sep 2024

LLMR: Knowledge Distillation with a Large Language Model-Induced Reward
Dongheng Li, Yongchang Hao, Lili Mou
19 Sep 2024

Autoformalization of Game Descriptions using Large Language Models
Agnieszka Mensfelt, Kostas Stathis, Vince Trencsenyi
OffRL, AI4CE, LRM
18 Sep 2024

Reward-Robust RLHF in LLMs
Yuzi Yan, Xingzhou Lou, Jialian Li, Yiping Zhang, Jian Xie, Chao Yu, Yu Wang, Dong Yan, Yuan Shen
18 Sep 2024

From Lists to Emojis: How Format Bias Affects Model Alignment
Xuanchang Zhang, Wei Xiong, Lichang Chen, Dinesh Manocha, Heng Huang, Tong Zhang
ALM
18 Sep 2024

CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration
Jiahui Gao, Renjie Pi, Tianyang Han, Han Wu, Lanqing Hong, Lingpeng Kong, Xin Jiang, Zhenguo Li
17 Sep 2024

Quantile Regression for Distributional Reward Models in RLHF
Nicolai Dorka
16 Sep 2024

Generalizing Alignment Paradigm of Text-to-Image Generation with Preferences through $f$-divergence Minimization
Haoyuan Sun, Bo Xia, Yongzhe Chang, Xueqian Wang
EGVM
15 Sep 2024

Your Weak LLM is Secretly a Strong Teacher for Alignment
Leitian Tao, Yixuan Li
13 Sep 2024

Semi-Supervised Reward Modeling via Iterative Self-Training
Yifei He, Haoxiang Wang, Ziyan Jiang, Alexandros Papangelis, Han Zhao
OffRL
10 Sep 2024

KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation
Lei Liang, Mengshu Sun, Zhengke Gui, Zhongshu Zhu, Zhouyu Jiang, ..., Qing Cui, Wen Zhang, Huajun Chen, Wenguang Chen, Jun Zhou
10 Sep 2024

MemoVis: A GenAI-Powered Tool for Creating Companion Reference Images for 3D Design Feedback
Chen Chen, Cuong Nguyen, Thibault Groueix, Vladimir G. Kim, Nadir Weibel
DiffM
09 Sep 2024

Sparse Rewards Can Self-Train Dialogue Agents
B. Lattimer, Varun Gangal, Ryan McDonald, Yi Yang
LLMAG
06 Sep 2024

Beyond Following: Mixing Active Initiative into Computational Creativity
Zhiyu Lin, Upol Ehsan, Rohan Agarwal, Samihan Dani, Vidushi Vashishth, Mark O. Riedl
06 Sep 2024

User-Driven Value Alignment: Understanding Users' Perceptions and Strategies for Addressing Biased and Discriminatory Statements in AI Companions
Xianzhe Fan, Qing Xiao, Xuhui Zhou, Jiaxin Pei, Maarten Sap, Zhicong Lu, Hong Shen
01 Sep 2024

MaFeRw: Query Rewriting with Multi-Aspect Feedbacks for Retrieval-Augmented Large Language Models
Yujing Wang, Hainan Zhang, Liang Pang, Hongwei Zheng, Zhiming Zheng
30 Aug 2024

A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models
Yi-Lin Tuan, William Yang Wang
29 Aug 2024

A Statistical Framework for Data-dependent Retrieval-Augmented Models
Soumya Basu, A. S. Rawat, Manzil Zaheer
RALM
27 Aug 2024

Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models
Ned Cooper, Alexandra Zafiroglu
27 Aug 2024

How will advanced AI systems impact democracy?
Christopher Summerfield, Lisa Argyle, Michiel Bakker, Teddy Collins, Esin Durmus, ..., Elizabeth Seger, Divya Siddarth, Henrik Skaug Sætra, MH Tessler, M. Botvinick
27 Aug 2024

Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization
Nicholas Moratelli, Davide Caffagni, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CLIP
26 Aug 2024

LLM-PBE: Assessing Data Privacy in Large Language Models
Qinbin Li, Junyuan Hong, Chulin Xie, Jeffrey Tan, Rachel Xin, ..., Dan Hendrycks, Zhangyang Wang, Bo Li, Bingsheng He, Dawn Song
ELM, PILM
23 Aug 2024

FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation
Kashun Shum, Minrui Xu, Jianshu Zhang, Zixin Chen, Shizhe Diao, Hanze Dong, Jipeng Zhang, Muhammad Omer Raza
22 Aug 2024

Advances in Preference-based Reinforcement Learning: A Review
Youssef Abdelkareem, Shady Shehata, Fakhri Karray
OffRL
21 Aug 2024

Epistemic Injustice in Generative AI
Jackie Kay, Atoosa Kasirzadeh, Shakir Mohamed
AILaw
21 Aug 2024

RePair: Automated Program Repair with Process-based Feedback
Yuze Zhao, Zhenya Huang, Yixiao Ma, Rui Li, Kai Zhang, Hao Jiang, Qi Liu, Linbo Zhu, Yu Su
KELM
21 Aug 2024

Fine-Tuning a Local LLaMA-3 Large Language Model for Automated Privacy-Preserving Physician Letter Generation in Radiation Oncology
Yihao Hou, Christoph Bert, A. Gomaa, G. Lahmer, D. Hoefler, ..., S. Semrau, Andreas Maier, R. Fietkau, Yixing Huang, F. Putz
LM&MA
20 Aug 2024

Minor SFT loss for LLM fine-tune to increase performance and reduce model deviation
Shiming Xie, Hong Chen, Fred Yu, Zeye Sun, Xiuyu Wu
20 Aug 2024

CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs
Yassine Ouali, Adrian Bulat, Brais Martínez, Georgios Tzimiropoulos
VLM, MLLM
19 Aug 2024

Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning
S. Poddar, Yanming Wan, Hamish Ivison, Abhishek Gupta, Natasha Jaques
19 Aug 2024

Minor DPO reject penalty to increase training robustness
Shiming Xie, Hong Chen, Fred Yu, Zeye Sun, Xiuyu Wu, Yingfan Hu
19 Aug 2024

Paired Completion: Flexible Quantification of Issue-framing at Scale with LLMs
Simon D Angus, Lachlan O'Neill
19 Aug 2024

Zero-Shot Object-Centric Representation Learning
Aniket Didolkar, Andrii Zadaianchuk, Anirudh Goyal, Mike Mozer, Yoshua Bengio, Georg Martius, Maximilian Seitzer
VLM, OCL
17 Aug 2024

SEAL: Systematic Error Analysis for Value ALignment
Manon Revel, Matteo Cargnelutti, Tyna Eloundou, Greg Leppert
16 Aug 2024

What should I wear to a party in a Greek taverna? Evaluation for Conversational Agents in the Fashion Domain
Antonis Maronikolakis, Ana Peleteiro Ramallo, Weiwei Cheng, Thomas Kober
LLMAG
13 Aug 2024

On the Generalization of Preference Learning with DPO
Shawn Im, Yixuan Li
06 Aug 2024

KnowPO: Knowledge-aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models
Ruizhe Zhang, Yongxin Xu, Yuzhen Xiao, Runchuan Zhu, Xinke Jiang, Xu Chu, Junfeng Zhao, Yasha Wang
06 Aug 2024