ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Fine-Tuning Language Models from Human Preferences
arXiv:1909.08593, v2 (latest)

18 September 2019
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving

Papers citing "Fine-Tuning Language Models from Human Preferences"

Showing 50 of 1,265 citing papers.
Directly Fine-Tuning Diffusion Models on Differentiable Rewards
Amita Gajewar, Paul Vicol, G. Bansal, David J Fleet (29 Sep 2023)

Human Feedback is not Gold Standard
Tom Hosking, Phil Blunsom, Max Bartolo (28 Sep 2023)

Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints
Chaoqi Wang, Yibo Jiang, Yuguang Yang, Han Liu, Yuxin Chen (28 Sep 2023)

The Trickle-down Impact of Reward (In-)consistency on RLHF
Lingfeng Shen, Sihao Chen, Linfeng Song, Lifeng Jin, Baolin Peng, Haitao Mi, Daniel Khashabi, Dong Yu (28 Sep 2023)

Don't throw away your value model! Generating more preferable text with Value-Guided Monte-Carlo Tree Search decoding
Jiacheng Liu, Andrew Cohen, Ramakanth Pasunuru, Yejin Choi, Hannaneh Hajishirzi, Asli Celikyilmaz (26 Sep 2023)

Large Language Model Alignment: A Survey
Tianhao Shen, Renren Jin, Yufei Huang, Chuang Liu, Weilong Dong, Zishan Guo, Xinwei Wu, Yan Liu, Deyi Xiong (26 Sep 2023)
Fine-tuning and aligning question answering models for complex information extraction tasks
Matthias Engelbach, Dennis Klau, Felix Scheerer, Jens Drawehn, Maximilien Kintz (26 Sep 2023)

Aligning Large Multimodal Models with Factually Augmented RLHF
Zhiqing Sun, Sheng Shen, Shengcao Cao, Haotian Liu, Chunyuan Li, ..., Liangyan Gui, Yu-Xiong Wang, Yiming Yang, Kurt Keutzer, Trevor Darrell (25 Sep 2023)

Adapt then Unlearn: Exploring Parameter Space Semantics for Unlearning in Generative Adversarial Networks
Piyush Tiwary, Atri Guha, Subhodip Panda, Prathosh A.P. (25 Sep 2023)

Natural Language based Context Modeling and Reasoning for Ubiquitous Computing with Large Language Models: A Tutorial
Haoyi Xiong, Jiang Bian, Sijia Yang, Xiaofei Zhang, Linghe Kong, Daqing Zhang (24 Sep 2023)

Probing the Moral Development of Large Language Models through Defining Issues Test
Kumar Tanmay, Aditi Khandelwal, Utkarsh Agarwal, Monojit Choudhury (23 Sep 2023)

From Text to Source: Results in Detecting Large Language Model-Generated Content
Wissam Antoun, Benoît Sagot, Djamé Seddah (23 Sep 2023)

Goal-Oriented Prompt Attack and Safety Evaluation for LLMs
Chengyuan Liu, Fubang Zhao, Lizhi Qing, Yangyang Kang, Changlong Sun, Kun Kuang, Leilei Gan (21 Sep 2023)
SCREWS: A Modular Framework for Reasoning with Revisions
K. Shridhar, Harsh Jhamtani, Hao Fang, Benjamin Van Durme, Jason Eisner, Patrick Xia (20 Sep 2023)

GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu, Xingwei Lin, Zheng Yu, Xinyu Xing (19 Sep 2023)

Human-AI Interactions and Societal Pitfalls
Francisco Castro, Jian Gao, Sébastien Martin (19 Sep 2023)

Understanding Catastrophic Forgetting in Language Models via Implicit Inference
Suhas Kotha, Jacob Mitchell Springer, Aditi Raghunathan (18 Sep 2023)

Automatic Personalized Impression Generation for PET Reports Using Large Language Models
Xin Tie, Muheon Shin, Ali Pirasteh, Nevein Ibrahim, Zachary Huemann, ..., K. M. Kelly, John W. Garrett, Junjie Hu, Steve Y. Cho, Tyler Bradshaw (18 Sep 2023)

Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF
Simeng Sun, Dhawal Gupta, Mohit Iyyer (16 Sep 2023)

Chain-of-Thought Reasoning is a Policy Improvement Operator
Hugh Zhang, David C. Parkes (15 Sep 2023)

RAIN: Your Language Models Can Align Themselves without Finetuning
Yuhui Li, Fangyun Wei, Jinjing Zhao, Chao Zhang, Hongyang R. Zhang (13 Sep 2023)
Generative AI
Stefan Feuerriegel, Jochen Hartmann, Christian Janiesch, Patrick Zschech (13 Sep 2023)

Statistical Rejection Sampling Improves Preference Optimization
Tianqi Liu, Yao-Min Zhao, Rishabh Joshi, Misha Khalman, Mohammad Saleh, Peter J. Liu, Jialu Liu (13 Sep 2023)

Mitigating the Alignment Tax of RLHF
Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Zeming Zheng, ..., Han Zhao, Nan Jiang, Heng Ji, Yuan Yao, Tong Zhang (12 Sep 2023)

PACE-LM: Prompting and Augmentation for Calibrated Confidence Estimation with GPT-4 in Cloud Incident Root Cause Analysis
Dylan Zhang, Xuchao Zhang, Chetan Bansal, P. Las-Casas, Rodrigo Fonseca, Saravan Rajmohan (11 Sep 2023)

Flesch or Fumble? Evaluating Readability Standard Alignment of Instruction-Tuned Language Models
Joseph Marvin Imperial, Harish Tayyar Madabushi (11 Sep 2023)

Zero-Shot Robustification of Zero-Shot Models
Dyah Adila, Changho Shin, Lin Cai, Frederic Sala (08 Sep 2023)

Everyone Deserves A Reward: Learning Customized Human Preferences
Pengyu Cheng, Jiawen Xie, Ke Bai, Yong Dai, Nan Du (06 Sep 2023)

Efficient RLHF: Reducing the Memory Usage of PPO
Michael Santacroce, Yadong Lu, Han Yu, Yuan-Fang Li, Yelong Shen (01 Sep 2023)
Reinforcement Learning with Human Feedback for Realistic Traffic Simulation
Yulong Cao, Boris Ivanovic, Chaowei Xiao, Marco Pavone (01 Sep 2023)

AI Deception: A Survey of Examples, Risks, and Potential Solutions
Peter S. Park, Simon Goldstein, Aidan O'Gara, Michael Chen, Dan Hendrycks (28 Aug 2023)

Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao, Quan.Z Sheng, Julian McAuley, Lina Yao (28 Aug 2023)

SoTaNa: The Open-Source Software Development Assistant
Ensheng Shi, Fengji Zhang, Yanlin Wang, B. Chen, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun (25 Aug 2023)

SayCanPay: Heuristic Planning with Large Language Models using Learnable Domain Knowledge
Rishi Hazra, Pedro Zuidberg Dos Martires, Luc de Raedt (24 Aug 2023)

From Instructions to Intrinsic Human Values -- A Survey of Alignment Goals for Big Models
Jing Yao, Xiaoyuan Yi, Xiting Wang, Jindong Wang, Xing Xie (23 Aug 2023)

Towards an On-device Agent for Text Rewriting
Yun Zhu, Yinxiao Liu, Felix Stahlberg, Shankar Kumar, Yu-hui Chen, Liangchen Luo, Lei Shu, Renjie Liu, Jindong Chen, Lei Meng (22 Aug 2023)

Tackling Vision Language Tasks Through Learning Inner Monologues
Diji Yang, Kezhen Chen, Jinmeng Rao, Xiaoyuan Guo, Yawen Zhang, Jie Yang, Yize Zhang (19 Aug 2023)
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Jian-Guang Lou, ..., Xiubo Geng, Qingwei Lin, Shifeng Chen, Yansong Tang, Dongmei Zhang (18 Aug 2023)

Data Race Detection Using Large Language Models
Le Chen, Xianzhong Ding, M. Emani, T. Vanderbruggen, Pei-Hung Lin, Chunhua Liao (15 Aug 2023)

GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Pinjia He, Shuming Shi, Zhaopeng Tu (12 Aug 2023)

Detecting and Preventing Hallucinations in Large Vision Language Models
Anisha Gunjal, Jihan Yin, Erhan Bas (11 Aug 2023)

ZYN: Zero-Shot Reward Models with Yes-No Questions for RLAIF
Víctor Gallego (11 Aug 2023)

Fuzz4All: Universal Fuzzing with Large Language Models
Chun Xia, Matteo Paltenghi, Jia Le Tian, Michael Pradel, Lingming Zhang (09 Aug 2023)

Reinforcement Learning for Generative AI: State of the Art, Opportunities and Open Research Challenges
Giorgio Franceschelli, Mirco Musolesi (31 Jul 2023)

Learning to Model the World with Language
Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan (31 Jul 2023)
Language models as master equation solvers
Chuanbo Liu, Jin Wang (29 Jul 2023)

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper, Xander Davies, Claudia Shi, T. Gilbert, Jérémy Scheurer, ..., Erdem Biyik, Anca Dragan, David M. Krueger, Dorsa Sadigh, Dylan Hadfield-Menell (27 Jul 2023)

Evaluating the Moral Beliefs Encoded in LLMs
Nino Scherrer, Claudia Shi, Amir Feder, David M. Blei (26 Jul 2023)

ChatGPT and Persuasive Technologies for the Management and Delivery of Personalized Recommendations in Hotel Hospitality
Manolis Remountakis, Konstantinos I. Kotis, Babis Kourtzis, G. Tsekouras (26 Jul 2023)

How to use LLMs for Text Analysis
Petter Tornberg (24 Jul 2023)