Fine-Tuning Language Models from Human Preferences

18 September 2019
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
    ALM

Papers citing "Fine-Tuning Language Models from Human Preferences"

50 / 375 papers shown
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
Youliang Yuan
Wenxiang Jiao
Wenxuan Wang
Jen-tse Huang
Pinjia He
Shuming Shi
Zhaopeng Tu
SILM
76
234
0
12 Aug 2023
In-Context Learning Learns Label Relationships but Is Not Conventional Learning
Jannik Kossen
Y. Gal
Tom Rainforth
40
28
0
23 Jul 2023
Leveraging Contextual Counterfactuals Toward Belief Calibration
Qiuyi Zhang
Zhang
Michael S. Lee
Sherol Chen
29
1
0
13 Jul 2023
Secrets of RLHF in Large Language Models Part I: PPO
Rui Zheng
Shihan Dou
Songyang Gao
Yuan Hua
Wei Shen
...
Hang Yan
Tao Gui
Qi Zhang
Xipeng Qiu
Xuanjing Huang
ALM
OffRL
41
159
0
11 Jul 2023
Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts
Mounica Maddela
Megan Ung
Jing Xu
Andrea Madotto
H. Foran
Y-Lan Boureau
LRM
36
21
0
06 Jul 2023
Jailbroken: How Does LLM Safety Training Fail?
Alexander Wei
Nika Haghtalab
Jacob Steinhardt
107
852
0
05 Jul 2023
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
M. Wong
Shangxin Guo
Ching Nam Hang
Siu-Wai Ho
C. Tan
42
78
0
04 Jul 2023
Evaluating Shutdown Avoidance of Language Models in Textual Scenarios
Teun van der Weij
Simon Lermen
Leon Lang
LLMAG
22
4
0
03 Jul 2023
Let Me Teach You: Pedagogical Foundations of Feedback for Language Models
Beatriz Borges
Niket Tandon
Tanja Kaser
Antoine Bosselut
22
4
0
01 Jul 2023
Personality Traits in Large Language Models
Gregory Serapio-García
Mustafa Safdari
Clément Crepy
Luning Sun
Stephen Fitz
P. Romero
Marwa Abdulhai
Aleksandra Faust
Maja J. Matarić
LM&MA
LLMAG
58
119
0
01 Jul 2023
Aligning Synthetic Medical Images with Clinical Knowledge using Human Feedback
Shenghuan Sun
Gregory M. Goldgof
A. Butte
Ahmed Alaa
MedIm
24
12
0
16 Jun 2023
Domain-specific ChatBots for Science using Embeddings
Kevin G. Yager
32
8
0
15 Jun 2023
Chart2Vec: A Universal Embedding of Context-Aware Visualizations
Qing Chen
Ying Chen
Ruishi Zou
Wei Shuai
Yi Guo
Jiazhe Wang
Nana Cao
20
3
0
14 Jun 2023
Turning large language models into cognitive models
Marcel Binz
Eric Schulz
32
54
0
06 Jun 2023
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
Zeqiu Wu
Yushi Hu
Weijia Shi
Nouha Dziri
Alane Suhr
Prithviraj Ammanabrolu
Noah A. Smith
Mari Ostendorf
Hannaneh Hajishirzi
ALM
35
304
0
02 Jun 2023
Preference-grounded Token-level Guidance for Language Model Fine-tuning
Shentao Yang
Shujian Zhang
Congying Xia
Yihao Feng
Caiming Xiong
Mi Zhou
29
23
0
01 Jun 2023
CFL: Causally Fair Language Models Through Token-level Attribute Controlled Generation
Rahul Madhavan
Rishabh Garg
Kahini Wadhawan
S. Mehta
29
5
0
01 Jun 2023
An Invariant Learning Characterization of Controlled Text Generation
Carolina Zheng
Claudia Shi
Keyon Vafa
Amir Feder
David M. Blei
OOD
38
8
0
31 May 2023
Taming AI Bots: Controllability of Neural States in Large Language Models
Stefano Soatto
Paulo Tabuada
Pratik Chaudhari
Tianwei Liu
LLMAG
LM&Ro
18
13
0
29 May 2023
Reward Collapse in Aligning Large Language Models
Ziang Song
Tianle Cai
Jason D. Lee
Weijie J. Su
ALM
33
22
0
28 May 2023
Coarse-Tuning Models of Code with Reinforcement Learning Feedback
Abhinav C. P. Jain
Chima Adiole
Swarat Chaudhuri
Thomas W. Reps
Chris Jermaine
ALM
25
2
0
25 May 2023
Large Language Models for User Interest Journeys
Konstantina Christakopoulou
Alberto Lalama
Cj Adams
Iris Qu
Yifat Amir
...
Dina Bseiso
Sarah Scodel
Lucas Dixon
Ed H. Chi
Minmin Chen
19
25
0
24 May 2023
Policy Learning based on Deep Koopman Representation
Wenjian Hao
Paulo Heredia
Bowen Huang
Zehui Lu
Zihao Liang
Shaoshuai Mou
36
1
0
24 May 2023
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Katherine Tian
E. Mitchell
Allan Zhou
Archit Sharma
Rafael Rafailov
Huaxiu Yao
Chelsea Finn
Christopher D. Manning
54
289
0
24 May 2023
In-Context Impersonation Reveals Large Language Models' Strengths and Biases
Leonard Salewski
Stephan Alaniz
Isabel Rio-Torto
Eric Schulz
Zeynep Akata
44
151
0
24 May 2023
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Yilun Du
Shuang Li
Antonio Torralba
J. Tenenbaum
Igor Mordatch
LLMAG
LRM
46
614
0
23 May 2023
Training Diffusion Models with Reinforcement Learning
Kevin Black
Michael Janner
Yilun Du
Ilya Kostrikov
Sergey Levine
EGVM
44
318
0
22 May 2023
Large Language Models are Not Yet Human-Level Evaluators for Abstractive Summarization
Chenhui Shen
Liying Cheng
Xuan-Phi Nguyen
Yang You
Lidong Bing
ELM
ALM
47
64
0
22 May 2023
Continually Improving Extractive QA via Human Feedback
Ge Gao
Hung-Ting Chen
Yoav Artzi
Eunsol Choi
26
12
0
21 May 2023
Collaborative Development of NLP models
Fereshte Khani
Marco Tulio Ribeiro
32
2
0
20 May 2023
Multimodal Web Navigation with Instruction-Finetuned Foundation Models
Hiroki Furuta
Kuang-Huei Lee
Ofir Nachum
Yutaka Matsuo
Aleksandra Faust
S. Gu
Izzeddin Gur
LM&Ro
36
93
0
19 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
45
83
0
19 May 2023
Language Models Meet World Models: Embodied Experiences Enhance Language Models
Jiannan Xiang
Tianhua Tao
Yi Gu
Tianmin Shu
Zirui Wang
Zichao Yang
Zhiting Hu
ALM
LLMAG
LM&Ro
CLL
36
94
0
18 May 2023
CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants
A. Sun
Varun Nair
Elliot Schumacher
Anitha Kannan
32
3
0
27 Apr 2023
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
Hanze Dong
Wei Xiong
Deepanshu Goyal
Yihan Zhang
Winnie Chow
Rui Pan
Shizhe Diao
Jipeng Zhang
Kashun Shum
Tong Zhang
ALM
18
408
0
13 Apr 2023
Are LLMs All You Need for Task-Oriented Dialogue?
Vojtěch Hudeček
Ondrej Dusek
26
57
0
13 Apr 2023
OpenAGI: When LLM Meets Domain Experts
Yingqiang Ge
Wenyue Hua
Kai Mei
Jianchao Ji
Juntao Tan
Shuyuan Xu
Zelong Li
Yongfeng Zhang
VLM
LRM
38
212
0
10 Apr 2023
Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models
Emilio Ferrara
SILM
36
247
0
07 Apr 2023
Beyond Summarization: Designing AI Support for Real-World Expository Writing Tasks
Zejiang Shen
Tal August
Pao Siangliulue
Kyle Lo
Jonathan Bragg
Jeff Hammerbacher
Doug Downey
Joseph Chee Chang
David Sontag
ELM
20
18
0
05 Apr 2023
REFINER: Reasoning Feedback on Intermediate Representations
Debjit Paul
Mete Ismayilzada
Maxime Peyrard
Beatriz Borges
Antoine Bosselut
Robert West
Boi Faltings
ReLM
LRM
29
171
0
04 Apr 2023
The Vector Grounding Problem
Dimitri Coelho Mollo
Raphael Milliere
41
26
0
04 Apr 2023
Evaluating Large Language Models on a Highly-specialized Topic, Radiation Oncology Physics
J. Holmes
Zheng Liu
Lian-Cheng Zhang
Yuzhen Ding
Terence T. Sio
...
Jonathan B. Ashman
Xiang Li
Tianming Liu
Jiajian Shen
Wei Liu
LM&MA
AI4CE
ELM
30
120
0
01 Apr 2023
Humans in Humans Out: On GPT Converging Toward Common Sense in both Success and Failure
Philipp E. Koralus
Vincent Wang-Maścianica
LRM
6
13
0
30 Mar 2023
Training Language Models with Language Feedback at Scale
Jérémy Scheurer
Jon Ander Campos
Tomasz Korbak
Jun Shern Chan
Angelica Chen
Kyunghyun Cho
Ethan Perez
ALM
48
103
0
28 Mar 2023
On the Creativity of Large Language Models
Giorgio Franceschelli
Mirco Musolesi
72
52
0
27 Mar 2023
SPEC: Summary Preference Decomposition for Low-Resource Abstractive Summarization
Yi-Syuan Chen
Yun-Zhu Song
Hong-Han Shuai
33
6
0
24 Mar 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
33
0
0
22 Mar 2023
Exploring ChatGPT's Ability to Rank Content: A Preliminary Study on Consistency with Human Preferences
Yunjie Ji
Yan Gong
Yiping Peng
Chao Ni
Peiyan Sun
Dongyu Pan
Baochang Ma
Xiangang Li
ELM
ALM
AI4MH
30
37
0
14 Mar 2023
Conversational AI-Powered Design: ChatGPT as Designer, User, and Product
A. Kocaballi
24
38
0
15 Feb 2023
Benchmarking Large Language Models for News Summarization
Tianyi Zhang
Faisal Ladhak
Esin Durmus
Percy Liang
Kathleen McKeown
Tatsunori B. Hashimoto
ELM
43
485
0
31 Jan 2023