ResearchTrend.AI
Fine-Tuning Language Models from Human Preferences

18 September 2019
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
    ALM

Papers citing "Fine-Tuning Language Models from Human Preferences"

15 / 1,265 papers shown
Neural Language Generation: Formulation, Methods, and Evaluation
Cristina Garbacea
Qiaozhu Mei
158
30
0
31 Jul 2020
SummEval: Re-evaluating Summarization Evaluation
Alexander R. Fabbri
Wojciech Kryściński
Bryan McCann
Caiming Xiong
R. Socher
Dragomir R. Radev
HILM
142
724
0
24 Jul 2020
Investigation of Sentiment Controllable Chatbot
Hung-yi Lee
Cheng-Hao Ho
Chien-Fu Lin
Chiung-Chih Chang
Chih-Wei Lee
Yau-Shian Wang
Tsung-Yuan Hsu
Kuan-Yu Chen
76
4
0
11 Jul 2020
Technical Report: Auxiliary Tuning and its Application to Conditional Text Generation
Yoel Zeldes
Dan Padnos
Or Sharir
Barak Peleg
123
19
0
30 Jun 2020
Quantifying Differences in Reward Functions
Adam Gleave
Michael Dennis
Shane Legg
Stuart J. Russell
Jan Leike
OffRL
169
68
0
24 Jun 2020
ColdGANs: Taming Language GANs with Cautious Sampling Strategies
Thomas Scialom
Paul-Alexis Dray
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
GAN
SyDa
68
18
0
08 Jun 2020
CoCon: A Self-Supervised Approach for Controlled Text Generation
Alvin Chan
Yew-Soon Ong
B. Pung
Aston Zhang
Jie Fu
82
86
0
05 Jun 2020
MLE-guided parameter search for task loss minimization in neural sequence modeling
Sean Welleck
Kyunghyun Cho
68
10
0
04 Jun 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
1.1K
42,651
0
28 May 2020
Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning
Angeliki Lazaridou
Anna Potapenko
O. Tieleman
LLMAG
141
98
0
14 May 2020
Fill in the BLANC: Human-free quality estimation of document summaries
Oleg V. Vasilyev
Vedant Dharnidharka
John Bohannon
3DH
103
119
0
23 Feb 2020
Reducing Non-Normative Text Generation from Language Models
Xiangyu Peng
Siyan Li
Spencer Frazier
Mark O. Riedl
60
8
0
23 Jan 2020
Learning Norms from Stories: A Prior for Value Aligned Agents
Spencer Frazier
Md Sultan al Nahian
Mark O. Riedl
Brent Harrison
73
39
0
07 Dec 2019
Plug and Play Language Models: A Simple Approach to Controlled Text Generation
Sumanth Dathathri
Andrea Madotto
Janice Lan
Jane Hung
Eric Frank
Piero Molino
J. Yosinski
Rosanne Liu
KELM
155
980
0
04 Dec 2019
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
157
343
0
30 Jun 2019