Learning to summarize from human feedback (arXiv:2009.01325)

2 September 2020
Nisan Stiennon, Long Ouyang, Jeff Wu, Daniel M. Ziegler, Ryan J. Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano · ALM

Papers citing "Learning to summarize from human feedback"

50 of 1,443 citing papers are shown. Each entry gives the title, then authors · topic tags (where assigned) · three listing counts · publication date.

Aligning Language Models with Human Preferences via a Bayesian Approach
Jiashuo Wang, Haozhao Wang, Shichao Sun, Wenjie Li · ALM · 48 · 22 · 0 · 09 Oct 2023

Regulation and NLP (RegNLP): Taming Large Language Models
Catalina Goanta, Nikolaos Aletras, Ilias Chalkidis, S. Ranchordas, Gerasimos Spanakis · AILaw · 21 · 3 · 0 · 09 Oct 2023

Generative Judge for Evaluating Alignment
Junlong Li, Shichao Sun, Weizhe Yuan, Run-Ze Fan, Hai Zhao, Pengfei Liu · ELM, ALM · 35 · 80 · 0 · 09 Oct 2023

SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF
Yi Dong, Zhilin Wang, Makesh Narsimhan Sreedhar, Xianchao Wu, Oleksii Kuchaiev · ALM, LLMSV · 47 · 65 · 0 · 09 Oct 2023

Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models
Holy Lovenia, Wenliang Dai, Samuel Cahyawijaya, Ziwei Ji, Pascale Fung · MLLM · 36 · 51 · 0 · 09 Oct 2023

Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback
Wei Shen, Rui Zheng, Wenyu Zhan, Jun Zhao, Shihan Dou, Tao Gui, Qi Zhang, Xuanjing Huang · ALM · 48 · 44 · 0 · 08 Oct 2023

Crystal: Introspective Reasoners Reinforced with Self-Feedback
Jiacheng Liu, Ramakanth Pasunuru, Hannaneh Hajishirzi, Yejin Choi, Asli Celikyilmaz · LRM, ReLM · 44 · 22 · 0 · 07 Oct 2023

Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz, Aaditya K. Singh, DJ Strouse, T. Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen Marcus McAleer · 50 · 48 · 0 · 06 Oct 2023

Reward Dropout Improves Control: Bi-objective Perspective on Reinforced LM
Changhun Lee, Chiehyeon Lim · 34 · 0 · 0 · 06 Oct 2023

LLM4DV: Using Large Language Models for Hardware Test Stimuli Generation
Zixi Zhang, Greg Chadwick, Hugo McNally, Yiren Zhao, Robert D. Mullins, Jianyi Cheng · 31 · 20 · 0 · 06 Oct 2023

Aligning Text-to-Image Diffusion Models with Reward Backpropagation
Mihir Prabhudesai, Anirudh Goyal, Deepak Pathak, Katerina Fragkiadaki · 42 · 112 · 0 · 05 Oct 2023

A Long Way to Go: Investigating Length Correlations in RLHF
Prasann Singhal, Tanya Goyal, Jiacheng Xu, Greg Durrett · 49 · 145 · 0 · 05 Oct 2023

Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
Zhanhui Zhou, Jie Liu, Chao Yang, Jing Shao, Yu Liu, Xiangyu Yue, Wanli Ouyang, Yu Qiao · 40 · 49 · 0 · 05 Oct 2023

Misusing Tools in Large Language Models With Visual Adversarial Examples
Xiaohan Fu, Zihan Wang, Shuheng Li, Rajesh K. Gupta, Niloofar Mireshghallah, Taylor Berg-Kirkpatrick, Earlence Fernandes · AAML · 37 · 24 · 0 · 04 Oct 2023

$\mathcal{B}$-Coder: Value-Based Deep Reinforcement Learning for Program Synthesis
Zishun Yu, Yunzhe Tao, Liyu Chen, Tao Sun, Hongxia Yang · 32 · 9 · 0 · 04 Oct 2023

JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning
Chang Gao, Wenxuan Zhang, Guizhen Chen, Wai Lam · 58 · 5 · 0 · 04 Oct 2023

Reward Model Ensembles Help Mitigate Overoptimization
Thomas Coste, Usman Anwar, Robert Kirk, David M. Krueger · NoLa, ALM · 28 · 122 · 0 · 04 Oct 2023

The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models
Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale · 50 · 2 · 0 · 03 Oct 2023

Reinforcement Learning from Automatic Feedback for High-Quality Unit Test Generation
Benjamin Steenhoek, Michele Tufano, Neel Sundaresan, Alexey Svyatkovskiy · OffRL, ALM · 60 · 18 · 0 · 03 Oct 2023

Automatic Pair Construction for Contrastive Post-training
Canwen Xu, Corby Rosset, Ethan C. Chau, Luciano Del Corro, Shweti Mahajan, Julian McAuley, Jennifer Neville, Ahmed Hassan Awadallah, Nikhil Rao · ALM · 27 · 4 · 0 · 03 Oct 2023

TWIZ-v2: The Wizard of Multimodal Conversational-Stimulus
Rafael Ferreira, Diogo Tavares, Diogo Glória-Silva, Rodrigo Valerio, João Bordalo, Ines Simoes, Vasco Ramos, David Semedo, João Magalhães · 24 · 4 · 0 · 03 Oct 2023

AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model
Zibin Dong, Yifu Yuan, Jianye Hao, Fei Ni, Yao Mu, Yan Zheng, Yujing Hu, Tangjie Lv, Changjie Fan, Zhipeng Hu · 47 · 29 · 0 · 03 Oct 2023

Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation
Shenzhi Wang, Chang Liu, Zilong Zheng, Siyuan Qi, Shuo Chen, Qisen Yang, Andrew Zhao, Chaofei Wang, Shiji Song, Gao Huang · LLMAG · 42 · 66 · 0 · 02 Oct 2023

Tool-Augmented Reward Modeling
Lei Li, Yekun Chai, Shuohuan Wang, Yu Sun, Hao Tian, Ningyu Zhang, Hua Wu · OffRL · 46 · 13 · 0 · 02 Oct 2023

Enabling Language Models to Implicitly Learn Self-Improvement
Ziqi Wang, Le Hou, Tianjian Lu, Yuexin Wu, Yunxuan Li, Hongkun Yu, Heng Ji · ReLM, LRM · 18 · 6 · 0 · 02 Oct 2023

No Offense Taken: Eliciting Offensiveness from Language Models
Anugya Srivastava, Rahul Ahuja, Rohith Mukku · 22 · 3 · 0 · 02 Oct 2023

Parameter-Efficient Tuning Helps Language Model Alignment
Tianci Xue, Ziqi Wang, Heng Ji · ALM · 38 · 6 · 0 · 01 Oct 2023

Adapting LLM Agents with Universal Feedback in Communication
Kuan-Chieh Wang, Yadong Lu, Michael Santacroce, Yeyun Gong, Chao Zhang, Yelong Shen · LLMAG · 36 · 7 · 0 · 01 Oct 2023

From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
Xuansheng Wu, Wenlin Yao, Jianshu Chen, Xiaoman Pan, Xiaoyang Wang, Ninghao Liu, Dong Yu · LRM · 28 · 28 · 0 · 30 Sep 2023

Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
Silviu Pitis · 40 · 6 · 0 · 30 Sep 2023

Directly Fine-Tuning Diffusion Models on Differentiable Rewards
Amita Gajewar, Paul Vicol, G. Bansal, David J Fleet · 30 · 150 · 0 · 29 Sep 2023

Qwen Technical Report
Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, ..., Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, Tianhang Zhu · OSLM · 108 · 1,622 · 0 · 28 Sep 2023

Large Language Model Alignment: A Survey
Tianhao Shen, Renren Jin, Yufei Huang, Chuang Liu, Weilong Dong, Zishan Guo, Xinwei Wu, Yan Liu, Deyi Xiong · LM&MA · 29 · 179 · 0 · 26 Sep 2023

Art or Artifice? Large Language Models and the False Promise of Creativity
Tuhin Chakrabarty, Philippe Laban, Divyansh Agarwal, Smaranda Muresan, Chien-Sheng Wu · 32 · 118 · 0 · 25 Sep 2023

Aligning Large Multimodal Models with Factually Augmented RLHF
Zhiqing Sun, Sheng Shen, Shengcao Cao, Haotian Liu, Chunyuan Li, ..., Liangyan Gui, Yu-xiong Wang, Yiming Yang, Kurt Keutzer, Trevor Darrell · VLM · 52 · 324 · 0 · 25 Sep 2023

MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models
Kailai Yang, Tianlin Zhang, Zi-Zhou Kuang, Qianqian Xie, Jimin Huang, Sophia Ananiadou · AI4MH · 38 · 47 · 0 · 24 Sep 2023

Frustrated with Code Quality Issues? LLMs can Help!
Nalin Wadhwa, Jui Pradhan, Atharv Sonwane, Surya Prakash Sahu, Nagarajan Natarajan, Aditya Kanade, Suresh Parthasarathy, S. Rajamani · 43 · 3 · 0 · 22 Sep 2023

AceGPT, Localizing Large Language Models in Arabic
Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, ..., Lian Zhang, Ruoyu Sun, Xiang Wan, Haizhou Li, Jinchao Xu · 32 · 48 · 0 · 21 Sep 2023

Are Large Language Models Really Robust to Word-Level Perturbations?
Haoyu Wang, Guozheng Ma, Cong Yu, Ning Gui, Linrui Zhang, ..., Sen Zhang, Li Shen, Xueqian Wang, Peilin Zhao, Dacheng Tao · KELM · 31 · 22 · 0 · 20 Sep 2023

Toward Unified Controllable Text Generation via Regular Expression Instruction
Xin Zheng, Hongyu Lin, Xianpei Han, Le Sun · 57 · 4 · 0 · 19 Sep 2023

Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles
Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang · 24 · 107 · 0 · 19 Sep 2023

Stabilizing RLHF through Advantage Model and Selective Rehearsal
Baolin Peng, Linfeng Song, Ye Tian, Lifeng Jin, Haitao Mi, Dong Yu · 40 · 17 · 0 · 18 Sep 2023

Understanding Catastrophic Forgetting in Language Models via Implicit Inference
Suhas Kotha, Jacob Mitchell Springer, Aditi Raghunathan · CLL · 49 · 62 · 0 · 18 Sep 2023

SYNDICOM: Improving Conversational Commonsense with Error-Injection and Natural Language Feedback
Christopher Richardson, Anirudh S. Sundar, Larry Heck · LRM · 30 · 4 · 0 · 18 Sep 2023

Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM
Bochuan Cao, Yu Cao, Lu Lin, Jinghui Chen · AAML · 36 · 136 · 0 · 18 Sep 2023

Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF
Simeng Sun, Dhawal Gupta, Mohit Iyyer · 29 · 17 · 0 · 16 Sep 2023

ICLEF: In-Context Learning with Expert Feedback for Explainable Style Transfer
Arkadiy Saakyan, Smaranda Muresan · 31 · 3 · 0 · 15 Sep 2023

RAIN: Your Language Models Can Align Themselves without Finetuning
Yuhui Li, Fangyun Wei, Jinjing Zhao, Chao Zhang, Hongyang R. Zhang · SILM · 44 · 108 · 0 · 13 Sep 2023

Cognitive Mirage: A Review of Hallucinations in Large Language Models
Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, Weiqiang Jia · HILM · 53 · 77 · 0 · 13 Sep 2023

Statistical Rejection Sampling Improves Preference Optimization
Tianqi Liu, Yao-Min Zhao, Rishabh Joshi, Misha Khalman, Mohammad Saleh, Peter J. Liu, Jialu Liu · 66 · 215 · 0 · 13 Sep 2023

Page 21 of 29