Learning to summarize from human feedback (arXiv 2009.01325)

2 September 2020
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
    ALM

Papers citing "Learning to summarize from human feedback"

Showing 50 of 1,548 citing papers
Assessment of Multimodal Large Language Models in Alignment with Human Values
Zhelun Shi
Zhipin Wang
Hongxing Fan
Zaibin Zhang
Lijun Li
Yongting Zhang
Zhen-fei Yin
Lu Sheng
Yu Qiao
Jing Shao
77
22
0
26 Mar 2024
MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models
Kailai Yang
Zhiwei Liu
Qianqian Xie
Jimin Huang
Tianlin Zhang
Sophia Ananiadou
86
18
0
25 Mar 2024
If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions
Reza Esfandiarpoor
Cristina Menghini
Stephen H. Bach
CoGe, VLM
97
12
0
25 Mar 2024
The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization
Shengyi Huang
Michael Noukhovitch
Arian Hosseini
Kashif Rasul
Weixun Wang
Lewis Tunstall
VLM
112
38
0
24 Mar 2024
Risk and Response in Large Language Models: Evaluating Key Threat Categories
Bahareh Harandizadeh
A. Salinas
Fred Morstatter
98
4
0
22 Mar 2024
DreamReward: Text-to-3D Generation with Human Preference
Junliang Ye
Fangfu Liu
Qixiu Li
Zhengyi Wang
Yikai Wang
Xinzhou Wang
Yueqi Duan
Jun Zhu
112
29
0
21 Mar 2024
Reinforcement Learning from Reflective Feedback (RLRF): Aligning and Improving LLMs via Fine-Grained Self-Reflection
Kyungjae Lee
Dasol Hwang
Sunghyun Park
Youngsoo Jang
Moontae Lee
70
8
0
21 Mar 2024
Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts
Guangzeng Han
Weisi Liu
Xiaolei Huang
Brian Borsari
79
22
0
20 Mar 2024
Llama meets EU: Investigating the European Political Spectrum through the Lens of LLMs
Ilias Chalkidis
Stephanie Brandl
57
9
0
20 Mar 2024
Diffusion Model for Data-Driven Black-Box Optimization
Zihao Li
Hui Yuan
Kaixuan Huang
Chengzhuo Ni
Yinyu Ye
Minshuo Chen
Mengdi Wang
DiffM
109
13
0
20 Mar 2024
Contextual Moral Value Alignment Through Context-Based Aggregation
Pierre Dognin
Jesus Rios
Ronny Luss
Inkit Padhi
Matthew D Riemer
Miao Liu
P. Sattigeri
Manish Nagireddy
Kush R. Varshney
Djallel Bouneffouf
69
6
0
19 Mar 2024
LHMKE: A Large-scale Holistic Multi-subject Knowledge Evaluation Benchmark for Chinese Large Language Models
Chuang Liu
Renren Jin
Yuqi Ren
Deyi Xiong
ELM
125
0
0
19 Mar 2024
Improving Dialogue Agents by Decomposing One Global Explicit Annotation with Local Implicit Multimodal Feedback
Dong Won Lee
Hae Won Park
Yoon Kim
C. Breazeal
Louis-Philippe Morency
111
0
0
17 Mar 2024
Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment
Feifan Song
Bowen Yu
Hao Lang
Haiyang Yu
Fei Huang
Houfeng Wang
Yongbin Li
ALM
86
15
0
17 Mar 2024
Reward Guided Latent Consistency Distillation
Jiachen Li
Weixi Feng
Wenhu Chen
William Y. Wang
EGVM
89
15
0
16 Mar 2024
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Hakim Sidahmed
Samrat Phatale
Alex Hutcheson
Zhuonan Lin
Zhan Chen
...
Jessica Hoffmann
Hassan Mansoor
Wei Li
Abhinav Rastogi
Lucas Dixon
82
3
0
15 Mar 2024
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Zhiqing Sun
Longhui Yu
Yikang Shen
Weiyang Liu
Yiming Yang
Sean Welleck
Chuang Gan
93
69
0
14 Mar 2024
Unveiling the Generalization Power of Fine-Tuned Large Language Models
Haoran Yang
Yumeng Zhang
Jiaqi Xu
Hongyuan Lu
Pheng Ann Heng
Wai Lam
128
40
0
14 Mar 2024
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization
Renjie Pi
Tianyang Han
Wei Xiong
Jipeng Zhang
Runtao Liu
Boyao Wang
Tong Zhang
MLLM
141
48
0
13 Mar 2024
Human Alignment of Large Language Models through Online Preference Optimisation
Daniele Calandriello
Daniel Guo
Rémi Munos
Mark Rowland
Yunhao Tang
...
Michal Valko
Tianqi Liu
Rishabh Joshi
Zeyu Zheng
Bilal Piot
110
67
0
13 Mar 2024
HRLAIF: Improvements in Helpfulness and Harmlessness in Open-domain Reinforcement Learning From AI Feedback
Ang Li
Qiugen Xiao
Peng Cao
Jian Tang
Yi Yuan
...
Weidong Guo
Yukang Gan
Jeffrey Xu Yu
D. Wang
Ying Shan
VLM, ALM
93
10
0
13 Mar 2024
FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models
Yan Liu
Renren Jin
Ling Shi
Zheng Yao
Deyi Xiong
LRM
74
5
0
12 Mar 2024
ORPO: Monolithic Preference Optimization without Reference Model
Jiwoo Hong
Noah Lee
James Thorne
OSLM
122
268
0
12 Mar 2024
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Byung-Kwan Lee
Beomchan Park
Chae Won Kim
Yonghyun Ro
MLLM, VLM
138
23
0
12 Mar 2024
Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences
Pulkit Pattnaik
Rishabh Maheshwary
Kelechi Ogueji
Vikas Yadav
Sathwik Tejaswi Madhusudhan
75
22
0
12 Mar 2024
$\mathbf{(N,K)}$-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model
Yufeng Zhang
Liyu Chen
Boyi Liu
Yingxiang Yang
Qiwen Cui
Yunzhe Tao
Hongxia Yang
227
0
0
11 Mar 2024
The pitfalls of next-token prediction
Gregor Bachmann
Vaishnavh Nagarajan
117
81
0
11 Mar 2024
ALaRM: Align Language Models via Hierarchical Rewards Modeling
Yuhang Lai
Siyuan Wang
Shujun Liu
Xuanjing Huang
Zhongyu Wei
89
5
0
11 Mar 2024
Unfamiliar Finetuning Examples Control How Language Models Hallucinate
Katie Kang
Eric Wallace
Claire Tomlin
Aviral Kumar
Sergey Levine
HILM, LRM
111
58
0
08 Mar 2024
Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation
Xiaoying Zhang
Jean-François Ton
Wei Shen
Hongning Wang
Yang Liu
78
15
0
08 Mar 2024
Teaching Large Language Models to Reason with Reinforcement Learning
Alex Havrilla
Yuqing Du
Sharath Chandra Raparthy
Christoforos Nalmpantis
Jane Dwivedi-Yu
Maksym Zhuravinskyi
Eric Hambro
Sainbayar Sukhbaatar
Roberta Raileanu
ReLM, LRM
115
94
0
07 Mar 2024
Enhancing Data Quality in Federated Fine-Tuning of Foundation Models
Wanru Zhao
Yaxin Du
Nicholas D. Lane
Siheng Chen
Yanfeng Wang
94
4
0
07 Mar 2024
Proxy-RLHF: Decoupling Generation and Alignment in Large Language Model with Proxy
Yu Zhu
Chuxiong Sun
Wenfei Yang
Wenqiang Wei
Simin Niu
...
Zhiyu Li
Shifeng Zhang
Feiyu Xiong
Jie Hu
Mingchuan Yang
62
3
0
07 Mar 2024
On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models
Xinpeng Wang
Shitong Duan
Xiaoyuan Yi
Jing Yao
Shanlin Zhou
Zhihua Wei
Peng Zhang
Dongkuan Xu
Maosong Sun
Xing Xie
OffRL
125
17
0
07 Mar 2024
Negating Negatives: Alignment without Human Positive Samples via Distributional Dispreference Optimization
Shitong Duan
Xiaoyuan Yi
Peng Zhang
Tun Lu
Xing Xie
Ning Gu
73
7
0
06 Mar 2024
A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods
Hanlei Jin
Yang Zhang
Dan Meng
Jun Wang
Jinghua Tan
249
96
0
05 Mar 2024
Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking
Cassidy Laidlaw
Shivam Singhal
Anca Dragan
AAML
88
16
0
05 Mar 2024
Enhancing LLM Safety via Constrained Direct Preference Optimization
Zixuan Liu
Xiaolin Sun
Zizhan Zheng
91
29
0
04 Mar 2024
Accelerating Greedy Coordinate Gradient via Probe Sampling
Yiran Zhao
Wenyue Zheng
Tianle Cai
Xuan Long Do
Kenji Kawaguchi
Anirudh Goyal
Michael Shieh
96
13
0
02 Mar 2024
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling
Shanghaoran Quan
MoE, OffRL
82
10
0
02 Mar 2024
Provably Robust DPO: Aligning Language Models with Noisy Feedback
Sayak Ray Chowdhury
Anush Kini
Nagarajan Natarajan
106
70
0
01 Mar 2024
Improving Socratic Question Generation using Data Augmentation and Preference Optimization
Nischal Ashok Kumar
Andrew Lan
120
9
0
01 Mar 2024
EROS: Entity-Driven Controlled Policy Document Summarization
Joykirat Singh
Sehban Fazili
Rohan Jain
Md. Shad Akhtar
76
1
0
29 Feb 2024
Curiosity-driven Red-teaming for Large Language Models
Zhang-Wei Hong
Idan Shenfeld
Tsun-Hsuan Wang
Yung-Sung Chuang
Aldo Pareja
James R. Glass
Akash Srivastava
Pulkit Agrawal
LRM
124
45
0
29 Feb 2024
PopALM: Popularity-Aligned Language Models for Social Media Trendy Response Prediction
Erxin Yu
Jing Li
Chunpu Xu
63
6
0
29 Feb 2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
Xupeng Miao
Gabriele Oliaro
Xinhao Cheng
Vineeth Kada
Ruohan Gao
...
April Yang
Yingcheng Wang
Mengdi Wu
Colin Unger
Zhihao Jia
MoE
187
11
0
29 Feb 2024
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
Katherine Metcalf
Miguel Sarabia
Natalie Mackraz
B. Theobald
78
6
0
28 Feb 2024
SoFA: Shielded On-the-fly Alignment via Priority Rule Following
Xinyu Lu
Bowen Yu
Yaojie Lu
Hongyu Lin
Haiyang Yu
Le Sun
Xianpei Han
Yongbin Li
123
14
0
27 Feb 2024
Speak Out of Turn: Safety Vulnerability of Large Language Models in Multi-turn Dialogue
Zhenhong Zhou
Jiuyang Xiang
Haopeng Chen
Quan Liu
Zherui Li
Sen Su
102
25
0
27 Feb 2024
From Large Language Models and Optimization to Decision Optimization CoPilot: A Research Manifesto
Segev Wasserkrug
Léonard Boussioux
D. Hertog
F. Mirzazadeh
Ilker Birbil
Jannis Kurtz
Donato Maragno
LLMAG
100
3
0
26 Feb 2024