2009.01325
Cited By
Learning to summarize from human feedback
2 September 2020
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
ALM
Papers citing
"Learning to summarize from human feedback"
50 / 1,443 papers shown
Title
Diffusion Model for Data-Driven Black-Box Optimization
Zihao Li
Hui Yuan
Kaixuan Huang
Chengzhuo Ni
Yinyu Ye
Minshuo Chen
Mengdi Wang
DiffM
45
10
0
20 Mar 2024
Contextual Moral Value Alignment Through Context-Based Aggregation
Pierre Dognin
Jesus Rios
Ronny Luss
Inkit Padhi
Matthew D Riemer
Miao Liu
P. Sattigeri
Manish Nagireddy
Kush R. Varshney
Djallel Bouneffouf
44
5
0
19 Mar 2024
LHMKE: A Large-scale Holistic Multi-subject Knowledge Evaluation Benchmark for Chinese Large Language Models
Chuang Liu
Renren Jin
Yuqi Ren
Deyi Xiong
ELM
43
0
0
19 Mar 2024
Improving Dialogue Agents by Decomposing One Global Explicit Annotation with Local Implicit Multimodal Feedback
Dong Won Lee
Hae Won Park
Yoon Kim
C. Breazeal
Louis-Philippe Morency
37
0
0
17 Mar 2024
Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment
Feifan Song
Bowen Yu
Hao Lang
Haiyang Yu
Fei Huang
Houfeng Wang
Yongbin Li
ALM
45
11
0
17 Mar 2024
Reward Guided Latent Consistency Distillation
Jiachen Li
Weixi Feng
Wenhu Chen
William Y. Wang
EGVM
36
11
0
16 Mar 2024
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Hakim Sidahmed
Samrat Phatale
Alex Hutcheson
Zhuonan Lin
Zhan Chen
...
Jessica Hoffmann
Hassan Mansoor
Wei Li
Abhinav Rastogi
Lucas Dixon
38
2
0
15 Mar 2024
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Zhiqing Sun
Longhui Yu
Yikang Shen
Weiyang Liu
Yiming Yang
Sean Welleck
Chuang Gan
36
55
0
14 Mar 2024
Unveiling the Generalization Power of Fine-Tuned Large Language Models
Haoran Yang
Yumeng Zhang
Jiaqi Xu
Hongyuan Lu
Pheng Ann Heng
Wai Lam
50
30
0
14 Mar 2024
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization
Renjie Pi
Tianyang Han
Wei Xiong
Jipeng Zhang
Runtao Liu
Rui Pan
Tong Zhang
MLLM
55
34
0
13 Mar 2024
Human Alignment of Large Language Models through Online Preference Optimisation
Daniele Calandriello
Daniel Guo
Rémi Munos
Mark Rowland
Yunhao Tang
...
Michal Valko
Tianqi Liu
Rishabh Joshi
Zeyu Zheng
Bilal Piot
52
60
0
13 Mar 2024
HRLAIF: Improvements in Helpfulness and Harmlessness in Open-domain Reinforcement Learning From AI Feedback
Ang Li
Qiugen Xiao
Peng Cao
Jian Tang
Yi Yuan
...
Weidong Guo
Yukang Gan
Jeffrey Xu Yu
D. Wang
Ying Shan
VLM
ALM
44
10
0
13 Mar 2024
FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models
Yan Liu
Renren Jin
Ling Shi
Zheng Yao
Deyi Xiong
LRM
37
4
0
12 Mar 2024
ORPO: Monolithic Preference Optimization without Reference Model
Jiwoo Hong
Noah Lee
James Thorne
OSLM
44
213
0
12 Mar 2024
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Byung-Kwan Lee
Beomchan Park
Chae Won Kim
Yonghyun Ro
MLLM
VLM
53
20
0
12 Mar 2024
Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences
Pulkit Pattnaik
Rishabh Maheshwary
Kelechi Ogueji
Vikas Yadav
Sathwik Tejaswi Madhusudhan
42
18
0
12 Mar 2024
$\mathbf{(N,K)}$-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model
Yufeng Zhang
Liyu Chen
Boyi Liu
Yingxiang Yang
Qiwen Cui
Yunzhe Tao
Hongxia Yang
119
0
0
11 Mar 2024
The pitfalls of next-token prediction
Gregor Bachmann
Vaishnavh Nagarajan
39
63
0
11 Mar 2024
ALaRM: Align Language Models via Hierarchical Rewards Modeling
Yuhang Lai
Siyuan Wang
Shujun Liu
Xuanjing Huang
Zhongyu Wei
37
4
0
11 Mar 2024
Unfamiliar Finetuning Examples Control How Language Models Hallucinate
Katie Kang
Eric Wallace
Claire Tomlin
Aviral Kumar
Sergey Levine
HILM
LRM
49
49
0
08 Mar 2024
Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation
Xiaoying Zhang
Jean-François Ton
Wei Shen
Hongning Wang
Yang Liu
39
14
0
08 Mar 2024
Teaching Large Language Models to Reason with Reinforcement Learning
Alex Havrilla
Yuqing Du
Sharath Chandra Raparthy
Christoforos Nalmpantis
Jane Dwivedi-Yu
Maksym Zhuravinskyi
Eric Hambro
Sainbayar Sukhbaatar
Roberta Raileanu
ReLM
LRM
39
71
0
07 Mar 2024
Enhancing Data Quality in Federated Fine-Tuning of Foundation Models
Wanru Zhao
Yaxin Du
Nicholas D. Lane
Siheng Chen
Yanfeng Wang
45
3
0
07 Mar 2024
Proxy-RLHF: Decoupling Generation and Alignment in Large Language Model with Proxy
Yu Zhu
Chuxiong Sun
Wenfei Yang
Wenqiang Wei
Simin Niu
...
Zhiyu Li
Shifeng Zhang
Zhiyu Li
Jie Hu
Mingchuan Yang
42
3
0
07 Mar 2024
On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models
Xinpeng Wang
Shitong Duan
Xiaoyuan Yi
Jing Yao
Shanlin Zhou
Zhihua Wei
Peng Zhang
Dongkuan Xu
Maosong Sun
Xing Xie
OffRL
50
16
0
07 Mar 2024
Negating Negatives: Alignment without Human Positive Samples via Distributional Dispreference Optimization
Shitong Duan
Xiaoyuan Yi
Peng Zhang
Tun Lu
Xing Xie
Ning Gu
40
4
0
06 Mar 2024
A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods
Hanlei Jin
Yang Zhang
Dan Meng
Jun Wang
Jinghua Tan
68
81
0
05 Mar 2024
Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking
Cassidy Laidlaw
Shivam Singhal
Anca Dragan
AAML
45
11
0
05 Mar 2024
Enhancing LLM Safety via Constrained Direct Preference Optimization
Zixuan Liu
Xiaolin Sun
Zizhan Zheng
48
20
0
04 Mar 2024
Accelerating Greedy Coordinate Gradient via Probe Sampling
Yiran Zhao
Wenyue Zheng
Tianle Cai
Xuan Long Do
Kenji Kawaguchi
Anirudh Goyal
Michael Shieh
51
11
0
02 Mar 2024
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling
Shanghaoran Quan
MoE
OffRL
52
9
0
02 Mar 2024
Provably Robust DPO: Aligning Language Models with Noisy Feedback
Sayak Ray Chowdhury
Anush Kini
Nagarajan Natarajan
45
58
0
01 Mar 2024
Improving Socratic Question Generation using Data Augmentation and Preference Optimization
Nischal Ashok Kumar
Andrew Lan
43
8
0
01 Mar 2024
EROS: Entity-Driven Controlled Policy Document Summarization
Joykirat Singh
Sehban Fazili
Rohan Jain
Md. Shad Akhtar
41
1
0
29 Feb 2024
Curiosity-driven Red-teaming for Large Language Models
Zhang-Wei Hong
Idan Shenfeld
Tsun-Hsuan Wang
Yung-Sung Chuang
Aldo Pareja
James R. Glass
Akash Srivastava
Pulkit Agrawal
LRM
39
39
0
29 Feb 2024
PopALM: Popularity-Aligned Language Models for Social Media Trendy Response Prediction
Erxin Yu
Jing Li
Chunpu Xu
35
3
0
29 Feb 2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
Xupeng Miao
Gabriele Oliaro
Xinhao Cheng
Vineeth Kada
Ruohan Gao
...
April Yang
Yingcheng Wang
Mengdi Wu
Colin Unger
Zhihao Jia
MoE
94
9
0
29 Feb 2024
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
Katherine Metcalf
Miguel Sarabia
Natalie Mackraz
B. Theobald
45
6
0
28 Feb 2024
SoFA: Shielded On-the-fly Alignment via Priority Rule Following
Xinyu Lu
Bowen Yu
Yaojie Lu
Hongyu Lin
Haiyang Yu
Le Sun
Xianpei Han
Yongbin Li
78
13
0
27 Feb 2024
Speak Out of Turn: Safety Vulnerability of Large Language Models in Multi-turn Dialogue
Zhenhong Zhou
Jiuyang Xiang
Haopeng Chen
Quan Liu
Zherui Li
Sen Su
42
20
0
27 Feb 2024
From Large Language Models and Optimization to Decision Optimization CoPilot: A Research Manifesto
Segev Wasserkrug
Léonard Boussioux
D. Hertog
F. Mirzazadeh
Ilker Birbil
Jannis Kurtz
Donato Maragno
LLMAG
56
3
0
26 Feb 2024
Detecting Machine-Generated Texts by Multi-Population Aware Optimization for Maximum Mean Discrepancy
Shuhai Zhang
Yiliao Song
Jiahao Yang
Yuanqing Li
Bo Han
Mingkui Tan
DeLMO
42
5
0
25 Feb 2024
Don't Forget Your Reward Values: Language Model Alignment via Value-based Calibration
Xin Mao
Fengming Li
Huimin Xu
Wei Zhang
Anh Tuan Luu
ALM
50
6
0
25 Feb 2024
Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models
Chaoya Jiang
Wei Ye
Mengfan Dong
Hongrui Jia
Haiyang Xu
Mingshi Yan
Ji Zhang
Shikun Zhang
VLM
MLLM
48
15
0
24 Feb 2024
Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts
Yuejiang Liu
Alexandre Alahi
39
18
0
23 Feb 2024
Prejudice and Volatility: A Statistical Framework for Measuring Social Discrimination in Large Language Models
Yiran Liu
Ke Yang
Zehan Qi
Xiao-Yang Liu
Yang Yu
U. I. Urbana-Champaign
47
1
0
23 Feb 2024
Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control
Masatoshi Uehara
Yulai Zhao
Kevin Black
Ehsan Hajiramezanali
Gabriele Scalia
N. Diamant
Alex Tseng
Tommaso Biancalani
Sergey Levine
47
42
0
23 Feb 2024
Machine Unlearning of Pre-trained Large Language Models
Jin Yao
Eli Chien
Minxin Du
Xinyao Niu
Tianhao Wang
Zezhou Cheng
Xiang Yue
MU
56
35
0
23 Feb 2024
Optimizing Language Models for Human Preferences is a Causal Inference Problem
Victoria Lin
Eli Ben-Michael
Louis-Philippe Morency
43
3
0
22 Feb 2024
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
Zicheng Lin
Zhibin Gou
Tian Liang
Ruilin Luo
Haowei Liu
Yujiu Yang
LRM
42
44
0
22 Feb 2024