Learning to summarize from human feedback
2 September 2020 · arXiv 2009.01325
Nisan Stiennon, Long Ouyang, Jeff Wu, Daniel M. Ziegler, Ryan J. Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano
ALM
Papers citing "Learning to summarize from human feedback"
50 / 1,548 papers shown
HAF-RM: A Hybrid Alignment Framework for Reward Model Training
Shujun Liu, Xiaoyu Shen, Yuhang Lai, Siyuan Wang, Shengbin Yue, Zengfeng Huang, Xuanjing Huang, Zhongyu Wei
04 Jul 2024

Orchestrating LLMs with Different Personalizations
Jin Peng Zhou, Katie Z Luo, Jingwen Gu, Jason Yuan, Kilian Q. Weinberger, Wen Sun
04 Jul 2024

Uncertainty-Guided Optimization on Large Language Model Search Trees
Julia Grosse, Ruotian Wu, Ahmad Rashid, Philipp Hennig, Pascal Poupart, Agustinus Kristiadi
04 Jul 2024

Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes
Asaf B. Cassel, Aviv A. Rosenberg
03 Jul 2024

Understanding Alignment in Multimodal LLMs: A Comprehensive Study
Elmira Amirloo, J. Fauconnier, Christoph Roesmann, Christian Kerl, Rinu Boney, ..., Zirui Wang, Afshin Dehghan, Yinfei Yang, Zhe Gan, Peter Grasch
02 Jul 2024

RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
John Dang, Arash Ahmadian, Kelly Marchisio, Julia Kreutzer, Ahmet Üstün, Sara Hooker
02 Jul 2024

Beyond Numeric Rewards: In-Context Dueling Bandits with LLM Agents
Fanzeng Xia, Hao Liu, Yisong Yue, Tongxin Li
02 Jul 2024

CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
Song Wang, Peng Wang, Tong Zhou, Yushun Dong, Zhen Tan, Jundong Li
CoGe
02 Jul 2024

LLM See, LLM Do: Guiding Data Generation to Target Non-Differentiable Objectives
Luísa Shimabucoro, Sebastian Ruder, Julia Kreutzer, Marzieh Fadaee, Sara Hooker
SyDa
01 Jul 2024

DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging
Tzu-Han Lin, Chen-An Li, Hung-yi Lee, Yun-Nung Chen
VLM, ALM
01 Jul 2024

Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization
Siyi Gu, Minkai Xu, Alexander Powers, Weili Nie, Tomas Geffner, Karsten Kreis, J. Leskovec, Arash Vahdat, Stefano Ermon
01 Jul 2024

Exploring Advanced Large Language Models with LLMsuite
Giorgio Roffo
LLMAG
01 Jul 2024

Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning
Zimu Lu, Aojun Zhou, Ke Wang, Houxing Ren, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li
LRM
30 Jun 2024

BAPO: Base-Anchored Preference Optimization for Personalized Alignment in Large Language Models
Gihun Lee, Minchan Jeong, Yujin Kim, Hojung Jung, Jaehoon Oh, Sangmook Kim, Se-Young Yun
30 Jun 2024

PerSEval: Assessing Personalization in Text Summarizers
Sourish Dasgupta, Ankush Chander, Parth Borad, Isha Motiyani, Tanmoy Chakraborty
29 Jun 2024

Too Late to Train, Too Early To Use? A Study on Necessity and Viability of Low-Resource Bengali LLMs
Tamzeed Mahfuz, Satak Kumar Dey, Ruwad Naswan, Hasnaen Adil, Khondker Salman Sayeed, Haz Sameen Shahgir
29 Jun 2024

Beyond Human Preferences: Exploring Reinforcement Learning Trajectory Evaluation and Improvement through LLMs
Zichao Shen, Tianchen Zhu, Qingyun Sun, Shiqi Gao, Jianxin Li
OffRL
28 Jun 2024

Suri: Multi-constraint Instruction Following for Long-form Text Generation
Chau Minh Pham, Simeng Sun, Mohit Iyyer
ALM, LRM
27 Jun 2024

Averaging log-likelihoods in direct alignment
Nathan Grinsztajn, Yannis Flet-Berliac, M. G. Azar, Florian Strub, Bill Wu, ..., Chris Cremer, Arash Ahmadian, Yash Chandak, Olivier Pietquin, Matthieu Geist
MoMe
27 Jun 2024

Decoding-Time Language Model Alignment with Multiple Objectives
Ruizhe Shi, Yifang Chen, Yushi Hu, Alisa Liu, Hannaneh Hajishirzi, Noah A. Smith, Simon Du
27 Jun 2024

The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm
Aakanksha, Arash Ahmadian, Beyza Ermis, Seraphina Goldfarb-Tarrant, Julia Kreutzer, Marzieh Fadaee, Sara Hooker
26 Jun 2024

Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation
Guanting Dong, Yutao Zhu, Chenghao Zhang, Zechen Wang, Zhicheng Dou, Ji-Rong Wen
RALM
26 Jun 2024

Themis: Towards Flexible and Interpretable NLG Evaluation
Xinyu Hu, Li Lin, Mingqi Gao, Xunjian Yin, Xiaojun Wan
ELM
26 Jun 2024

JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models
Haibo Jin, Leyang Hu, Xinuo Li, Peiyan Zhang, Chonghan Chen, Jun Zhuang, Haohan Wang
PILM
26 Jun 2024

Preference Elicitation for Offline Reinforcement Learning
Alizée Pace, Bernhard Schölkopf, Gunnar Rätsch, Giorgia Ramponi
OffRL
26 Jun 2024

Domain Adaptation of Echocardiography Segmentation Via Reinforcement Learning
Arnaud Judge, Thierry Judge, Nicolas Duchateau, Roman A. Sandler, Joseph Z. Sokol, Olivier Bernard, Pierre-Marc Jodoin
OOD
25 Jun 2024

A Moonshot for AI Oracles in the Sciences
Bryan Kaiser, Tailin Wu, Maike Sonnewald, Colin Thackray, Skylar Callis
AI4CE
25 Jun 2024

Not All Preference Pairs Are Created Equal: A Recipe for Annotation-Efficient Iterative Preference Learning
Sen Yang, Leyang Cui, Deng Cai, Xinting Huang, Shuming Shi, Wai Lam
25 Jun 2024

From Distributional to Overton Pluralism: Investigating Large Language Model Alignment
Thom Lake, Eunsol Choi, Greg Durrett
25 Jun 2024

From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models
Sean Welleck, Amanda Bertsch, Matthew Finlayson, Hailey Schoelkopf, Alex Xie, Graham Neubig, Ilia Kulikov, Zaid Harchaoui
24 Jun 2024

WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé, Johan Ferret, Nino Vieillard, Robert Dadashi, Léonard Hussenot, Pierre-Louis Cedoz, Pier Giuseppe Sessa, Sertan Girgin, Arthur Douillard, Olivier Bachem
24 Jun 2024

Towards Comprehensive Preference Data Collection for Reward Modeling
Yulan Hu, Qingyang Li, Sheng Ouyang, Ge Chen, Kaihui Chen, Lijun Mei, Xucheng Ye, Fuzheng Zhang, Yong Liu
SyDa
24 Jun 2024

Does Cross-Cultural Alignment Change the Commonsense Morality of Language Models?
Yuu Jinnai
24 Jun 2024

Cascade Reward Sampling for Efficient Decoding-Time Alignment
Bolian Li, Yifan Wang, A. Grama, Ruqi Zhang
AI4TS
24 Jun 2024

Multi-Objective Linguistic Control of Large Language Models
Dang Nguyen, Jiuhai Chen, Dinesh Manocha
23 Jun 2024

PORT: Preference Optimization on Reasoning Traces
Salem Lahlou, Abdalgader Abubaker, Hakim Hacid
LRM
23 Jun 2024

Robust Reinforcement Learning from Corrupted Human Feedback
Alexander Bukharin, Ilgee Hong, Haoming Jiang, Zichong Li, Qingru Zhang, Zixuan Zhang, Tuo Zhao
21 Jun 2024

A SMART Mnemonic Sounds like "Glue Tonic": Mixing LLMs with Student Feedback to Make Mnemonic Learning Stick
Nishant Balepur, Matthew Shu, Alexander Hoyle, Alison Robey, Shi Feng, Seraphina Goldfarb-Tarrant, Jordan Boyd-Graber
21 Jun 2024

Hybrid Alignment Training for Large Language Models
Chenglong Wang, Hang Zhou, Kaiyan Chang, Bei Li, Yongyu Mu, Tong Xiao, Tongran Liu, Jingbo Zhu
21 Jun 2024

Timo: Towards Better Temporal Reasoning for Language Models
Zhaochen Su, Jun Zhang, Tong Zhu, Xiaoye Qu, Juntao Li, Min Zhang, Yu Cheng
LRM
20 Jun 2024

Aligning Large Language Models with Diverse Political Viewpoints
Dominik Stammbach, Philine Widmer, Eunjung Cho, Çağlar Gülçehre, Elliott Ash
20 Jun 2024

What Teaches Robots to Walk, Teaches Them to Trade too -- Regime Adaptive Execution using Informed Data and LLMs
Raeid Saqur
20 Jun 2024

Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback
Zhirui Chen, Vincent Y. F. Tan
OffRL
18 Jun 2024

Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level
Jie Liu, Zhanhui Zhou, Jiaheng Liu, Xingyuan Bu, Chao Yang, Han-Sen Zhong, Wanli Ouyang
17 Jun 2024

Measuring memorization in RLHF for code completion
Aneesh Pappu, Billy Porter, Ilia Shumailov, Jamie Hayes
17 Jun 2024

BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM
Zhewen Shen, Aditya Joshi, Ruey-Cheng Chen
CLL
17 Jun 2024

A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models
Haopeng Zhang, Philip S. Yu, Jiawei Zhang
17 Jun 2024

Aligning Large Language Models from Self-Reference AI Feedback with one General Principle
Rong Bao, Rui Zheng, Shihan Dou, Xiao Wang, Enyu Zhou, Bo Wang, Qi Zhang, Liang Ding, Dacheng Tao
ALM
17 Jun 2024

Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
Wenkai Yang, Shiqi Shen, Guangyao Shen, Zhi Gong, Yankai Lin, Ji-Rong Wen
17 Jun 2024

Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis
Yuping Lin, Pengfei He, Han Xu, Yue Xing, Makoto Yamada, Hui Liu, Jiliang Tang
16 Jun 2024