arXiv:2009.01325
Learning to summarize from human feedback
2 September 2020
Nisan Stiennon, Long Ouyang, Jeff Wu, Daniel M. Ziegler, Ryan J. Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano [ALM]
Papers citing "Learning to summarize from human feedback" (50 of 1,443 shown)
Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation
Guillaume Huguet, James Vuckovic, Kilian Fatras, Eric Thibodeau-Laufer, Pablo Lemos, ..., Jarrid Rector-Brooks, Tara Akhound-Sadegh, Michael M. Bronstein, Alexander Tong, A. Bose (30 May 2024)

Group Robust Preference Optimization in Reward-free RLHF
Shyam Sundhar Ramesh, Yifan Hu, Iason Chaimalas, Viraj Mehta, Pier Giuseppe Sessa, Haitham Bou-Ammar, Ilija Bogunovic (30 May 2024)

TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models
Chen Zhang, Chengguang Tang, Dading Chong, Ke Shi, Guohua Tang, Feng Jiang, Haizhou Li (30 May 2024)

InstructionCP: A fast approach to transfer Large Language Models into target language
Kuang-Ming Chen, Hung-yi Lee (30 May 2024) [CLL]

Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
Shenghuan Sun, Gregory M. Goldgof, Alexander Schubert, Zhiqing Sun, Thomas Hartvigsen, A. Butte, Ahmed Alaa (29 May 2024) [LM&MA]

One-Shot Safety Alignment for Large Language Models via Optimal Dualization
Xinmeng Huang, Shuo Li, Yan Sun, Osbert Bastani, Hamed Hassani, Dongsheng Ding (29 May 2024)

Preference Learning Algorithms Do Not Learn Preference Rankings
Angelica Chen, Sadhika Malladi, Lily H. Zhang, Xinyi Chen, Qiuyi Zhang, Rajesh Ranganath, Kyunghyun Cho (29 May 2024)
Participation in the age of foundation models
Harini Suresh, Emily Tseng, Meg Young, Mary L. Gray, Emma Pierson, Karen Levy (29 May 2024)

Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models
Zhanhui Zhou, Zhixuan Liu, Jie Liu, Zhichen Dong, Chao Yang, Yu Qiao (29 May 2024) [ALM]

Language Generation with Strictly Proper Scoring Rules
Chenze Shao, Fandong Meng, Yijin Liu, Jie Zhou (29 May 2024)

T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
Jiachen Li, Weixi Feng, Tsu-Jui Fu, Xinyi Wang, Sugato Basu, Wenhu Chen, William Y. Wang (29 May 2024) [VGen]

Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation
Fengshuo Bai, Rui Zhao, Hongming Zhang, Sijia Cui, Ying Wen, Yaodong Yang, Bo Xu, Lei Han (29 May 2024) [OffRL]

Robust Preference Optimization through Reward Model Distillation
Adam Fisch, Jacob Eisenstein, Vicky Zayats, Alekh Agarwal, Ahmad Beirami, Chirag Nagpal, Peter Shaw, Jonathan Berant (29 May 2024)

Decoding moral judgement from text: a pilot study
Diana E. Gherman, Thorsten O. Zander (28 May 2024)
QUEST: Quality-Aware Metropolis-Hastings Sampling for Machine Translation
Gonçalo R. A. Faria, Sweta Agrawal, António Farinhas, Ricardo Rei, José G. C. de Souza, André F. T. Martins (28 May 2024)

Multi-modal Generation via Cross-Modal In-Context Learning
Amandeep Kumar, Muzammal Naseer, Sanath Narayan, Rao Muhammad Anwer, Salman Khan, Hisham Cholakkal (28 May 2024) [MLLM]

Aligning to Thousands of Preferences via System Message Generalization
Seongyun Lee, Sue Hyun Park, Seungone Kim, Minjoon Seo (28 May 2024) [ALM]

The Evolution of Multimodal Model Architectures
S. Wadekar, Abhishek Chaurasia, Aman Chadha, Eugenio Culurciello (28 May 2024)

Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment
Jiaxiang Li, Siliang Zeng, Hoi-To Wai, Chenliang Li, Alfredo García, Mingyi Hong (28 May 2024)

Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
Yuanpu Cao, Tianrong Zhang, Bochuan Cao, Ziyi Yin, Lu Lin, Fenglong Ma, Jinghui Chen (28 May 2024) [LLMSV]
Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales
Ju-Seung Byun, Andrew Perrault (27 May 2024)

Revision Matters: Generative Design Guided by Revision Edits
Tao Li, Chin-Yi Cheng, Amber Xie, Gang Li, Yang Li (27 May 2024)

Triple Preference Optimization: Achieving Better Alignment with Less Data in a Single Step Optimization
Amir Saeidi, Shivanshu Verma, Aswin Rrv, Chitta Baral (26 May 2024)

RLSF: Reinforcement Learning via Symbolic Feedback
Piyush Jha, Prithwish Jana, Arnav Arora, Vijay Ganesh (26 May 2024) [LRM]

Multi-Reference Preference Optimization for Large Language Models
Hung Le, Quan Tran, D. Nguyen, Kien Do, Saloni Mittal, Kelechi Ogueji, Svetha Venkatesh (26 May 2024)

MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time
Jikun Kang, Xin Zhe Li, Xi Chen, Amirreza Kazemi, Qianyi Sun, ..., Xu He, Quan He, Feng Wen, Jianye Hao, Jun Yao (25 May 2024) [LRM, ReLM]

InstructPatentGPT: Training patent language models to follow instructions with human feedback
Jieh-Sheng Lee (25 May 2024) [ALM]
Incremental Comprehension of Garden-Path Sentences by Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention
Andrew Li, Xianle Feng, Siddhant Narang, Austin Peng, Tianle Cai, Raj Sanjay Shah, Sashank Varma (25 May 2024) [LRM]

Learning Generalizable Human Motion Generator with Reinforcement Learning
Yunyao Mao, Xiaoyang Liu, Wen-gang Zhou, Zhenbo Lu, Houqiang Li (24 May 2024)

Direct Preference Optimization With Unobserved Preference Heterogeneity
Keertana Chidambaram, Karthik Vinay Seetharaman, Vasilis Syrgkanis (23 May 2024)

RE-Adapt: Reverse Engineered Adaptation of Large Language Models
William Fleshman, Benjamin Van Durme (23 May 2024) [VLM]

Axioms for AI Alignment from Human Feedback
Luise Ge, Daniel Halpern, Evi Micha, Ariel D. Procaccia, Itai Shapira, Yevgeniy Vorobeychik, Junlin Wu (23 May 2024)

SimPO: Simple Preference Optimization with a Reference-Free Reward
Yu Meng, Mengzhou Xia, Danqi Chen (23 May 2024)

Defining error accumulation in ML atmospheric simulators
R. Parthipan, Mohit Anand, Hannah M. Christensen, J. S. Hosking, Damon J. Wischik (23 May 2024)
Multi-turn Reinforcement Learning from Preference Human Feedback
Lior Shani, Aviv Rosenberg, Asaf B. Cassel, Oran Lang, Daniele Calandriello, ..., Bilal Piot, Idan Szpektor, Avinatan Hassidim, Yossi Matias, Rémi Munos (23 May 2024)

Reinforcing Language Agents via Policy Optimization with Action Decomposition
Muning Wen, Bo Liu, Weinan Zhang, Jun Wang, Ying Wen (23 May 2024)

Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast
Chufan Shi, Cheng Yang, Xinyu Zhu, Jiahao Wang, Taiqiang Wu, Siheng Li, Deng Cai, Yujiu Yang, Yu Meng (23 May 2024) [MoE]

Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition
Chan-Jan Hsu, Yi-Chang Chen, Feng-Ting Liao, Pei-Chen Ho, Yu-Hsiang Wang, Po-Chun Hsu, Da-Shan Shiu (23 May 2024)

Annotation-Efficient Preference Optimization for Language Model Alignment
Yuu Jinnai, Ukyo Honda (22 May 2024)

LIRE: listwise reward enhancement for preference alignment
Mingye Zhu, Yi Liu, Lei Zhang, Junbo Guo, Zhendong Mao (22 May 2024)
Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity
Rheeya Uppaal, Apratim De, Yiting He, Yiquao Zhong, Junjie Hu (22 May 2024)

Can We Treat Noisy Labels as Accurate?
Yuxiang Zheng, Zhongyi Han, Yilong Yin, Xin Gao, Tongliang Liu (21 May 2024)

Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents
San Kim, Gary Geunbae Lee (21 May 2024) [AAML]

SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling
Xingzhou Lou, Junge Zhang, Jian Xie, Lifeng Liu, Dong Yan, Kaiqi Huang (21 May 2024)

Tiny Refinements Elicit Resilience: Toward Efficient Prefix-Model Against LLM Red-Teaming
Jiaxu Liu, Xiangyu Yin, Sihao Wu, Jianhong Wang, Meng Fang, Xinping Yi, Xiaowei Huang (21 May 2024)

A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback
Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar, P. Parrilo (20 May 2024) [OffRL]

OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Jian Hu, Xibin Wu, Weixun Wang, OpenLLMAI Team, Dehao Zhang, Yu Cao (20 May 2024) [AI4CE, VLM]
Hummer: Towards Limited Competitive Preference Dataset
Li Jiang, Yusen Wu, Junwu Xiong, Jingqing Ruan, Yichuan Ding, Qingpei Guo, Zujie Wen, Jun Zhou, Xiaotie Deng (19 May 2024)

Learning from Imperfect Human Feedback: a Tale from Corruption-Robust Dueling
Yuwei Cheng, Fan Yao, Xuefeng Liu, Haifeng Xu (18 May 2024)

Automated Multi-level Preference for MLLMs
Mengxi Zhang, Wenhao Wu, Yu Lu, Yuxin Song, Kang Rong, ..., Jianbo Zhao, Fanglong Liu, Yifan Sun, Haocheng Feng, Jingdong Wang (18 May 2024) [MLLM]