arXiv: 2009.01325
Learning to summarize from human feedback
2 September 2020
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
ALM
Papers citing
"Learning to summarize from human feedback"
50 / 1,440 papers shown
MPO: An Efficient Post-Processing Framework for Mixing Diverse Preference Alignment
Tianze Wang
Dongnan Gui
Yifan Hu
Shuhang Lin
Linjun Zhang
40
0
0
25 Feb 2025
Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
Jaydeep Borkar
Matthew Jagielski
Katherine Lee
Niloofar Mireshghallah
David A. Smith
Christopher A. Choquette-Choo
PILM
83
1
0
24 Feb 2025
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents
Taiyi Wang
Zhihao Wu
Jianheng Liu
Jianye Hao
Jun Wang
Kun Shao
OffRL
44
13
0
24 Feb 2025
Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance
Chenghua Huang
Lu Wang
Fangkai Yang
Pu Zhao
Zechao Li
Qingwei Lin
Dongmei Zhang
Saravan Rajmohan
Qi Zhang
OffRL
57
1
0
24 Feb 2025
Correlating and Predicting Human Evaluations of Language Models from Natural Language Processing Benchmarks
Rylan Schaeffer
Punit Singh Koura
Binh Tang
R. Subramanian
Aaditya K. Singh
...
Vedanuj Goswami
Sergey Edunov
Dieuwke Hupkes
Sanmi Koyejo
Sharan Narang
ALM
71
0
0
24 Feb 2025
Evaluating the Effectiveness of Large Language Models in Automated News Article Summarization
Lionel Richy Panlap Houamegni
Fatih Gedikli
44
0
0
24 Feb 2025
RLTHF: Targeted Human Feedback for LLM Alignment
Yifei Xu
Tusher Chakraborty
Emre Kıcıman
Bibek Aryal
Eduardo Rodrigues
...
Rafael Padilha
Leonardo Nunes
Shobana Balakrishnan
Songwu Lu
Ranveer Chandra
118
1
0
24 Feb 2025
Sequence-level Large Language Model Training with Contrastive Preference Optimization
Zhili Feng
Dhananjay Ram
Cole Hawkins
Aditya Rawal
Jinman Zhao
Sheng Zha
62
0
0
23 Feb 2025
Retrieval-Augmented Fine-Tuning With Preference Optimization For Visual Program Generation
Deokhyung Kang
Jeonghun Cho
Yejin Jeon
Sunbin Jang
Minsub Lee
Jawoon Cho
Gary Geunbae Lee
43
0
0
23 Feb 2025
C-3DPO: Constrained Controlled Classification for Direct Preference Optimization
Kavosh Asadi
Julien Han
Xingzi Xu
Dominique Perrault-Joncas
Shoham Sabach
Karim Bouyarmane
Mohammad Ghavamzadeh
34
0
0
22 Feb 2025
IPO: Your Language Model is Secretly a Preference Classifier
Shivank Garg
Ayush Singh
Shweta Singh
Paras Chopra
163
1
0
22 Feb 2025
Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation
SeongYeub Chu
JongWoo Kim
MunYong Yi
60
3
0
21 Feb 2025
SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Teng Xiao
Yige Yuan
Ziyang Chen
Mingxiao Li
Shangsong Liang
Z. Ren
V. Honavar
100
6
0
21 Feb 2025
Faster WIND: Accelerating Iterative Best-of-N Distillation for LLM Alignment
Tong Yang
Jincheng Mei
H. Dai
Zixin Wen
Shicong Cen
Dale Schuurmans
Yuejie Chi
Bo Dai
45
4
0
20 Feb 2025
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Shicong Cen
Jincheng Mei
Katayoon Goshvadi
Hanjun Dai
Tong Yang
Sherry Yang
Dale Schuurmans
Yuejie Chi
Bo Dai
OffRL
65
23
0
20 Feb 2025
Simplify RLHF as Reward-Weighted SFT: A Variational Method
Yuhao Du
Zehan Li
Pengyu Cheng
Zhihong Chen
Yuejiao Xie
Xiang Wan
Anningzhe Gao
38
1
0
20 Feb 2025
Oreo: A Plug-in Context Reconstructor to Enhance Retrieval-Augmented Generation
Sha Li
Naren Ramakrishnan
RALM
KELM
154
1
0
18 Feb 2025
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees
Yongtao Wu
Luca Viano
Yihang Chen
Zhenyu Zhu
Kimon Antonakopoulos
Quanquan Gu
V. Cevher
54
0
0
18 Feb 2025
S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
Ruotian Ma
Peisong Wang
Cheng Liu
Xingyan Liu
Jiaqi Chen
Bang Zhang
Xin Zhou
Nan Du
Jia Li
LRM
62
2
0
18 Feb 2025
Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?
Xueru Wen
Jie Lou
Yunfan LU
Hongyu Lin
Xing Yu
Xinyu Lu
Xianpei Han
Debing Zhang
Le Sun
ALM
61
5
0
17 Feb 2025
Equilibrate RLHF: Towards Balancing Helpfulness-Safety Trade-off in Large Language Models
Yingshui Tan
Yilei Jiang
Heng Chang
Jiaheng Liu
Xingyuan Bu
Wenbo Su
Xiangyu Yue
Xiaoyong Zhu
Bo Zheng
ALM
84
1
0
17 Feb 2025
A Critical Look At Tokenwise Reward-Guided Text Generation
Ahmad Rashid
Ruotian Wu
Julia Grosse
Agustinus Kristiadi
Pascal Poupart
OffRL
76
0
0
17 Feb 2025
ECG-Expert-QA: A Benchmark for Evaluating Medical Large Language Models in Heart Disease Diagnosis
Xu Wang
Jiaju Kang
Puyu Han
Yubao Zhao
Qian Liu
Liwenfei He
Lingqiong Zhang
Lingyun Dai
Yongcheng Wang
Jie Tao
LM&MA
65
1
0
16 Feb 2025
Bone Soups: A Seek-and-Soup Model Merging Approach for Controllable Multi-Objective Generation
Guofu Xie
Xiao Zhang
Ting Yao
Yunsheng Shi
MoMe
63
1
0
15 Feb 2025
Preference learning made easy: Everything should be understood through win rate
Lily H. Zhang
Rajesh Ranganath
87
0
0
14 Feb 2025
Diffusion Models Through a Global Lens: Are They Culturally Inclusive?
Zahra Bayramli
Ayhan Suleymanzade
Na Min An
Huzama Ahmad
Eunsu Kim
Junyeong Park
James Thorne
Alice H. Oh
91
0
0
13 Feb 2025
AuPair: Golden Example Pairs for Code Repair
Aditi Mavalankar
Hassan Mansoor
Zita Marinho
Masha Samsikova
Tom Schaul
KELM
LRM
137
0
0
12 Feb 2025
DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization
Xuefeng Liu
Songhao Jiang
Siyu Chen
Zhuoran Yang
Yuxin Chen
Ian Foster
Rick L. Stevens
LM&MA
OffRL
58
0
0
11 Feb 2025
Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning
Jean Vassoyan
Nathanaël Beau
Roman Plaud
OffRL
106
1
0
10 Feb 2025
AI Alignment at Your Discretion
Maarten Buyl
Hadi Khalaf
C. M. Verdun
Lucas Monteiro Paes
Caio Vieira Machado
Flavio du Pin Calmon
45
0
0
10 Feb 2025
Effective Black-Box Multi-Faceted Attacks Breach Vision Large Language Model Guardrails
Yijun Yang
L. Wang
Xiao Yang
Lanqing Hong
Jun Zhu
AAML
66
0
0
09 Feb 2025
Refining Positive and Toxic Samples for Dual Safety Self-Alignment of LLMs with Minimal Human Interventions
Jingxin Xu
Guoshun Nan
Sheng Guan
Sicong Leng
Yong-Jin Liu
Zixiao Wang
Yuyang Ma
Zhili Zhou
Yanzhao Hou
Xiaofeng Tao
LM&MA
60
0
0
08 Feb 2025
Design Considerations in Offline Preference-based RL
Alekh Agarwal
Christoph Dann
T. V. Marinov
OffRL
56
0
0
08 Feb 2025
Enhancing Knowledge Graph Construction: Evaluating with Emphasis on Hallucination, Omission, and Graph Similarity Metrics
Hussam Ghanem
C. Cruz
63
0
0
07 Feb 2025
Mirror Descent Actor Critic via Bounded Advantage Learning
Ryo Iwaki
93
0
0
06 Feb 2025
Aero-LLM: A Distributed Framework for Secure UAV Communication and Intelligent Decision-Making
Balakrishnan Dharmalingam
Rajdeep Mukherjee
Brett Piggott
Guohuan Feng
Anyi Liu
49
1
0
05 Feb 2025
CTR-Driven Advertising Image Generation with Multimodal Large Language Models
Xingye Chen
Wei Feng
Zhenbang Du
Weizhen Wang
Yuxiao Chen
...
Jingping Shao
Yuanjie Shao
Xinge You
Changxin Gao
Nong Sang
OffRL
47
2
0
05 Feb 2025
Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge
Daniel Tamayo
Aitor Gonzalez-Agirre
Javier Hernando
Marta Villegas
KELM
93
3
0
04 Feb 2025
Agent-Based Uncertainty Awareness Improves Automated Radiology Report Labeling with an Open-Source Large Language Model
Hadas Ben-Atya
N. Gavrielov
Zvi Badash
G. Focht
R. Cytter-Kuint
Talar Hagopian
Dan Turner
M. Freiman
66
0
0
02 Feb 2025
RLS3: RL-Based Synthetic Sample Selection to Enhance Spatial Reasoning in Vision-Language Models for Indoor Autonomous Perception
Joshua R. Waite
Md Zahid Hasan
Qisai Liu
Zhanhong Jiang
Chinmay Hegde
S. Sarkar
OffRL
SyDa
179
1
0
31 Jan 2025
A Three-Branch Checks-and-Balances Framework for Context-Aware Ethical Alignment of Large Language Models
Edward Y. Chang
AILaw
58
0
0
31 Jan 2025
The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking
Yuchun Miao
Sen Zhang
Liang Ding
Yuqi Zhang
Lefei Zhang
Dacheng Tao
81
3
0
31 Jan 2025
Learning to Summarize from LLM-generated Feedback
Hwanjun Song
Taewon Yun
Yuho Lee
Jihwan Oh
Gihun Lee
Jason (Jinglun) Cai
Hang Su
73
3
0
28 Jan 2025
Faster Machine Translation Ensembling with Reinforcement Learning and Competitive Correction
Kritarth Prasad
Mohammadi Zaki
Pratik Rakesh Singh
Pankaj Wasnik
33
0
0
28 Jan 2025
Controllable Protein Sequence Generation with LLM Preference Optimization
Xiangyu Liu
Yi Liu
Silei Chen
Wei Hu
36
0
0
28 Jan 2025
Beyond correlation: The Impact of Human Uncertainty in Measuring the Effectiveness of Automatic Evaluation and LLM-as-a-Judge
Aparna Elangovan
Jongwoo Ko
Lei Xu
Mahsa Elyasi
Ling Liu
S. Bodapati
Dan Roth
52
6
0
28 Jan 2025
LiPO: Listwise Preference Optimization through Learning-to-Rank
Tianqi Liu
Zhen Qin
Junru Wu
Jiaming Shen
Misha Khalman
...
Mohammad Saleh
Simon Baumgartner
Jialu Liu
Peter J. Liu
Xuanhui Wang
141
49
0
28 Jan 2025
Reference-free Evaluation Metrics for Text Generation: A Survey
Takumi Ito
Kees van Deemter
Jun Suzuki
ELM
41
2
0
21 Jan 2025
RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs
Jiaxing Wu
Lin Ning
Luyang Liu
Harrison Lee
Neo Wu
Chao Wang
Sushant Prakash
S. O’Banion
Bradley Green
Jun Xie
71
1
0
20 Jan 2025
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators
Yinhong Liu
Han Zhou
Zhijiang Guo
Ehsan Shareghi
Ivan Vulić
Anna Korhonen
Nigel Collier
ALM
135
69
0
20 Jan 2025