arXiv:2301.11270
Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons
26 January 2023
Banghua Zhu, Jiantao Jiao, Michael I. Jordan
OffRL
Papers citing "Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons"
50 / 147 papers shown
Online Iterative Self-Alignment for Radiology Report Generation
Ting Xiao, Lei Shi, Yang Zhang, HaoFeng Yang, Zhe Wang, Chenjia Bai
2 · 0 · 0 · 17 May 2025

Learning Guarantee of Reward Modeling Using Deep Neural Networks
Yuanhang Luo, Yeheng Ge, Ruijian Han, Guohao Shen
34 · 0 · 0 · 10 May 2025

Semantic Probabilistic Control of Language Models
Kareem Ahmed, Catarina G Belém, Padhraic Smyth, Sameer Singh
42 · 0 · 0 · 04 May 2025

Contextual Online Uncertainty-Aware Preference Learning for Human Feedback
Nan Lu, Ethan X. Fang, Junwei Lu
155 · 0 · 0 · 27 Apr 2025

Reinforcement Learning from Multi-level and Episodic Human Feedback
Muhammad Qasim Elahi, Somtochukwu Oguchienti, Maheed H. Ahmed, Mahsa Ghasemi
OffRL · 50 · 0 · 0 · 20 Apr 2025

Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
João Loula, Benjamin LeBrun, Li Du, Ben Lipkin, Clemente Pasti, ..., Ryan Cotterel, Vikash K. Mansinghka, Alexander K. Lew, Tim Vieira, Timothy J. O'Donnell
34 · 2 · 0 · 17 Apr 2025

Active Human Feedback Collection via Neural Contextual Dueling Bandits
Arun Verma, Xiaoqiang Lin, Zhongxiang Dai, Daniela Rus, Bryan Kian Hsiang Low
32 · 0 · 0 · 16 Apr 2025

Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
Kai Ye, Hongyi Zhou, Jin Zhu, Francesco Quinzan, C. Shi
32 · 1 · 0 · 03 Apr 2025

Stochastic Trajectory Prediction under Unstructured Constraints
Hao Ma, Zhiqiang Pu, Shijie Wang, Boyin Liu, Huimu Wang, Yanyan Liang, Jianqiang Yi
63 · 0 · 0 · 18 Mar 2025

Strategyproof Reinforcement Learning from Human Feedback
Thomas Kleine Buening, Jiarui Gan, Debmalya Mandal, Marta Z. Kwiatkowska
52 · 0 · 0 · 13 Mar 2025

Evaluating and Aligning Human Economic Risk Preferences in LLMs
Jiaheng Liu, Yi Yang, Kar Yan Tam
67 · 0 · 0 · 09 Mar 2025

Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
Hyungkyu Kang, Min-hwan Oh
OffRL · 47 · 0 · 0 · 07 Mar 2025

Towards User-level Private Reinforcement Learning with Human Feedback
Jun Zhang, Mingxi Lei, Meng Ding, Mengdi Li, Zihang Xiang, Difei Xu, Jinhui Xu, Di Wang
47 · 0 · 0 · 22 Feb 2025

Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Shicong Cen, Jincheng Mei, Katayoon Goshvadi, Hanjun Dai, Tong Yang, Sherry Yang, Dale Schuurmans, Yuejie Chi, Bo Dai
OffRL · 65 · 23 · 0 · 20 Feb 2025

Nearly Optimal Sample Complexity of Offline KL-Regularized Contextual Bandits under Single-Policy Concentrability
Qingyue Zhao, Kaixuan Ji, Heyang Zhao, Tong Zhang, Q. Gu
OffRL · 45 · 0 · 0 · 09 Feb 2025

Online Clustering of Dueling Bandits
Zhiyong Wang, Jiahang Sun, Mingze Kong, Jize Xie, Qinghua Hu, J. C. Lui, Zhongxiang Dai
83 · 0 · 0 · 04 Feb 2025

Clone-Robust AI Alignment
Ariel D. Procaccia, Benjamin G. Schiffer, Shirley Zhang
35 · 1 · 0 · 17 Jan 2025

Pairwise Comparisons without Stochastic Transitivity: Model, Theory and Applications
Sze Ming Lee, Yunxiao Chen
39 · 0 · 0 · 13 Jan 2025

On the Partial Identifiability in Reward Learning: Choosing the Best Reward
Filippo Lazzati, Alberto Maria Metelli
38 · 0 · 0 · 10 Jan 2025

Preference-Based Multi-Agent Reinforcement Learning: Data Coverage and Algorithmic Techniques
Natalia Zhang, X. Wang, Qiwen Cui, Runlong Zhou, Sham Kakade, Simon S. Du
OffRL · 48 · 0 · 0 · 10 Jan 2025

LEASE: Offline Preference-based Reinforcement Learning with High Sample Efficiency
Xiao-Yin Liu, Guotao Li, Xiao-Hu Zhou, Z. Hou
OffRL · 44 · 0 · 0 · 31 Dec 2024

Online Preference-based Reinforcement Learning with Self-augmented Feedback from Large Language Model
Songjun Tu, Jingbo Sun, Qichao Zhang, Xiangyuan Lan, Dongbin Zhao
75 · 2 · 0 · 22 Dec 2024

Hybrid Preference Optimization for Alignment: Provably Faster Convergence Rates by Combining Offline Preferences with Online Exploration
Avinandan Bose, Zhihan Xiong, Aadirupa Saha, S. Du, Maryam Fazel
76 · 1 · 0 · 13 Dec 2024

Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
Khaoula Chehbouni, Jonathan Colaço-Carr, Yash More, Jackie CK Cheung, G. Farnadi
78 · 0 · 0 · 12 Nov 2024

Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao, Chenlu Ye, Quanquan Gu, Tong Zhang
OffRL · 57 · 3 · 0 · 07 Nov 2024

Sample-Efficient Alignment for LLMs
Zichen Liu, Changyu Chen, Chao Du, Wee Sun Lee, Min-Bin Lin
36 · 3 · 0 · 03 Nov 2024

Active Preference-based Learning for Multi-dimensional Personalization
Minhyeon Oh, Seungjoon Lee, Jungseul Ok
31 · 1 · 0 · 01 Nov 2024

On The Global Convergence Of Online RLHF With Neural Parametrization
Mudit Gaur, Amrit Singh Bedi, Raghu Pasupathy, Vaneet Aggarwal
28 · 0 · 0 · 21 Oct 2024

Mitigating Forgetting in LLM Supervised Fine-Tuning and Preference Learning
H. Fernando, Han Shen, Parikshit Ram, Yi Zhou, Horst Samulowitz, Nathalie Baracaldo, Tianyi Chen
CLL · 59 · 2 · 0 · 20 Oct 2024

A Theoretical Survey on Foundation Models
Shi Fu, Yuzhu Chen, Yingjie Wang, Dacheng Tao
28 · 0 · 0 · 15 Oct 2024

Evolutionary Retrofitting
Mathurin Videau, M. Zameshina, Alessandro Leite, Laurent Najman, Marc Schoenauer, O. Teytaud
38 · 0 · 0 · 15 Oct 2024

Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
Shenao Zhang, Zhihan Liu, Boyi Liu, Wenjie Qu, Yingxiang Yang, Y. Liu, Liyu Chen, Tao Sun, Ziyi Wang
101 · 3 · 0 · 10 Oct 2024

DOPL: Direct Online Preference Learning for Restless Bandits with Preference Feedback
Guojun Xiong, Ujwal Dinesha, Debajoy Mukherjee, Jian Li, Srinivas Shakkottai
42 · 2 · 0 · 07 Oct 2024

Reward Learning From Preference With Ties
Jinsong Liu, Dongdong Ge, Ruihao Zhu
29 · 3 · 0 · 05 Oct 2024

Grounded Answers for Multi-agent Decision-making Problem through Generative World Model
Zeyang Liu, Xinrui Yang, Shiguang Sun, Long Qian, Lipeng Wan, Xingyu Chen, Xuguang Lan
22 · 2 · 0 · 03 Oct 2024

DeFine: Enhancing LLM Decision-Making with Factor Profiles and Analogical Reasoning
Yebowen Hu, Xiaoyang Wang, Wenlin Yao, Yiming Lu, Daoan Zhang, H. Foroosh, Dong Yu, Fei Liu
36 · 4 · 0 · 02 Oct 2024

The Crucial Role of Samplers in Online Direct Preference Optimization
Ruizhe Shi, Runlong Zhou, Simon S. Du
58 · 8 · 0 · 29 Sep 2024

Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference
Qining Zhang, Lei Ying
OffRL · 37 · 2 · 0 · 25 Sep 2024

Beyond Preferences in AI Alignment
Tan Zhi-Xuan, Micah Carroll, Matija Franklin, Hal Ashton
41 · 16 · 0 · 30 Aug 2024

Critique-out-Loud Reward Models
Zachary Ankner, Mansheej Paul, Brandon Cui, Jonathan D. Chang, Prithviraj Ammanabrolu
ALM · LRM · 40 · 28 · 0 · 21 Aug 2024

Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
Heewoong Choi, Sangwon Jung, Hongjoon Ahn, Taesup Moon
OffRL · 44 · 2 · 0 · 08 Aug 2024

Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift
Seongho Son, William Bankes, Sayak Ray Chowdhury, Brooks Paige, Ilija Bogunovic
42 · 4 · 0 · 26 Jul 2024

Conversational Dueling Bandits in Generalized Linear Models
Shuhua Yang, Hui Yuan, Xiaoying Zhang, Mengdi Wang, Hong Zhang, Huazheng Wang
41 · 1 · 0 · 26 Jul 2024

Neural Dueling Bandits: Preference-Based Optimization with Human Feedback
Arun Verma, Zhongxiang Dai, Xiaoqiang Lin, Patrick Jaillet, K. H. Low
37 · 5 · 0 · 24 Jul 2024

BAPO: Base-Anchored Preference Optimization for Personalized Alignment in Large Language Models
Gihun Lee, Minchan Jeong, Yujin Kim, Hojung Jung, Jaehoon Oh, Sangmook Kim, Se-Young Yun
35 · 1 · 0 · 30 Jun 2024

Preference Elicitation for Offline Reinforcement Learning
Alizée Pace, Bernhard Schölkopf, Gunnar Rätsch, Giorgia Ramponi
OffRL · 69 · 1 · 0 · 26 Jun 2024

Bandits with Preference Feedback: A Stackelberg Game Perspective
Barna Pásztor, Parnian Kassraie, Andreas Krause
40 · 2 · 0 · 24 Jun 2024

Robust Reinforcement Learning from Corrupted Human Feedback
Alexander Bukharin, Ilgee Hong, Haoming Jiang, Zichong Li, Qingru Zhang, Zixuan Zhang, Tuo Zhao
41 · 4 · 0 · 21 Jun 2024

Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback
Zhirui Chen, Vincent Y. F. Tan
OffRL · 46 · 1 · 0 · 18 Jun 2024

Is poisoning a real threat to LLM alignment? Maybe more so than you think
Pankayaraj Pathmanathan, Souradip Chakraborty, Xiangyu Liu, Yongyuan Liang, Furong Huang
AAML · 48 · 13 · 0 · 17 Jun 2024