2404.10776
Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback
16 April 2024
Qiwei Di
Jiafan He
Quanquan Gu
Papers citing
"Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback"
38 / 38 papers shown
Title
A Model Selection Approach for Corruption Robust Reinforcement Learning
Chen-Yu Wei
Christoph Dann
Julian Zimmert
109
45
0
31 Dec 2024
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback
Arun Verma
Zhongxiang Dai
Xiaoqiang Lin
Patrick Jaillet
K. H. Low
138
5
0
24 Jul 2024
Learning from Imperfect Human Feedback: a Tale from Corruption-Robust Dueling
Yuwei Cheng
Fan Yao
Xuefeng Liu
Haifeng Xu
75
1
0
18 May 2024
Feel-Good Thompson Sampling for Contextual Dueling Bandits
Xuheng Li
Heyang Zhao
Quanquan Gu
63
13
0
09 Apr 2024
Best-of-Both-Worlds Algorithms for Linear Contextual Bandits
Yuko Kuroki
Alberto Rumi
Taira Tsuchiya
Fabio Vitale
Nicolò Cesa-Bianchi
73
7
0
24 Dec 2023
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits
Qiwei Di
Tao Jin
Yue Wu
Heyang Zhao
Farzad Farnoud
Quanquan Gu
64
13
0
02 Oct 2023
Borda Regret Minimization for Generalized Linear Dueling Bandits
Yue Wu
Tao Jin
Hao Lou
Farzad Farnoud
Quanquan Gu
61
11
0
15 Mar 2023
Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons
Banghua Zhu
Jiantao Jiao
Michael I. Jordan
OffRL
77
201
0
26 Jan 2023
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
Chen Ye
Wei Xiong
Quanquan Gu
Tong Zhang
126
31
0
12 Dec 2022
Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions
Jiafan He
Dongruo Zhou
Tong Zhang
Quanquan Gu
87
47
0
13 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
760
12,835
0
04 Mar 2022
Stochastic Contextual Dueling Bandits under Linear Stochastic Transitivity Models
Viktor Bengs
Aadirupa Saha
Eyke Hüllermeier
27
23
0
09 Feb 2022
Jointly Efficient and Optimal Algorithms for Logistic Bandits
Louis Faury
Marc Abeille
Kwang-Sung Jun
Clément Calauzènes
49
20
0
06 Jan 2022
Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability
Aadirupa Saha
A. Krishnamurthy
56
36
0
24 Nov 2021
Linear Contextual Bandits with Adversarial Corruptions
Heyang Zhao
Dongruo Zhou
Quanquan Gu
AAML
70
24
0
25 Oct 2021
Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks
Qin Ding
Cho-Jui Hsieh
James Sharpnack
AAML
46
33
0
05 Jun 2021
Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
Chung-Wei Lee
Haipeng Luo
Chen-Yu Wei
Mengxiao Zhang
Xiaojin Zhang
68
49
0
11 Feb 2021
Robust Policy Gradient against Strong Data Corruption
Xuezhou Zhang
Yiding Chen
Xiaojin Zhu
Wen Sun
AAML
82
38
0
11 Feb 2021
Adversarial Dueling Bandits
Aadirupa Saha
Tomer Koren
Yishay Mansour
60
27
0
27 Oct 2020
Instance-Wise Minimax-Optimal Algorithms for Logistic Bandits
Marc Abeille
Louis Faury
Clément Calauzènes
118
37
0
23 Oct 2020
The Ingredients of Real-World Robotic Reinforcement Learning
Henry Zhu
Justin Yu
Abhishek Gupta
Dhruv Shah
Kristian Hartikainen
Avi Singh
Vikash Kumar
Sergey Levine
OffRL
100
176
0
27 Apr 2020
Improved Optimistic Algorithms for Logistic Bandits
Louis Faury
Marc Abeille
Clément Calauzènes
Olivier Fercoq
70
93
0
18 Feb 2020
Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning
Tianhe Yu
Deirdre Quillen
Zhanpeng He
Ryan Julian
Avnish Narayan
Hayden Shively
Adithya Bellathur
Karol Hausman
Chelsea Finn
Sergey Levine
OffRL
224
1,160
0
24 Oct 2019
Stochastic Linear Optimization with Adversarial Corruption
Yingkai Li
Edmund Y. Lou
Liren Shan
AAML
43
42
0
04 Sep 2019
Better Algorithms for Stochastic Bandits with Adversarial Corruptions
Anupam Gupta
Tomer Koren
Kunal Talwar
AAML
92
152
0
22 Feb 2019
Stochastic bandits robust to adversarial corruptions
Thodoris Lykouris
Vahab Mirrokni
R. Leme
AAML
119
204
0
25 Mar 2018
Approximate Ranking from Pairwise Comparisons
Reinhard Heckel
Max Simchowitz
Kannan Ramchandran
Martin J. Wainwright
52
39
0
04 Jan 2018
An Improved Parametrization and Analysis of the EXP3++ Algorithm for Stochastic and Adversarial Bandits
Yevgeny Seldin
Gábor Lugosi
51
92
0
20 Feb 2017
An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits
P. Auer
Chao-Kai Chiang
53
111
0
27 May 2016
Copeland Dueling Bandit Problem: Regret Lower Bound, Optimal Algorithm, and Computationally Efficient Algorithm
Junpei Komiyama
Junya Honda
Hiroshi Nakagawa
44
39
0
05 May 2016
Double Thompson Sampling for Dueling Bandits
Huasen Wu
Xin Liu
88
87
0
25 Apr 2016
A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits
Pratik Gajane
Tanguy Urvoy
Fabrice Clérot
76
46
0
15 Jan 2016
Regret Lower Bound and Optimal Algorithm in Dueling Bandit Problem
Junpei Komiyama
Junya Honda
H. Kashima
Hiroshi Nakagawa
141
92
0
08 Jun 2015
Copeland Dueling Bandits
M. Zoghi
Zohar Karnin
Shimon Whiteson
Maarten de Rijke
102
89
0
01 Jun 2015
Contextual Dueling Bandits
Miroslav Dudík
Katja Hofmann
Robert Schapire
Aleksandrs Slivkins
M. Zoghi
105
124
0
23 Feb 2015
Sparse Dueling Bandits
Kevin Jamieson
S. Katariya
Atul Deshpande
Robert D. Nowak
193
64
0
31 Jan 2015
Reducing Dueling Bandits to Cardinal Bandits
Nir Ailon
Thorsten Joachims
Zohar Karnin
125
139
0
14 May 2014
Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem
M. Zoghi
Shimon Whiteson
Rémi Munos
Maarten de Rijke
75
143
0
12 Dec 2013