2006.05606
Cited By
Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition
10 June 2020
Tiancheng Jin
Haipeng Luo
Papers citing
"Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition"
21 / 21 papers shown
A Model Selection Approach for Corruption Robust Reinforcement Learning
Chen-Yu Wei
Christoph Dann
Julian Zimmert
99
44
0
31 Dec 2024
uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs
Yu Chen
Jiatai Huang
Yan Dai
Longbo Huang
101
0
0
04 Oct 2024
LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits
Masahiro Kato
Shinji Ito
83
0
0
05 Mar 2024
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
Chen Ye
Wei Xiong
Quanquan Gu
Tong Zhang
89
31
0
12 Dec 2022
A Closer Look at Small-loss Bounds for Bandits with Graph Feedback
Chung-Wei Lee
Haipeng Luo
Mengxiao Zhang
29
24
0
02 Feb 2020
Learning Adversarial MDPs with Bandit Feedback and Unknown Transition
Chi Jin
Tiancheng Jin
Haipeng Luo
S. Sra
Tiancheng Yu
43
103
0
03 Dec 2019
Corruption-robust exploration in episodic reinforcement learning
Thodoris Lykouris
Max Simchowitz
Aleksandrs Slivkins
Wen Sun
41
105
0
20 Nov 2019
Introduction to Online Convex Optimization
Elad Hazan
OffRL
90
1,922
0
07 Sep 2019
Equipping Experts/Bandits with Long-term Memory
Kai Zheng
Haipeng Luo
Ilias Diakonikolas
Liwei Wang
OffRL
36
15
0
30 May 2019
Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs
Max Simchowitz
Kevin Jamieson
50
144
0
09 May 2019
Better Algorithms for Stochastic Bandits with Adversarial Corruptions
Anupam Gupta
Tomer Koren
Kunal Talwar
AAML
56
152
0
22 Feb 2019
Bandit Principal Component Analysis
W. Kotłowski
Gergely Neu
29
17
0
08 Feb 2019
Improved Path-length Regret Bounds for Bandits
Sébastien Bubeck
Yuanzhi Li
Haipeng Luo
Chen-Yu Wei
54
46
0
29 Jan 2019
Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously
Julian Zimmert
Haipeng Luo
Chen-Yu Wei
59
81
0
25 Jan 2019
Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits
Julian Zimmert
Yevgeny Seldin
AAML
46
177
0
19 Jul 2018
Stochastic bandits robust to adversarial corruptions
Thodoris Lykouris
Vahab Mirrokni
R. Leme
AAML
73
203
0
25 Mar 2018
More Adaptive Algorithms for Adversarial Bandits
Chen-Yu Wei
Haipeng Luo
79
181
0
10 Jan 2018
Sparsity, variance and curvature in multi-armed bandits
Sébastien Bubeck
Michael B. Cohen
Yuanzhi Li
69
60
0
03 Nov 2017
Minimax Regret Bounds for Reinforcement Learning
M. G. Azar
Ian Osband
Rémi Munos
65
771
0
16 Mar 2017
An Improved Parametrization and Analysis of the EXP3++ Algorithm for Stochastic and Adversarial Bandits
Yevgeny Seldin
Gábor Lugosi
33
92
0
20 Feb 2017
An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits
P. Auer
Chao-Kai Chiang
22
110
0
27 May 2016