2301.10500
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning
25 January 2023
Jiatai Huang
Yan Dai
Longbo Huang
Papers citing
"Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning"
7 / 7 papers shown
Title
Capacity-Constrained Online Learning with Delays: Scheduling Frameworks and Regret Trade-offs
Alexander Ryabchenko
Idan Attias
Daniel M. Roy
CLL
34
0
0
25 Mar 2025
Scale-free Adversarial Reinforcement Learning
Mingyu Chen
Xuezhou Zhang
82
2
0
01 Mar 2024
Improved Algorithms for Adversarial Bandits with Unbounded Losses
Mingyu Chen
Xuezhou Zhang
17
3
0
03 Oct 2023
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Dirk van der Hoeven
Lukas Zierahn
Tal Lancewicki
Aviv A. Rosenberg
Nicolò Cesa-Bianchi
21
4
0
15 May 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
27
8
0
03 Feb 2023
Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback
Yan Dai
Haipeng Luo
Liyu Chen
66
19
0
26 May 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
74
21
0
31 Jan 2022