Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning

25 January 2023

Papers citing "Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning"

Title
Capacity-Constrained Online Learning with Delays: Scheduling Frameworks and Regret Trade-offs | Alexander Ryabchenko, Idan Attias, Daniel M. Roy | CLL | 34 0 0 | 25 Mar 2025
Scale-free Adversarial Reinforcement Learning | Mingyu Chen, Xuezhou Zhang | - | 82 2 0 | 01 Mar 2024
Improved Algorithms for Adversarial Bandits with Unbounded Losses | Mingyu Chen, Xuezhou Zhang | - | 17 3 0 | 03 Oct 2023
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs | Dirk van der Hoeven, Lukas Zierahn, Tal Lancewicki, Aviv A. Rosenberg, Nicolò Cesa-Bianchi | - | 21 4 0 | 15 May 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback | Yunchang Yang, Hangshi Zhong, Tianhao Wu, B. Liu, Liwei Wang, S. Du | OffRL | 27 8 0 | 03 Feb 2023
Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback | Yan Dai, Haipeng Luo, Liyu Chen | - | 66 19 0 | 26 May 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback | Tiancheng Jin, Tal Lancewicki, Haipeng Luo, Yishay Mansour, Aviv A. Rosenberg | - | 74 21 0 | 31 Jan 2022