Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning

25 January 2023
Jiatai Huang
Yan Dai
Longbo Huang

Papers citing "Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning"

7 / 7 papers shown
Capacity-Constrained Online Learning with Delays: Scheduling Frameworks and Regret Trade-offs
Alexander Ryabchenko
Idan Attias
Daniel M. Roy
CLL
34
0
0
25 Mar 2025
Scale-free Adversarial Reinforcement Learning
Mingyu Chen
Xuezhou Zhang
82
2
0
01 Mar 2024
Improved Algorithms for Adversarial Bandits with Unbounded Losses
Mingyu Chen
Xuezhou Zhang
17
3
0
03 Oct 2023
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Dirk van der Hoeven
Lukas Zierahn
Tal Lancewicki
Aviv A. Rosenberg
Nicolò Cesa-Bianchi
21
4
0
15 May 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
27
8
0
03 Feb 2023
Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback
Yan Dai
Haipeng Luo
Liyu Chen
66
19
0
26 May 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
74
21
0
31 Jan 2022