ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2201.13172
  4. Cited By
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback

Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback

31 January 2022
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
ArXivPDFHTML

Papers citing "Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback"

22 / 22 papers shown
Title
Biased Dueling Bandits with Stochastic Delayed Feedback
Biased Dueling Bandits with Stochastic Delayed Feedback
Bongsoo Yi
Yue Kang
Yao Li
38
1
0
26 Aug 2024
Non-stochastic Bandits With Evolving Observations
Non-stochastic Bandits With Evolving Observations
Yogev Bar-On
Yishay Mansour
27
1
0
27 May 2024
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Asaf B. Cassel
Haipeng Luo
Aviv A. Rosenberg
Dmitry Sotnikov
OffRL
31
3
0
13 May 2024
Provably Efficient Reinforcement Learning for Adversarial Restless
  Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Guojun Xiong
Jian Li
20
1
0
02 May 2024
Scale-free Adversarial Reinforcement Learning
Scale-free Adversarial Reinforcement Learning
Mingyu Chen
Xuezhou Zhang
82
2
0
01 Mar 2024
Posterior Sampling with Delayed Feedback for Reinforcement Learning with
  Linear Function Approximation
Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation
Nikki Lijing Kuang
Ming Yin
Mengdi Wang
Yu-Xiang Wang
Yian Ma
24
6
0
29 Oct 2023
A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with
  Robustness to Excessive Delays
A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with Robustness to Excessive Delays
Saeed Masoudian
Julian Zimmert
Yevgeny Seldin
39
3
0
21 Aug 2023
Statistical Inference on Multi-armed Bandits with Delayed Feedback
Statistical Inference on Multi-armed Bandits with Delayed Feedback
Lei Shi
Jingshen Wang
Tianhao Wu
22
4
0
03 Jul 2023
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial
  Semi-Bandits, Linear Bandits, and MDPs
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Dirk van der Hoeven
Lukas Zierahn
Tal Lancewicki
Aviv A. Rosenberg
Nicolò Cesa-Bianchi
19
4
0
15 May 2023
Delay-Adapted Policy Optimization and Improved Regret for Adversarial
  MDP with Delayed Bandit Feedback
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback
Tal Lancewicki
Aviv A. Rosenberg
Dmitry Sotnikov
29
3
0
13 May 2023
Reinforcement Learning with Delayed, Composite, and Partially Anonymous
  Reward
Reinforcement Learning with Delayed, Composite, and Partially Anonymous Reward
Washim Uddin Mondal
Vaneet Aggarwal
43
2
0
04 May 2023
A Reduction-based Framework for Sequential Decision Making with Delayed
  Feedback
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
27
8
0
03 Feb 2023
Improved Regret for Efficient Online Reinforcement Learning with Linear
  Function Approximation
Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation
Uri Sherman
Tomer Koren
Yishay Mansour
32
12
0
30 Jan 2023
Banker Online Mirror Descent: A Universal Approach for Delayed Online
  Bandit Learning
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning
Jiatai Huang
Yan Dai
Longbo Huang
11
6
0
25 Jan 2023
Multi-Agent Reinforcement Learning with Reward Delays
Multi-Agent Reinforcement Learning with Reward Delays
Yuyang Zhang
Runyu Zhang
Yu Gu
Na Li
18
8
0
02 Dec 2022
Dynamical Linear Bandits
Dynamical Linear Bandits
Marco Mussi
Alberto Maria Metelli
Marcello Restelli
38
2
0
16 Nov 2022
Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes
  with Bandit Feedback
Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback
Yan Dai
Haipeng Luo
Liyu Chen
63
19
0
26 May 2022
Slowly Changing Adversarial Bandit Algorithms are Efficient for
  Discounted MDPs
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs
Ian A. Kash
L. Reyzin
Zishun Yu
31
0
0
18 May 2022
Cooperative Online Learning in Stochastic and Adversarial MDPs
Cooperative Online Learning in Stochastic and Adversarial MDPs
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
63
3
0
31 Jan 2022
Nonstochastic Bandits with Composite Anonymous Feedback
Nonstochastic Bandits with Composite Anonymous Feedback
Nicolò Cesa-Bianchi
Tommaso Cesari
Roberto Colomboni
Claudio Gentile
Yishay Mansour
108
39
0
06 Dec 2021
Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays
Jiatai Huang
Yan Dai
Longbo Huang
AI4CE
19
2
0
26 Oct 2021
Near-optimal Policy Optimization Algorithms for Learning Adversarial
  Linear Mixture MDPs
Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
95
23
0
17 Feb 2021
1