2201.13172
Cited By
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
31 January 2022
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
Papers citing
"Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback"
22 / 22 papers shown
Title
Biased Dueling Bandits with Stochastic Delayed Feedback
Bongsoo Yi
Yue Kang
Yao Li
38
1
0
26 Aug 2024
Non-stochastic Bandits With Evolving Observations
Yogev Bar-On
Yishay Mansour
27
1
0
27 May 2024
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Asaf B. Cassel
Haipeng Luo
Aviv A. Rosenberg
Dmitry Sotnikov
OffRL
31
3
0
13 May 2024
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Guojun Xiong
Jian Li
20
1
0
02 May 2024
Scale-free Adversarial Reinforcement Learning
Mingyu Chen
Xuezhou Zhang
82
2
0
01 Mar 2024
Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation
Nikki Lijing Kuang
Ming Yin
Mengdi Wang
Yu-Xiang Wang
Yian Ma
24
6
0
29 Oct 2023
A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with Robustness to Excessive Delays
Saeed Masoudian
Julian Zimmert
Yevgeny Seldin
39
3
0
21 Aug 2023
Statistical Inference on Multi-armed Bandits with Delayed Feedback
Lei Shi
Jingshen Wang
Tianhao Wu
22
4
0
03 Jul 2023
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Dirk van der Hoeven
Lukas Zierahn
Tal Lancewicki
Aviv A. Rosenberg
Nicolò Cesa-Bianchi
19
4
0
15 May 2023
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback
Tal Lancewicki
Aviv A. Rosenberg
Dmitry Sotnikov
29
3
0
13 May 2023
Reinforcement Learning with Delayed, Composite, and Partially Anonymous Reward
Washim Uddin Mondal
Vaneet Aggarwal
43
2
0
04 May 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
27
8
0
03 Feb 2023
Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation
Uri Sherman
Tomer Koren
Yishay Mansour
32
12
0
30 Jan 2023
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning
Jiatai Huang
Yan Dai
Longbo Huang
11
6
0
25 Jan 2023
Multi-Agent Reinforcement Learning with Reward Delays
Yuyang Zhang
Runyu Zhang
Yu Gu
Na Li
18
8
0
02 Dec 2022
Dynamical Linear Bandits
Marco Mussi
Alberto Maria Metelli
Marcello Restelli
38
2
0
16 Nov 2022
Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback
Yan Dai
Haipeng Luo
Liyu Chen
63
19
0
26 May 2022
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs
Ian A. Kash
L. Reyzin
Zishun Yu
31
0
0
18 May 2022
Cooperative Online Learning in Stochastic and Adversarial MDPs
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
63
3
0
31 Jan 2022
Nonstochastic Bandits with Composite Anonymous Feedback
Nicolò Cesa-Bianchi
Tommaso Cesari
Roberto Colomboni
Claudio Gentile
Yishay Mansour
108
39
0
06 Dec 2021
Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays
Jiatai Huang
Yan Dai
Longbo Huang
AI4CE
19
2
0
26 Oct 2021
Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
95
23
0
17 Feb 2021