Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback

26 May 2022
Yan Dai
Haipeng Luo
Liyu Chen

Papers citing "Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback"

15 / 15 papers shown
Hybrid Real- and Complex-valued Neural Network Architecture
Alex Young
L. V. Fiorio
Bo Yang
B. Karanov
Wim J. van Houtum
Ronald M. Aarts
31
0
0
04 Apr 2025
State-free Reinforcement Learning
Mingyu Chen
Aldo Pacchiano
Xuezhou Zhang
66
0
0
27 Sep 2024
Random Latent Exploration for Deep Reinforcement Learning
Srinath Mahankali
Zhang-Wei Hong
Ayush Sekhari
Alexander Rakhlin
Pulkit Agrawal
33
3
0
18 Jul 2024
Scale-free Adversarial Reinforcement Learning
Mingyu Chen
Xuezhou Zhang
82
2
0
01 Mar 2024
Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback
Canzhe Zhao
Ruofeng Yang
Baoxiang Wang
Xuezhou Zhang
Shuai Li
27
2
0
14 Nov 2023
Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback
Haolin Liu
Chen-Yu Wei
Julian Zimmert
22
6
0
17 Oct 2023
Online Resource Allocation in Episodic Markov Decision Processes
Duksang Lee
William Overman
Dabeen Lee
37
1
0
18 May 2023
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Dirk van der Hoeven
Lukas Zierahn
Tal Lancewicki
Aviv A. Rosenberg
Nicolò Cesa-Bianchi
21
4
0
15 May 2023
Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs
Kaixuan Ji
Qingyue Zhao
Jiafan He
Weitong Zhang
Q. Gu
52
4
0
15 May 2023
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback
Tal Lancewicki
Aviv A. Rosenberg
Dmitry Sotnikov
29
3
0
13 May 2023
Oracle-Efficient Smoothed Online Learning for Piecewise Continuous Decision Making
Adam Block
Alexander Rakhlin
Max Simchowitz
40
4
0
10 Feb 2023
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning
Jiatai Huang
Yan Dai
Longbo Huang
13
6
0
25 Jan 2023
Online Prediction in Sub-linear Space
Binghui Peng
Fred Zhang
23
16
0
16 Jul 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
74
21
0
31 Jan 2022
Learning in Online MDPs: Is there a Price for Handling the Communicating Case?
Gautam Chandrasekaran
Ambuj Tewari
20
1
0
03 Nov 2021