ResearchTrend.AI


Online Markov Decision Processes with Aggregate Bandit Feedback
arXiv: 2102.00490

31 January 2021
Alon Cohen
Haim Kaplan
Tomer Koren
Yishay Mansour
    OffRL

Papers citing "Online Markov Decision Processes with Aggregate Bandit Feedback"

4 / 4 papers shown
Online Episodic Convex Reinforcement Learning
B. Moreno
Khaled Eldowa
Pierre Gaillard
Margaux Brégère
Nadia Oudjane
OffRL
34
0
0
12 May 2025
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Asaf B. Cassel
Haipeng Luo
Aviv A. Rosenberg
Dmitry Sotnikov
OffRL
36
3
0
13 May 2024
Dynamic Regret of Online Markov Decision Processes
Peng Zhao
Longfei Li
Zhi-Hua Zhou
OffRL
47
17
0
26 Aug 2022
Kernel-based methods for bandit convex optimization
Sébastien Bubeck
Ronen Eldan
Y. Lee
89
164
0
11 Jul 2016