2102.00490
Cited By
Online Markov Decision Processes with Aggregate Bandit Feedback
31 January 2021
Alon Cohen
Haim Kaplan
Tomer Koren
Yishay Mansour
OffRL
Papers citing
"Online Markov Decision Processes with Aggregate Bandit Feedback"
4 / 4 papers shown
Title
Online Episodic Convex Reinforcement Learning
B. Moreno
Khaled Eldowa
Pierre Gaillard
Margaux Brégère
Nadia Oudjane
OffRL
34
0
0
12 May 2025
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Asaf B. Cassel
Haipeng Luo
Aviv A. Rosenberg
Dmitry Sotnikov
OffRL
36
3
0
13 May 2024
Dynamic Regret of Online Markov Decision Processes
Peng Zhao
Longfei Li
Zhi-Hua Zhou
OffRL
47
17
0
26 Aug 2022
Kernel-based methods for bandit convex optimization
Sébastien Bubeck
Ronen Eldan
Y. Lee
89
164
0
11 Jul 2016