2010.00587
v3 (latest)
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
1 October 2020
Jiafan He
Dongruo Zhou
Quanquan Gu
Papers citing
"Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs"
14 / 14 papers shown
Title
Neural Logistic Bandits
Seoungbin Bae
Dabeen Lee
527
0
0
04 May 2025
Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty
Guanlin Liu
Zhihan Zhou
Han Liu
Lifeng Lai
61
2
0
15 Jul 2023
Optimistic Planning by Regularized Dynamic Programming
Antoine Moulin
Gergely Neu
68
4
0
27 Feb 2023
Online Reinforcement Learning with Uncertain Episode Lengths
Debmalya Mandal
Goran Radanović
Jiarui Gan
Adish Singla
R. Majumdar
OffRL
70
8
0
07 Feb 2023
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Jiafan He
Heyang Zhao
Dongruo Zhou
Quanquan Gu
OffRL
136
55
0
12 Dec 2022
Multi-armed Bandit Learning on a Graph
Tianpeng Zhang
Kasper Johansson
Na Li
85
6
0
20 Sep 2022
No-regret Learning in Repeated First-Price Auctions with Budget Constraints
Rui Ai
Chang Wang
Chenchen Li
Jinshan Zhang
Wenhan Huang
Xiaotie Deng
67
10
0
29 May 2022
A Review of Safe Reinforcement Learning: Methods, Theory and Applications
Shangding Gu
Longyu Yang
Yali Du
Guang Chen
Florian Walter
Jun Wang
Alois C. Knoll
OffRL
AI4TS
269
258
0
20 May 2022
Provably Efficient Kernelized Q-Learning
Shuang Liu
H. Su
MLT
98
4
0
21 Apr 2022
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
102
54
0
09 Oct 2021
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Primal-Dual Approach
Qinbo Bai
Amrit Singh Bedi
Mridul Agarwal
Alec Koppel
Vaneet Aggarwal
189
60
0
13 Sep 2021
MADE: Exploration via Maximizing Deviation from Explored Regions
Tianjun Zhang
Paria Rashidinejad
Jiantao Jiao
Yuandong Tian
Joseph E. Gonzalez
Stuart J. Russell
OffRL
98
44
0
18 Jun 2021
Reinforcement Learning for Markovian Bandits: Is Posterior Sampling more Scalable than Optimism?
Nicolas Gast
B. Gaujal
K. Khun
95
2
0
16 Jun 2021
Regret Bounds for Discounted MDPs
Shuang Liu
H. Su
OffRL
71
19
0
12 Feb 2020