2106.14352
Cited By
Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning
28 June 2021
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
OffRL
Papers citing
"Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning"
15 / 15 papers shown
Title
Stochastic Optimization with Constraints: A Non-asymptotic Instance-Dependent Analysis
K. Khamaru
27
0
0
24 Mar 2024
Optimal Sample Complexity for Average Reward Markov Decision Processes
Shengbo Wang
Jose H. Blanchet
Peter Glynn
38
10
0
13 Oct 2023
MFRL-BI: Design of a Model-free Reinforcement Learning Process Control Scheme by Using Bayesian Inference
Yanrong Li
Juan Du
Wei Jiang
22
0
0
17 Sep 2023
Optimal Sample Complexity of Reinforcement Learning for Mixing Discounted Markov Decision Processes
Shengbo Wang
Jose H. Blanchet
Peter Glynn
31
5
0
15 Feb 2023
Robust Markov Decision Processes without Model Estimation
Wenhao Yang
Hanfengzhai Wang
Tadashi Kozuno
S. Jordan
Zhihua Zhang
24
2
0
02 Feb 2023
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
45
5
0
01 Jun 2022
KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal
Tadashi Kozuno
Wenhao Yang
Nino Vieillard
Toshinori Kitamura
Yunhao Tang
...
Michal Valko
Rémi Munos
Olivier Pietquin
M. Geist
Csaba Szepesvári
107
10
0
27 May 2022
Pessimism for Offline Linear Contextual Bandits using $\ell_p$ Confidence Sets
Gen Li
Cong Ma
Nathan Srebro
OffRL
36
13
0
21 May 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
78
40
0
14 Mar 2022
Instance-Dependent Confidence and Early Stopping for Reinforcement Learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
37
5
0
21 Jan 2022
Optimal variance-reduced stochastic approximation in Banach spaces
Wenlong Mou
K. Khamaru
Martin J. Wainwright
Peter L. Bartlett
Michael I. Jordan
41
8
0
21 Jan 2022
A Statistical Analysis of Polyak-Ruppert Averaged Q-learning
Xiang Li
Wenhao Yang
Jiadong Liang
Zhihua Zhang
Michael I. Jordan
48
15
0
29 Dec 2021
Accelerated and instance-optimal policy evaluation with linear function approximation
Tianjiao Li
Guanghui Lan
A. Pananjady
OffRL
44
13
0
24 Dec 2021
Beyond No Regret: Instance-Dependent PAC Reinforcement Learning
Andrew Wagenmaker
Max Simchowitz
Kevin G. Jamieson
25
34
0
05 Aug 2021
Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity
J.N. Zhang
Hongzhou Lin
Subhro Das
S. Sra
Ali Jadbabaie
16
1
0
08 Jun 2020