Online Target Q-learning with Reverse Experience Replay: Efficiently
finding the Optimal Policy for Linear MDPs

16 October 2021

Syomantak Chaudhuri

Dheeraj M. Nagaraj

Praneeth Netrapalli

Papers citing "Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs"

12 / 12 papers shown

Title
Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems Prateek Jain S. Kowshik Dheeraj M. Nagaraj Praneeth Netrapalli OffRL 30 23 0 24 May 2021
Streaming Linear System Identification with Reverse Experience Replay Prateek Jain S. Kowshik Dheeraj M. Nagaraj Praneeth Netrapalli OffRL 48 19 0 10 Mar 2021
Momentum Q-learning with Finite-Sample Convergence Guarantee Bowen Weng Huaqing Xiong Linna Zhao Yingbin Liang Wei Zhang 43 8 0 30 Jul 2020
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model Gen Li Yuting Wei Yuejie Chi Yuxin Chen 88 128 0 26 May 2020
Finite-Time Analysis of Asynchronous Stochastic Approximation and $Q$-Learning Guannan Qu Adam Wierman 50 110 0 01 Feb 2020
Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning Harsh Gupta R. Srikant Lei Ying 51 86 0 14 Jul 2019
Provably Efficient Reinforcement Learning with Linear Function Approximation Chi Jin Zhuoran Yang Zhaoran Wang Michael I. Jordan 86 555 0 11 Jul 2019
Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal Alekh Agarwal Sham Kakade Lin F. Yang OffRL 81 170 0 10 Jun 2019
Finite-Sample Analysis of Nonlinear Stochastic Approximation with Applications in Reinforcement Learning Zaiwei Chen Sheng Zhang Thinh T. Doan John-Paul Clarke S. T. Maguluri 55 59 0 27 May 2019
Stochastic approximation with cone-contractive operators: Sharp $\ell_\infty$-bounds for $Q$-learning Martin J. Wainwright 43 105 0 15 May 2019
Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning R. Srikant Lei Ying 61 252 0 03 Feb 2019
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation Jalaj Bhandari Daniel Russo Raghav Singal 101 339 0 06 Jun 2018
