Batch Value-function Approximation with Only Realizability

11 August 2020

Papers citing "Batch Value-function Approximation with Only Realizability"

Greedy Algorithm for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure Aleksandrs Slivkins Yunzong Xu Shiliang Zuo 352 1 0 06 Mar 2025
Infinite-Horizon Offline Reinforcement Learning with Linear Function Approximation: Curse of Dimensionality and Algorithm Lin Chen B. Scherrer Peter L. Bartlett OffRL 162 16 0 17 Mar 2021
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints Chi Jin Zhuoran Yang Zhaoran Wang OffRL 200 167 0 06 Jan 2021
Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can be Exponentially Harder than Online RL Andrea Zanette OffRL 115 71 0 14 Dec 2020
A Variant of the Wang-Foster-Kakade Lower Bound for the Discounted Setting Philip Amortila Nan Jiang Tengyang Xie OffRL 78 23 0 02 Nov 2020
What are the Statistical Limits of Offline RL with Linear Function Approximation? Ruosong Wang Dean Phillips Foster Sham Kakade OffRL 117 163 0 22 Oct 2020
Accountable Off-Policy Evaluation With Kernel Bellman Statistics Yihao Feng Tongzheng Ren Ziyang Tang Qiang Liu OffRL 88 44 0 15 Aug 2020
Hyperparameter Selection for Offline Reinforcement Learning T. Paine Cosmin Paduraru Andrea Michi Çağlar Gülçehre Konrad Zolna Alexander Novikov Ziyun Wang Nando de Freitas GP OffRL 130 147 0 17 Jul 2020
Provably Good Batch Reinforcement Learning Without Great Exploration Yao Liu Adith Swaminathan Alekh Agarwal Emma Brunskill OffRL 107 105 0 16 Jul 2020
Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison Tengyang Xie Nan Jiang 89 35 0 09 Mar 2020
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization Nan Jiang Jiawei Huang OffRL 107 17 0 06 Feb 2020
Minimax Weight and Q-Function Learning for Off-Policy Evaluation Masatoshi Uehara Jiawei Huang Nan Jiang OffRL 113 186 0 28 Oct 2019
On Value Functions and the Agent-Environment Boundary Nan Jiang OffRL 82 21 0 30 May 2019
Information-Theoretic Considerations in Batch Reinforcement Learning Jinglin Chen Nan Jiang OOD OffRL 120 373 0 01 May 2019
Off-Policy Policy Gradient with State Distribution Correction Yao Liu Adith Swaminathan Alekh Agarwal Emma Brunskill OffRL 111 67 0 17 Apr 2019
Off-Policy Deep Reinforcement Learning without Exploration Scott Fujimoto David Meger Doina Precup OffRL BDL 183 1,586 0 07 Dec 2018
Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation Qiang Liu Lihong Li Ziyang Tang Dengyong Zhou OffRL 121 354 0 29 Oct 2018
Contextual Decision Processes with Low Bellman Rank are PAC-Learnable Nan Jiang A. Krishnamurthy Alekh Agarwal John Langford Robert Schapire 113 417 0 29 Oct 2016
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning Nan Jiang Lihong Li OffRL 167 621 0 11 Nov 2015