ResearchTrend.AI
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning

4 October 2023
Weidong Liu
Jiyuan Tu
Yichen Zhang
Xi Chen
    OffRL

Papers citing "Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning"

23 / 23 papers shown
Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation
Hongyi Zhou
Josiah P. Hanna
Jin Zhu
Ying Yang
Chengchun Shi
OffRL
40
0
0
28 May 2025
Variance-aware robust reinforcement learning with linear function approximation under heavy-tailed rewards
Xiang Li
Qiang Sun
58
9
0
09 Mar 2023
Post Reinforcement Learning Inference
Vasilis Syrgkanis
Ruohan Zhan
OffRL
8
2
0
17 Feb 2023
Testing Stationarity and Change Point Detection in Reinforcement Learning
Mengbing Li
C. Shi
Zhanghua Wu
Piotr Fryzlewicz
OffRL
85
9
0
03 Mar 2022
Optimal and instance-dependent guarantees for Markovian linear stochastic approximation
Wenlong Mou
A. Pananjady
Martin J. Wainwright
Peter L. Bartlett
59
23
0
23 Dec 2021
Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning
Pratik Ramprasad
Yuantong Li
Zhuoran Yang
Zhaoran Wang
W. Sun
Guang Cheng
OffRL
104
28
0
08 Aug 2021
Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits
Ruohan Zhan
Vitor Hadad
David A. Hirshberg
Susan Athey
OffRL
77
62
0
03 Jun 2021
Statistical Inference with M-Estimators on Adaptively Collected Data
Kelly W. Zhang
Lucas Janson
Susan Murphy
OffRL
55
43
0
29 Apr 2021
Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds
Yihao Feng
Ziyang Tang
Na Zhang
Qiang Liu
OffRL
52
14
0
09 Mar 2021
Berry–Esseen Bounds for Multivariate Nonlinear Statistics with Applications to M-estimators and Stochastic Gradient Descent Algorithms
Q. Shao
Zhuohui Zhang
42
24
0
09 Feb 2021
Bootstrapping Fitted Q-Evaluation for Off-Policy Inference
Botao Hao
X. Ji
Yaqi Duan
Hao Lu
Csaba Szepesvári
Mengdi Wang
OffRL
46
40
0
06 Feb 2021
On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning
Alain Durmus
Eric Moulines
A. Naumov
S. Samsonov
Hoi-To Wai
80
19
0
30 Jan 2021
CoinDICE: Off-Policy Confidence Interval Estimation
Bo Dai
Ofir Nachum
Yinlam Chow
Lihong Li
Csaba Szepesvári
Dale Schuurmans
OffRL
54
87
0
22 Oct 2020
Statistical Inference for Online Decision Making via Stochastic Gradient Descent
Haoyu Chen
Wenbin Lu
R. Song
OffRL
119
27
0
14 Oct 2020
Statistical Inference for Online Decision-Making: In a Contextual Bandit Setting
Haoyu Chen
Wenbin Lu
R. Song
OffRL
61
30
0
14 Oct 2020
Statistical Inference of the Value Function for Reinforcement Learning in Infinite Horizon Settings
C. Shi
Shengyao Zhang
W. Lu
R. Song
OffRL
62
87
0
13 Jan 2020
Adaptive Huber Regression on Markov-dependent Data
Jianqing Fan
Yongyi Guo
Bai Jiang
36
13
0
18 Apr 2019
Reinforcement Learning with Perturbed Rewards
Jingkang Wang
Yang Liu
Yue Liu
NoLa
61
129
0
02 Oct 2018
Accurate Inference for Adaptive Linear Models
Y. Deshpande
Lester W. Mackey
Vasilis Syrgkanis
Matt Taddy
OffRL
94
62
0
18 Dec 2017
Adaptive Huber Regression
Qiang Sun
Wen-Xin Zhou
Jianqing Fan
162
281
0
21 Jun 2017
Reinforcement Learning with a Corrupted Reward Channel
Tom Everitt
Victoria Krakovna
Laurent Orseau
Marcus Hutter
Shane Legg
90
104
0
23 May 2017
Statistical Inference for Model Parameters in Stochastic Gradient Descent
Xi Chen
Jason D. Lee
Xin T. Tong
Yichen Zhang
67
139
0
27 Oct 2016
A Stochastic Quasi-Newton Method for Large-Scale Optimization
R. Byrd
Samantha Hansen
J. Nocedal
Y. Singer
ODL
108
471
0
27 Jan 2014