ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.03393
  4. Cited By
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with
  Marginalized Importance Sampling

Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling

8 June 2019
Tengyang Xie
Yifei Ma
Yu Wang
    OffRL
ArXivPDFHTML

Papers citing "Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling"

50 / 55 papers shown
Title
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
Masanori Nojima
OffRL
42
0
0
02 May 2025
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding
Yuhan Li
Eugene Han
Yifan Hu
Wenzhuo Zhou
Zhengling Qi
Yifan Cui
Ruoqing Zhu
OffRL
239
0
0
01 May 2025
Multiple-policy Evaluation via Density Estimation
Multiple-policy Evaluation via Density Estimation
Yilei Chen
Aldo Pacchiano
I. Paschalidis
OffRL
32
0
0
29 Mar 2024
On the Curses of Future and History in Future-dependent Value Functions
  for Off-policy Evaluation
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
Yuheng Zhang
Nan Jiang
OffRL
31
4
0
22 Feb 2024
When is Agnostic Reinforcement Learning Statistically Tractable?
When is Agnostic Reinforcement Learning Statistically Tractable?
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
37
5
0
09 Oct 2023
Stackelberg Batch Policy Learning
Stackelberg Batch Policy Learning
Wenzhuo Zhou
Annie Qu
OffRL
42
1
0
28 Sep 2023
High-probability sample complexities for policy evaluation with linear
  function approximation
High-probability sample complexities for policy evaluation with linear function approximation
Gen Li
Weichen Wu
Yuejie Chi
Cong Ma
Alessandro Rinaldo
Yuting Wei
OffRL
40
7
0
30 May 2023
HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare
HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare
Ge Gao
Song Ju
Markel Sanz Ausin
Min Chi
OffRL
34
8
0
18 Feb 2023
A Reinforcement Learning Framework for Dynamic Mediation Analysis
A Reinforcement Learning Framework for Dynamic Mediation Analysis
Linjuan Ge
Jitao Wang
C. Shi
Zhanghua Wu
Rui Song
31
5
0
31 Jan 2023
Importance Weighted Actor-Critic for Optimal Conservative Offline
  Reinforcement Learning
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Hanlin Zhu
Paria Rashidinejad
Jiantao Jiao
OffRL
49
16
0
30 Jan 2023
Off-Policy Evaluation for Action-Dependent Non-Stationary Environments
Off-Policy Evaluation for Action-Dependent Non-Stationary Environments
Yash Chandak
Shiv Shankar
Nathaniel D. Bastian
Bruno Castro da Silva
Emma Brunskil
Philip S. Thomas
OffRL
52
6
0
24 Jan 2023
Policy learning "without'' overlap: Pessimism and generalized empirical
  Bernstein's inequality
Policy learning "without'' overlap: Pessimism and generalized empirical Bernstein's inequality
Ying Jin
Zhimei Ren
Zhuoran Yang
Zhaoran Wang
OffRL
40
25
0
19 Dec 2022
A Review of Off-Policy Evaluation in Reinforcement Learning
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
46
69
0
13 Dec 2022
When is Realizability Sufficient for Off-Policy Reinforcement Learning?
When is Realizability Sufficient for Off-Policy Reinforcement Learning?
Andrea Zanette
OffRL
29
14
0
10 Nov 2022
Beyond the Return: Off-policy Function Estimation under User-specified
  Error-measuring Distributions
Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions
Audrey Huang
Nan Jiang
OffRL
62
9
0
27 Oct 2022
On the Reuse Bias in Off-Policy Reinforcement Learning
On the Reuse Bias in Off-Policy Reinforcement Learning
Chengyang Ying
Zhongkai Hao
Xinning Zhou
Hang Su
Dong Yan
Jun Zhu
OffRL
45
3
0
15 Sep 2022
Multi-objective Optimization of Notifications Using Offline
  Reinforcement Learning
Multi-objective Optimization of Notifications Using Offline Reinforcement Learning
Prakruthi Prabhakar
Yiping Yuan
Guangyu Yang
Wensheng Sun
A. Muralidharan
OffRL
28
6
0
07 Jul 2022
Off-Policy Fitted Q-Evaluation with Differentiable Function
  Approximators: Z-Estimation and Inference Theory
Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory
Ruiqi Zhang
Xuezhou Zhang
Chengzhuo Ni
Mengdi Wang
OffRL
40
16
0
10 Feb 2022
Offline Reinforcement Learning for Mobile Notifications
Offline Reinforcement Learning for Mobile Notifications
Yiping Yuan
A. Muralidharan
Preetam Nandy
Miao Cheng
Prakruthi Prabhakar
OffRL
36
9
0
04 Feb 2022
When is Offline Two-Player Zero-Sum Markov Game Solvable?
When is Offline Two-Player Zero-Sum Markov Game Solvable?
Qiwen Cui
S. Du
OffRL
39
29
0
10 Jan 2022
Flexible Option Learning
Flexible Option Learning
Martin Klissarov
Doina Precup
OffRL
41
26
0
06 Dec 2021
SOPE: Spectrum of Off-Policy Estimators
SOPE: Spectrum of Off-Policy Estimators
C. J. Yuan
Yash Chandak
S. Giguere
Philip S. Thomas
S. Niekum
OffRL
57
5
0
06 Nov 2021
False Correlation Reduction for Offline Reinforcement Learning
False Correlation Reduction for Offline Reinforcement Learning
Arvindkumar Krishnakumar
Zuyue Fu
Lingxiao Wang
Zhuoran Yang
Chenjia Bai
Tianyi Zhou
Judy Hoffman
Jing Jiang
OffRL
39
9
0
24 Oct 2021
Optimal policy evaluation using kernel-based temporal difference methods
Optimal policy evaluation using kernel-based temporal difference methods
Yaqi Duan
Mengdi Wang
Martin J. Wainwright
OffRL
30
27
0
24 Sep 2021
Towards optimized actions in critical situations of soccer games with
  deep reinforcement learning
Towards optimized actions in critical situations of soccer games with deep reinforcement learning
Pegah Rahimian
Afshin Oroojlooy
László Toka
23
5
0
14 Sep 2021
State Relevance for Off-Policy Evaluation
State Relevance for Off-Policy Evaluation
S. Shen
Yecheng Ma
Omer Gottesman
Finale Doshi-Velez
OffRL
14
4
0
13 Sep 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement
  Learning
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
34
115
0
19 Aug 2021
Supervised Off-Policy Ranking
Supervised Off-Policy Ranking
Yue Jin
Yue Zhang
Tao Qin
Xudong Zhang
Jian Yuan
Houqiang Li
Tie-Yan Liu
OffRL
34
5
0
03 Jul 2021
Variance-Aware Off-Policy Evaluation with Linear Function Approximation
Variance-Aware Off-Policy Evaluation with Linear Function Approximation
Yifei Min
Tianhao Wang
Dongruo Zhou
Quanquan Gu
OffRL
42
38
0
22 Jun 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in
  Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu Wang
OffRL
36
19
0
13 May 2021
Universal Off-Policy Evaluation
Universal Off-Policy Evaluation
Yash Chandak
S. Niekum
Bruno C. da Silva
Erik Learned-Miller
Emma Brunskill
Philip S. Thomas
OffRL
ELM
39
52
0
26 Apr 2021
Nearly Horizon-Free Offline Reinforcement Learning
Nearly Horizon-Free Offline Reinforcement Learning
Tongzheng Ren
Jialian Li
Bo Dai
S. Du
Sujay Sanghavi
OffRL
32
49
0
25 Mar 2021
Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning
Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning
Yaqi Duan
Chi Jin
Zhiyuan Li
OffRL
36
48
0
25 Mar 2021
Instabilities of Offline RL with Pre-Trained Neural Representation
Instabilities of Offline RL with Pre-Trained Neural Representation
Ruosong Wang
Yifan Wu
Ruslan Salakhutdinov
Sham Kakade
OffRL
22
42
0
08 Mar 2021
Is Pessimism Provably Efficient for Offline RL?
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
27
350
0
30 Dec 2020
Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can
  be Exponentially Harder than Online RL
Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can be Exponentially Harder than Online RL
Andrea Zanette
OffRL
31
71
0
14 Dec 2020
Reliable Off-policy Evaluation for Reinforcement Learning
Reliable Off-policy Evaluation for Reinforcement Learning
Jie Wang
Rui Gao
H. Zha
OffRL
22
11
0
08 Nov 2020
CoinDICE: Off-Policy Confidence Interval Estimation
CoinDICE: Off-Policy Confidence Interval Estimation
Bo Dai
Ofir Nachum
Yinlam Chow
Lihong Li
Csaba Szepesvári
Dale Schuurmans
OffRL
29
84
0
22 Oct 2020
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible
  Off-Policy Evaluation
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
24
73
0
17 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
26
42
0
02 Aug 2020
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation
  for Reinforcement Learning
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning
Ming Yin
Yu Bai
Yu Wang
OffRL
44
31
0
07 Jul 2020
Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic
  Policies
Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies
Nathan Kallus
Masatoshi Uehara
OffRL
16
15
0
06 Jun 2020
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement
  Learning
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
Ali Mousavi
Lihong Li
Qiang Liu
Denny Zhou
OffRL
27
32
0
24 Mar 2020
Off-policy Policy Evaluation For Sequential Decisions Under Unobserved
  Confounding
Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding
Hongseok Namkoong
Ramtin Keramati
Steve Yadlowsky
Emma Brunskill
OffRL
24
63
0
12 Mar 2020
Batch Stationary Distribution Estimation
Batch Stationary Distribution Estimation
Junfeng Wen
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
22
22
0
02 Mar 2020
Off-Policy Deep Reinforcement Learning with Analogous Disentangled
  Exploration
Off-Policy Deep Reinforcement Learning with Analogous Disentangled Exploration
Hoang Trung-Dung
Yitao Liang
Guy Van den Broeck
OffRL
22
3
0
25 Feb 2020
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Yaqi Duan
Mengdi Wang
OffRL
32
149
0
21 Feb 2020
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
Nan Jiang
Jiawei Huang
OffRL
41
17
0
06 Feb 2020
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement
  Learning
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning
Ming Yin
Yu Wang
OffRL
29
80
0
29 Jan 2020
Empirical Study of Off-Policy Policy Evaluation for Reinforcement
  Learning
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning
Cameron Voloshin
Hoang Minh Le
Nan Jiang
Yisong Yue
OffRL
32
152
0
15 Nov 2019
12
Next