ResearchTrend.AI
An Instrumental Variable Approach to Confounded Off-Policy Evaluation

29 December 2022
Yang Xu
Jin Zhu
C. Shi
Shuang Luo
R. Song
    OffRL
arXiv: 2212.14468

Papers citing "An Instrumental Variable Approach to Confounded Off-Policy Evaluation"

50 / 57 papers shown
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
72
72
0
13 Dec 2022
Off-Policy Evaluation for Episodic Partially Observable Markov Decision Processes under Non-Parametric Models
Rui Miao
Zhengling Qi
Xiaoke Zhang
OffRL
56
10
0
21 Sep 2022
Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes
Zuyue Fu
Zhengling Qi
Zhaoran Wang
Zhuoran Yang
Yanxun Xu
Michael R. Kosorok
OffRL
76
17
0
18 Sep 2022
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
Masatoshi Uehara
Haruka Kiyohara
Andrew Bennett
Victor Chernozhukov
Nan Jiang
Nathan Kallus
C. Shi
Wen Sun
OffRL
60
18
0
26 Jul 2022
Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process
C. Shi
Jin Zhu
Ye Shen
Shuang Luo
Hong Zhu
R. Song
OffRL
83
34
0
22 Feb 2022
Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model
Haruka Kiyohara
Yuta Saito
Tatsuya Matsuhiro
Yusuke Narita
N. Shimizu
Yasuo Yamamoto
OffRL
66
42
0
03 Feb 2022
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
Xiaohong Chen
Zhengling Qi
OffRL
58
34
0
17 Jan 2022
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes
C. Shi
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
49
24
0
12 Nov 2021
Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes
Andrew Bennett
Nathan Kallus
OffRL
39
43
0
28 Oct 2021
Off-Policy Evaluation in Partially Observed Markov Decision Processes under Sequential Ignorability
Yupeng Tang
Seung-seob Lee
OffRL
81
25
0
24 Oct 2021
A Spectral Approach to Off-Policy Evaluation for POMDPs
Yash Nair
Nan Jiang
OffRL
36
18
0
22 Sep 2021
Deep Proxy Causal Learning and its Application to Confounded Bandit Policy Evaluation
Liyuan Xu
Heishiro Kanagawa
Arthur Gretton
CML
36
36
0
07 Jun 2021
Deeply-Debiased Off-Policy Interval Estimation
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
38
38
0
10 May 2021
Estimating and Improving Dynamic Treatment Regimes With a Time-Varying Instrumental Variable
Shuxiao Chen
B. Zhang
70
20
0
15 Apr 2021
Causal Inference Under Unmeasured Confounding With Negative Controls: A Minimax Learning Approach
Nathan Kallus
Xiaojie Mao
Masatoshi Uehara
CML
55
67
0
25 Mar 2021
Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning
Luofeng Liao
Zuyue Fu
Zhuoran Yang
Yixin Wang
Mladen Kolar
Zhaoran Wang
OffRL
62
35
0
19 Feb 2021
RL for Latent MDPs: Regret Guarantees and a Lower Bound
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
44
78
0
09 Feb 2021
Bootstrapping Fitted Q-Evaluation for Off-Policy Inference
Botao Hao
X. Ji
Yaqi Duan
Hao Lu
Csaba Szepesvári
Mengdi Wang
OffRL
32
40
0
06 Feb 2021
Semiparametric proximal causal inference
Yifan Cui
Hongming Pu
Xu Shi
Wang Miao
E. T. Tchetgen Tchetgen
36
102
0
17 Nov 2020
Deep Jump Learning for Off-Policy Evaluation in Continuous Treatment Settings
Hengrui Cai
C. Shi
R. Song
Wenbin Lu
OffRL
30
13
0
29 Oct 2020
CoinDICE: Off-Policy Confidence Interval Estimation
Bo Dai
Ofir Nachum
Yinlam Chow
Lihong Li
Csaba Szepesvári
Dale Schuurmans
OffRL
41
87
0
22 Oct 2020
Accountable Off-Policy Evaluation With Kernel Bellman Statistics
Yihao Feng
Tongzheng Ren
Ziyang Tang
Qiang Liu
OffRL
77
44
0
15 Aug 2020
Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders
Andrew Bennett
Nathan Kallus
Lihong Li
Ali Mousavi
OffRL
49
43
0
27 Jul 2020
Batch Policy Learning in Average Reward Markov Decision Processes
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
Susan Murphy
OffRL
74
84
0
23 Jul 2020
Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
Chi Jin
Sham Kakade
A. Krishnamurthy
Qinghua Liu
74
65
0
22 Jun 2020
Provably Efficient Causal Reinforcement Learning with Confounded Observational Data
Lingxiao Wang
Zhuoran Yang
Zhaoran Wang
OffRL
49
45
0
22 Jun 2020
Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding
Hongseok Namkoong
Ramtin Keramati
Steve Yadlowsky
Emma Brunskill
OffRL
111
64
0
12 Mar 2020
GenDICE: Generalized Offline Estimation of Stationary Values
Ruiyi Zhang
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
136
173
0
21 Feb 2020
Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning
Nathan Kallus
Angela Zhou
OffRL
57
59
0
11 Feb 2020
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
C. Shi
Runzhe Wan
R. Song
Wenbin Lu
Ling Leng
43
38
0
05 Feb 2020
Statistical Inference of the Value Function for Reinforcement Learning in Infinite Horizon Settings
C. Shi
Shengyao Zhang
W. Lu
R. Song
OffRL
26
87
0
13 Jan 2020
Off-Policy Estimation of Long-Term Average Outcomes with Applications to Mobile Health
Peng Liao
P. Klasnja
Susan Murphy
OffRL
46
68
0
30 Dec 2019
A semiparametric instrumental variable approach to optimal treatment regimes under endogeneity
Yifan Cui
E. T. Tchetgen Tchetgen
CML
43
65
0
21 Nov 2019
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
101
186
0
28 Oct 2019
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
OffRL
97
68
0
16 Oct 2019
Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
49
91
0
12 Sep 2019
Off-Policy Evaluation in Partially Observable Environments
Guy Tennenholtz
Shie Mannor
Uri Shalit
OffRL
43
86
0
09 Sep 2019
Importance Resampling for Off-policy Prediction
M. Schlegel
Wesley Chung
Daniel Graves
Jian Qian
Martha White
OffRL
42
41
0
11 Jun 2019
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling
Tengyang Xie
Yifei Ma
Yu Wang
OffRL
86
181
0
08 Jun 2019
Batch Policy Learning under Constraints
Hoang Minh Le
Cameron Voloshin
Yisong Yue
OffRL
45
328
0
20 Mar 2019
Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
OffRL
103
354
0
29 Oct 2018
Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods
Deirdre Quillen
Eric Jang
Ofir Nachum
Chelsea Finn
Julian Ibarz
Sergey Levine
OOD
OffRL
58
203
0
28 Feb 2018
More Robust Doubly Robust Off-policy Evaluation
Mehrdad Farajtabar
Yinlam Chow
Mohammad Ghavamzadeh
OffRL
56
267
0
10 Feb 2018
Causal Effect Inference with Deep Latent-Variable Models
Christos Louizos
Uri Shalit
Joris Mooij
David Sontag
R. Zemel
Max Welling
CML
BDL
156
739
0
24 May 2017
Consistent On-Line Off-Policy Evaluation
Assaf Hallak
Shie Mannor
OffRL
59
93
0
23 Feb 2017
Estimating Dynamic Treatment Regimes in Mobile Health Using V-learning
Daniel J. Luckett
Eric B. Laber
A. Kahkoska
D. Maahs
E. Mayer‐Davis
Michael R. Kosorok
41
137
0
10 Nov 2016
A PAC RL Algorithm for Episodic POMDPs
Z. Guo
Shayan Doroudi
Emma Brunskill
67
56
0
25 May 2016
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
254
573
0
04 Apr 2016
Reinforcement Learning of POMDPs using Spectral Methods
Kamyar Azizzadenesheli
A. Lazaric
Anima Anandkumar
37
127
0
25 Feb 2016
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
Nan Jiang
Lihong Li
OffRL
153
621
0
11 Nov 2015