Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2209.08666
Cited By
Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes
18 September 2022
Zuyue Fu
Zhengling Qi
Zhaoran Wang
Zhuoran Yang
Yanxun Xu
Michael R. Kosorok
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes"
14 / 14 papers shown
Title
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding
Yuhan Li
Eugene Han
Yifan Hu
Wenzhuo Zhou
Zhengling Qi
Yifan Cui
Ruoqing Zhu
OffRL
141
0
0
01 May 2025
Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
Shuguang Yu
Shuxing Fang
Ruixin Peng
Zhengling Qi
Fan Zhou
C. Shi
CML
OffRL
82
1
0
08 Dec 2024
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
Zhihan Liu
Miao Lu
Shenao Zhang
Boyi Liu
Hongyi Guo
Yingxiang Yang
Jose H. Blanchet
Zhaoran Wang
48
43
0
26 May 2024
Learning Decision Policies with Instrumental Variables through Double Machine Learning
Daqian Shao
Ashkan Soleymani
Francesco Quinzan
Marta Z. Kwiatkowska
36
1
0
14 May 2024
Functional Bilevel Optimization for Machine Learning
Ieva Petrulionyte
Julien Mairal
Michael Arbel
51
2
0
29 Mar 2024
Robust Offline Reinforcement learning with Heavy-Tailed Rewards
Jin Zhu
Runzhe Wan
Zhengling Qi
S. Luo
C. Shi
OffRL
35
0
0
28 Oct 2023
PASTA: Pessimistic Assortment Optimization
Juncheng Dong
Weibin Mo
Zhengling Qi
Cong Shi
Ethan X. Fang
Vahid Tarokh
OffRL
15
2
0
08 Feb 2023
Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous Unobserved Confounders
David Bruns-Smith
Angela Zhou
OffRL
18
9
0
01 Feb 2023
STEEL: Singularity-aware Reinforcement Learning
Xiaohong Chen
Zhengling Qi
Runzhe Wan
OffRL
27
2
0
30 Jan 2023
An Instrumental Variable Approach to Confounded Off-Policy Evaluation
Yang Xu
Jin Zhu
C. Shi
S. Luo
R. Song
OffRL
21
14
0
29 Dec 2022
Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information
Zuyue Fu
Zhengling Qi
Zhuoran Yang
Zhaoran Wang
Lan Wang
OffRL
20
0
0
23 Dec 2022
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
36
67
0
13 Dec 2022
Offline Policy Evaluation and Optimization under Confounding
Chinmaya Kausik
Yangyi Lu
Kevin Tan
Maggie Makar
Yixin Wang
Ambuj Tewari
OffRL
23
8
0
29 Nov 2022
Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach
Yunzhe Zhou
Zhengling Qi
C. Shi
Lexin Li
OffRL
10
8
0
26 Oct 2022
1