1511.03722
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
11 November 2015
Nan Jiang
Lihong Li
OffRL
Papers citing
"Doubly Robust Off-policy Value Evaluation for Reinforcement Learning"
50 / 163 papers shown
Title
Automatic Reward Shaping from Confounded Offline Data
Mingxuan Li
Junzhe Zhang
Elias Bareinboim
OffRL
OnRL
39
0
0
16 May 2025
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
Masanori Nojima
OffRL
42
0
0
02 May 2025
Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
74
1
0
22 Feb 2025
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
Pai Liu
Lingfeng Zhao
Shivangi Agarwal
Jinghan Liu
Audrey Huang
Philip Amortila
Nan Jiang
OODD
OffRL
109
0
0
11 Feb 2025
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang
Qirun Dai
Hao Peng
ALM
120
4
0
06 Feb 2025
Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces
Jifeng Hu
Sili Huang
Li Shen
Zhejian Yang
Shengchao Hu
Shisong Tang
Hao Chen
Yi Chang
Dacheng Tao
Lichao Sun
OffRL
44
0
0
21 Oct 2024
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
Claire Chen
Shuze Liu
Shangtong Zhang
OffRL
198
1
0
08 Oct 2024
Doubly Optimal Policy Evaluation for Reinforcement Learning
Shuze Liu
Claire Chen
Shangtong Zhang
OffRL
43
2
0
03 Oct 2024
Balancing Immediate Revenue and Future Off-Policy Evaluation in Coupon Allocation
Naoki Nishimura
Ken Kobayashi
Kazuhide Nakata
OffRL
30
0
0
06 Jul 2024
Short-Long Policy Evaluation with Novel Actions
Hyunji Alex Nam
Yash Chandak
Emma Brunskill
OffRL
29
0
0
04 Jul 2024
Neyman Meets Causal Machine Learning: Experimental Evaluation of Individualized Treatment Rules
Michael Lingzhi Li
Kosuke Imai
CML
36
0
0
25 Apr 2024
Multiple-policy Evaluation via Density Estimation
Yilei Chen
Aldo Pacchiano
I. Paschalidis
OffRL
32
0
0
29 Mar 2024
Spatially Randomized Designs Can Enhance Policy Evaluation
Ying Yang
Chengchun Shi
Fang Yao
Shouyang Wang
Hongtu Zhu
OffRL
47
0
0
18 Mar 2024
Optimizing Language Models for Human Preferences is a Causal Inference Problem
Victoria Lin
Eli Ben-Michael
Louis-Philippe Morency
43
3
0
22 Feb 2024
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
Yuheng Zhang
Nan Jiang
OffRL
29
4
0
22 Feb 2024
Conservative Exploration for Policy Optimization via Off-Policy Policy Evaluation
Paul Daoudi
Mathias Formoso
Othman Gaizi
Achraf Azize
Evrard Garcelon
OffRL
31
0
0
24 Dec 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
37
5
0
09 Oct 2023
Counterfactual Explanation Policies in RL
Shripad Deshmukh
R Srivatsan
Supriti Vijay
Jayakumar Subramanian
Chirag Agarwal
OffRL
39
0
0
25 Jul 2023
Deep Attention Q-Network for Personalized Treatment Recommendation
Simin Ma
Junghwan Lee
N. Serban
Shihao Yang
OffRL
38
5
0
04 Jul 2023
Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding
Alizée Pace
Hugo Yèche
Bernhard Schölkopf
Gunnar Rätsch
Guy Tennenholtz
OffRL
31
6
0
01 Jun 2023
High-probability sample complexities for policy evaluation with linear function approximation
Gen Li
Weichen Wu
Yuejie Chi
Cong Ma
Alessandro Rinaldo
Yuting Wei
OffRL
40
7
0
30 May 2023
Adaptive Policy Learning to Additional Tasks
Wenjian Hao
Zehui Lu
Zihao Liang
Tianyu Zhou
Shaoshuai Mou
37
0
0
24 May 2023
Recent Advances in the Foundations and Applications of Unbiased Learning to Rank
Shashank Gupta
Philipp Hager
Jin Huang
Ali Vardasbi
Harrie Oosterhuis
OffRL
37
5
0
04 May 2023
Correcting for Interference in Experiments: A Case Study at Douyin
Vivek F. Farias
Hao Li
Tianyi Peng
Xinyuyang Ren
B. Hassibi
A. Zheng
41
9
0
04 May 2023
Uncertainty Calibration for Counterfactual Propensity Estimation in Recommendation
Wenbo Hu
Xin Sun
Qiang Liu
Shu Wu
47
0
0
23 Mar 2023
Hallucinated Adversarial Control for Conservative Offline Policy Evaluation
Jonas Rothfuss
Bhavya Sukhija
Tobias Birchler
Parnian Kassraie
Andreas Krause
OffRL
31
10
0
02 Mar 2023
Model-based Constrained MDP for Budget Allocation in Sequential Incentive Marketing
Shuai Xiao
Le Guo
Zaifan Jiang
Lei Lv
Yuanbo Chen
Jun Zhu
Shuang Yang
30
21
0
02 Mar 2023
Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments
Vincent Liu
Yash Chandak
Philip S. Thomas
Martha White
OffRL
24
0
0
23 Feb 2023
Distributional Offline Policy Evaluation with Predictive Error Guarantees
Runzhe Wu
Masatoshi Uehara
Wen Sun
OffRL
40
13
0
19 Feb 2023
HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare
Ge Gao
Song Ju
Markel Sanz Ausin
Min Chi
OffRL
34
8
0
18 Feb 2023
Offline Learning of Closed-Loop Deep Brain Stimulation Controllers for Parkinson Disease Treatment
Qitong Gao
Stephen L. Schmidt
Afsana Chowdhury
Guangyu Feng
Jennifer J. Peters
Katherine Genty
W. Grill
Dennis A. Turner
Miroslav Pajic
OffRL
38
11
0
05 Feb 2023
A Reinforcement Learning Framework for Dynamic Mediation Analysis
Linjuan Ge
Jitao Wang
C. Shi
Zhanghua Wu
Rui Song
31
5
0
31 Jan 2023
Variational Latent Branching Model for Off-Policy Evaluation
Qitong Gao
Ge Gao
Min Chi
Miroslav Pajic
OffRL
41
6
0
28 Jan 2023
Off-Policy Evaluation for Action-Dependent Non-Stationary Environments
Yash Chandak
Shiv Shankar
Nathaniel D. Bastian
Bruno Castro da Silva
Emma Brunskill
Philip S. Thomas
OffRL
52
6
0
24 Jan 2023
Quantile Off-Policy Evaluation via Deep Conditional Generative Learning
Yang Xu
C. Shi
Shuang Luo
Lan Wang
R. Song
OffRL
31
4
0
29 Dec 2022
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
43
69
0
13 Dec 2022
Counterfactual Learning with Multioutput Deep Kernels
A. Caron
G. Baio
I. Manolopoulou
BDL
CML
OffRL
25
1
0
20 Nov 2022
When is Realizability Sufficient for Off-Policy Reinforcement Learning?
Andrea Zanette
OffRL
29
14
0
10 Nov 2022
Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions
Audrey Huang
Nan Jiang
OffRL
60
9
0
27 Oct 2022
Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook
Baihan Lin
OffRL
AI4TS
37
27
0
24 Oct 2022
Local Metric Learning for Off-Policy Evaluation in Contextual Bandits with Continuous Actions
Haanvid Lee
Jongmin Lee
Yunseon Choi
Wonseok Jeon
Byung-Jun Lee
Yung-Kyun Noh
Kee-Eung Kim
OffRL
12
5
0
24 Oct 2022
Causal Inference for De-biasing Motion Estimation from Robotic Observational Data
Junhong Xu
Kai-Li Yin
Jason M. Gregory
Lantao Liu
CML
23
3
0
17 Oct 2022
Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency
Wenlong Mou
Martin J. Wainwright
Peter L. Bartlett
OffRL
43
11
0
26 Sep 2022
On the Reuse Bias in Off-Policy Reinforcement Learning
Chengyang Ying
Zhongkai Hao
Xinning Zhou
Hang Su
Dong Yan
Jun Zhu
OffRL
45
3
0
15 Sep 2022
Entropy Regularization for Population Estimation
Ben Chugg
Peter Henderson
Jacob Goldin
Daniel E. Ho
30
3
0
24 Aug 2022
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
Masatoshi Uehara
Haruka Kiyohara
Andrew Bennett
Victor Chernozhukov
Nan Jiang
Nathan Kallus
C. Shi
Wen Sun
OffRL
34
16
0
26 Jul 2022
Offline Policy Optimization with Eligible Actions
Yao Liu
Yannis Flet-Berliac
Emma Brunskill
OffRL
31
5
0
01 Jul 2022
Federated Offline Reinforcement Learning
D. Zhou
Yufeng Zhang
Aaron Sonabend-W
Zhaoran Wang
Junwei Lu
Tianxi Cai
OffRL
40
13
0
11 Jun 2022
Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality
Ming Yin
Wenjing Chen
Mengdi Wang
Yu Wang
OffRL
32
4
0
10 Jun 2022
Penalized Proximal Policy Optimization for Safe Reinforcement Learning
Linrui Zhang
Li Shen
Long Yang
Shi-Yong Chen
Bo Yuan
Xueqian Wang
Dacheng Tao
18
62
0
24 May 2022