ResearchTrend.AI
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning

4 April 2016
Philip S. Thomas
Emma Brunskill
    OffRL
arXiv:1604.00923

Papers citing "Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning"

50 / 342 papers shown
Off-Policy Risk Assessment in Markov Decision Processes
Audrey Huang
Liu Leqi
Zachary Chase Lipton
Kamyar Azizzadenesheli
OffRL
29
8
0
21 Sep 2022
On the Reuse Bias in Off-Policy Reinforcement Learning
Chengyang Ying
Zhongkai Hao
Xinning Zhou
Hang Su
Dong Yan
Jun Zhu
OffRL
45
3
0
15 Sep 2022
Data-Driven Influence Functions for Optimization-Based Causal Inference
Michael I. Jordan
Yixin Wang
Angela Zhou
47
2
0
29 Aug 2022
Deep Reinforcement Learning for Multi-Agent Interaction
I. Ahmed
Cillian Brewitt
Ignacio Carlucho
Filippos Christianos
Mhairi Dunion
...
Lukas Schäfer
Massimiliano Tamborski
Giuseppe Vecchio
Cheng Wang
Stefano V. Albrecht
DRL
AI4CE
13
11
0
02 Aug 2022
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
Masatoshi Uehara
Haruka Kiyohara
Andrew Bennett
Victor Chernozhukov
Nan Jiang
Nathan Kallus
C. Shi
Wen Sun
OffRL
34
16
0
26 Jul 2022
Policy Optimization with Sparse Global Contrastive Explanations
Jiayu Yao
S. Parbhoo
Weiwei Pan
Finale Doshi-Velez
OffRL
27
2
0
13 Jul 2022
Learning Bellman Complete Representations for Offline Policy Evaluation
Jonathan D. Chang
Kaiwen Wang
Nathan Kallus
Wen Sun
OffRL
29
15
0
12 Jul 2022
Grounding Aleatoric Uncertainty for Unsupervised Environment Design
Minqi Jiang
Michael Dennis
Jack Parker-Holder
Andrei Lupu
Heinrich Küttler
Edward Grefenstette
Tim Rocktäschel
Jakob N. Foerster
48
13
0
11 Jul 2022
Multi-objective Optimization of Notifications Using Offline Reinforcement Learning
Prakruthi Prabhakar
Yiping Yuan
Guangyu Yang
Wensheng Sun
A. Muralidharan
OffRL
28
6
0
07 Jul 2022
A Causal Approach for Business Optimization: Application on an Online Marketplace
Naama Parush
Ohad Levinkron-Fisch
H. Shteingart
Amir Bar Sela
Amir Zilberman
Jake Klein
24
0
0
04 Jul 2022
Offline Policy Optimization with Eligible Actions
Yao Liu
Yannis Flet-Berliac
Emma Brunskill
OffRL
33
5
0
01 Jul 2022
Federated Offline Reinforcement Learning
D. Zhou
Yufeng Zhang
Aaron Sonabend-W
Zhaoran Wang
Junwei Lu
Tianxi Cai
OffRL
40
13
0
11 Jun 2022
Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality
Ming Yin
Wenjing Chen
Mengdi Wang
Yu Wang
OffRL
32
4
0
10 Jun 2022
Markovian Interference in Experiments
Vivek F. Farias
Andrew A. Li
Tianyi Peng
Andrew Zheng
OffRL
17
30
0
06 Jun 2022
Hybrid Value Estimation for Off-policy Evaluation and Offline Reinforcement Learning
Xuefeng Jin
Xu-Hui Liu
Shengyi Jiang
Yang Yu
OffRL
36
4
0
04 Jun 2022
Off-Policy Evaluation with Online Adaptation for Robot Exploration in Challenging Environments
Yafei Hu
Junyi Geng
Chen Wang
John Keller
Sebastian Scherer
OffRL
36
15
0
07 Apr 2022
Model-Free and Model-Based Policy Evaluation when Causality is Uncertain
David Bruns-Smith
CML
ELM
OffRL
24
12
0
02 Apr 2022
Marginalized Operators for Off-policy Reinforcement Learning
Yunhao Tang
Mark Rowland
Rémi Munos
Michal Valko
OffRL
32
0
0
30 Mar 2022
Bellman Residual Orthogonalization for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
OffRL
35
8
0
24 Mar 2022
Towards Data-Efficient Detection Transformers
Wen Wang
Jing Zhang
Yang Cao
Yongliang Shen
Dacheng Tao
ViT
23
59
0
17 Mar 2022
Off-Policy Evaluation in Embedded Spaces
Jaron J. R. Lee
David Arbour
Georgios Theocharous
OffRL
25
3
0
05 Mar 2022
Interpretable Off-Policy Learning via Hyperbox Search
D. Tschernutter
Tobias Hatt
Stefan Feuerriegel
OffRL
CML
50
6
0
04 Mar 2022
Towards Robust Off-policy Learning for Runtime Uncertainty
Da Xu
Yuting Ye
Chuanwei Ruan
Bo Yang
OffRL
33
5
0
27 Feb 2022
Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons
C. Shi
Shuang Luo
Yuan Le
Hongtu Zhu
R. Song
OffRL
OnRL
37
10
0
26 Feb 2022
Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL
40
9
0
23 Feb 2022
Policy Evaluation for Temporal and/or Spatial Dependent Experiments
Shuang Luo
Ying Yang
Chengchun Shi
Fang Yao
Jieping Ye
Hongtu Zhu
48
6
0
22 Feb 2022
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets
C. Shi
Runzhe Wan
Ge Song
Shuang Luo
R. Song
Hongtu Zhu
OffRL
43
6
0
21 Feb 2022
Off-Policy Evaluation for Large Action Spaces via Embeddings
Yuta Saito
Thorsten Joachims
OffRL
33
43
0
13 Feb 2022
Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory
Ruiqi Zhang
Xuezhou Zhang
Chengzhuo Ni
Mengdi Wang
OffRL
40
16
0
10 Feb 2022
Stochastic Gradient Descent with Dependent Data for Offline Reinforcement Learning
Jing-rong Dong
Xin T. Tong
OffRL
40
2
0
06 Feb 2022
Offline Reinforcement Learning for Mobile Notifications
Yiping Yuan
A. Muralidharan
Preetam Nandy
Miao Cheng
Prakruthi Prabhakar
OffRL
36
9
0
04 Feb 2022
Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model
Haruka Kiyohara
Yuta Saito
Tatsuya Matsuhiro
Yusuke Narita
N. Shimizu
Yasuo Yamamoto
OffRL
28
42
0
03 Feb 2022
Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration
Chengzhuo Ni
Ruiqi Zhang
Xiang Ji
Xuezhou Zhang
Mengdi Wang
OffRL
21
1
0
31 Jan 2022
Generalizing Off-Policy Evaluation From a Causal Perspective For Sequential Decision-Making
S. Parbhoo
Shalmali Joshi
Finale Doshi-Velez
ELM
CML
OffRL
19
5
0
20 Jan 2022
Off Environment Evaluation Using Convex Risk Minimization
Pulkit Katdare
Shuijing Liu
Katherine Driggs-Campbell
18
2
0
21 Dec 2021
Ambiguous Dynamic Treatment Regimes: A Reinforcement Learning Approach
S. Saghafian
CML
34
14
0
08 Dec 2021
Generalizing Off-Policy Learning under Sample Selection Bias
Tobias Hatt
D. Tschernutter
Stefan Feuerriegel
OffRL
30
18
0
02 Dec 2021
Robust On-Policy Sampling for Data-Efficient Policy Evaluation in Reinforcement Learning
Rujie Zhong
Duohan Zhang
Lukas Schäfer
Stefano V. Albrecht
Josiah P. Hanna
OOD
OffRL
15
12
0
29 Nov 2021
Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation
Ramtin Keramati
Omer Gottesman
Leo Anthony Celi
Finale Doshi-Velez
Emma Brunskill
OffRL
8
6
0
28 Nov 2021
Case-based off-policy policy evaluation using prototype learning
Anton Matsson
Fredrik D. Johansson
OffRL
14
1
0
22 Nov 2021
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes
C. Shi
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
18
23
0
12 Nov 2021
SOPE: Spectrum of Off-Policy Estimators
C. J. Yuan
Yash Chandak
S. Giguere
Philip S. Thomas
S. Niekum
OffRL
57
5
0
06 Nov 2021
Finding the Optimal Dynamic Treatment Regime Using Smooth Fisher Consistent Surrogate Loss
Nilanjana Laha
Aaron Sonabend-W
Rajarshi Mukherjee
Tianxi Cai
22
1
0
03 Nov 2021
Using Time-Series Privileged Information for Provably Efficient Learning of Prediction Models
R. Karlsson
Martin Willbo
Zeshan Hussain
Rahul G. Krishnan
David Sontag
Fredrik D. Johansson
AI4TS
32
4
0
28 Oct 2021
Off-Policy Evaluation in Partially Observed Markov Decision Processes under Sequential Ignorability
Yupeng Tang
Seung-seob Lee
OffRL
59
22
0
24 Oct 2021
Recursive Causal Structure Learning in the Presence of Latent Variables and Selection Bias
S. Akbari
Ehsan Mokhtarian
AmirEmad Ghassami
Negar Kiyavash
CML
11
24
0
22 Oct 2021
Stateful Offline Contextual Policy Evaluation and Learning
Nathan Kallus
Angela Zhou
OffRL
15
7
0
19 Oct 2021
RL4RS: A Real-World Dataset for Reinforcement Learning based Recommender System
Kai Wang
Zhene Zou
Minghao Zhao
Qilin Deng
Yue Shang
Yile Liang
Runze Wu
Xudong Shen
Tangjie Lyu
Changjie Fan
OffRL
31
9
0
18 Oct 2021
Explaining Off-Policy Actor-Critic From A Bias-Variance Perspective
Ting-Han Fan
Peter J. Ramadge
CML
FAtt
OffRL
21
2
0
06 Oct 2021
Estimating Potential Outcome Distributions with Collaborating Causal Networks
Tianhui Zhou
William E Carson IV
David Carlson
CML
44
7
0
04 Oct 2021