Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

15 November 2019
Cameron Voloshin
Hoang Minh Le
Nan Jiang
Yisong Yue
    OffRL

Papers citing "Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning"

32 / 32 papers shown
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
Masanori Nojima
OffRL
37
0
0
02 May 2025
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
Pai Liu
Lingfeng Zhao
Shivangi Agarwal
Jinghan Liu
Audrey Huang
P. Amortila
Nan Jiang
OODD
OffRL
101
0
0
11 Feb 2025
Cross-Validated Off-Policy Evaluation
Matej Cief
B. Kveton
Michal Kompan
OffRL
20
1
0
24 May 2024
Conservative Exploration for Policy Optimization via Off-Policy Policy Evaluation
Paul Daoudi
Mathias Formoso
Othman Gaizi
Achraf Azize
Evrard Garcelon
OffRL
23
0
0
24 Dec 2023
Towards Real-World Applications of Personalized Anesthesia Using Policy Constraint Q Learning for Propofol Infusion Control
Xiuding Cai
Jiao Chen
Yaoyao Zhu
Beiming Wang
Yu Yao
OffRL
36
5
0
17 Mar 2023
Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments
Vincent Liu
Yash Chandak
Philip S. Thomas
Martha White
OffRL
14
0
0
23 Feb 2023
HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare
Ge Gao
Song Ju
Markel Sanz Ausin
Min Chi
OffRL
29
8
0
18 Feb 2023
Safe Evaluation For Offline Learning: Are We Ready To Deploy?
Hager Radi
Josiah P. Hanna
Peter Stone
Matthew E. Taylor
OffRL
ELM
31
0
0
16 Dec 2022
Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions
Audrey Huang
Nan Jiang
OffRL
51
9
0
27 Oct 2022
Sustainable Online Reinforcement Learning for Auto-bidding
Zhiyu Mou
Yusen Huo
Rongquan Bai
Mingzhou Xie
Chuan Yu
Jian Xu
Bo Zheng
OffRL
OnRL
32
15
0
13 Oct 2022
Multi-Task Fusion via Reinforcement Learning for Long-Term User Satisfaction in Recommender Systems
Qihua Zhang
Junning Liu
Yuzhuo Dai
Yiyan Qi
Yifan Yuan
Kunlun Zheng
Fan Huang
Xianfeng Tan
OffRL
19
50
0
09 Aug 2022
Performative Reinforcement Learning
Debmalya Mandal
Stelios Triantafyllou
Goran Radanović
30
17
0
30 Jun 2022
Off-Policy Evaluation with Online Adaptation for Robot Exploration in Challenging Environments
Yafei Hu
Junyi Geng
Chen Wang
John Keller
Sebastian Scherer
OffRL
22
15
0
07 Apr 2022
Model-Free and Model-Based Policy Evaluation when Causality is Uncertain
David Bruns-Smith
CML
ELM
OffRL
24
12
0
02 Apr 2022
ReVar: Strengthening Policy Evaluation via Reduced Variance Sampling
Subhojyoti Mukherjee
Josiah P. Hanna
Robert D. Nowak
OffRL
16
12
0
09 Mar 2022
Offline Deep Reinforcement Learning for Dynamic Pricing of Consumer Credit
Raad Khraishi
Ramin Okhrati
OffRL
14
5
0
06 Mar 2022
Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL
36
9
0
23 Feb 2022
Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error
Scott Fujimoto
D. Meger
Doina Precup
Ofir Nachum
S. Gu
30
32
0
28 Jan 2022
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Chao-Han Huck Yang
Zhengling Qi
Yifan Cui
Pin-Yu Chen
OffRL
21
4
0
29 Nov 2021
SOPE: Spectrum of Off-Policy Estimators
C. J. Yuan
Yash Chandak
S. Giguere
Philip S. Thomas
S. Niekum
OffRL
50
5
0
06 Nov 2021
Off-Policy Evaluation in Partially Observed Markov Decision Processes under Sequential Ignorability
Yupeng Tang
Seung-seob Lee
OffRL
52
22
0
24 Oct 2021
Safe Autonomous Racing via Approximate Reachability on Ego-vision
Bingqing Chen
Jonathan M Francis
Jean Oh
Eric Nyberg
Sylvia L. Herbert
56
14
0
14 Oct 2021
Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings
Shengpu Tang
Jenna Wiens
OffRL
26
78
0
23 Jul 2021
Supervised Off-Policy Ranking
Yue Jin
Yue Zhang
Tao Qin
Xudong Zhang
Jian Yuan
Houqiang Li
Tie-Yan Liu
OffRL
32
5
0
03 Jul 2021
Offline RL Without Off-Policy Evaluation
David Brandfonbrener
William F. Whitney
Rajesh Ranganath
Joan Bruna
OffRL
42
161
0
16 Jun 2021
On Instrumental Variable Regression for Deep Offline Policy Evaluation
Yutian Chen
Liyuan Xu
Çağlar Gülçehre
T. Paine
A. Gretton
Nando de Freitas
Arnaud Doucet
OffRL
39
18
0
21 May 2021
Benchmarks for Deep Off-Policy Evaluation
Justin Fu
Mohammad Norouzi
Ofir Nachum
George Tucker
Ziyun Wang
...
Yutian Chen
Aviral Kumar
Cosmin Paduraru
Sergey Levine
T. Paine
ELM
OffRL
35
100
0
30 Mar 2021
S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning
Samarth Sinha
Ajay Mandlekar
Animesh Garg
OffRL
26
104
0
10 Mar 2021
NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning
Rongjun Qin
Songyi Gao
Xingyuan Zhang
Zhen Xu
Shengkai Huang
Zewen Li
Weinan Zhang
Yang Yu
OffRL
132
78
0
01 Feb 2021
Batch Policy Learning in Average Reward Markov Decision Processes
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
S. Murphy
OffRL
23
81
0
23 Jul 2020
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
Nan Jiang
Jiawei Huang
OffRL
20
17
0
06 Feb 2020
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
Nathan Kallus
Masatoshi Uehara
OffRL
38
181
0
22 Aug 2019