arXiv:1809.03084
Efficient Counterfactual Learning from Bandit Feedback
10 September 2018
Yusuke Narita
Shota Yasui
Kohei Yata
OffRL
Papers citing "Efficient Counterfactual Learning from Bandit Feedback"
7 / 7 papers shown
Title
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
46
69
0
13 Dec 2022
Offline Policy Optimization with Eligible Actions
Yao Liu
Yannis Flet-Berliac
Emma Brunskill
OffRL
31
5
0
01 Jul 2022
Robust On-Policy Sampling for Data-Efficient Policy Evaluation in Reinforcement Learning
Rujie Zhong
Duohan Zhang
Lukas Schäfer
Stefano V. Albrecht
Josiah P. Hanna
OOD
OffRL
15
12
0
29 Nov 2021
Dynamic Selection in Algorithmic Decision-making
Jin Li
Ye Luo
Xiaowei Zhang
31
2
0
28 Aug 2021
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
24
73
0
17 Aug 2020
Reducing Sampling Error in Batch Temporal Difference Learning
Brahma S. Pavse
Ishan Durugkar
Josiah P. Hanna
Peter Stone
OffRL
25
12
0
15 Aug 2020
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
24
54
0
09 Jun 2019