2010.12470
Cited By
A Practical Guide of Off-Policy Evaluation for Bandit Problems
23 October 2020
Masahiro Kato
Kenshi Abe
Kaito Ariu
Shota Yasui
OffRL
Papers citing
"A Practical Guide of Off-Policy Evaluation for Bandit Problems"
16 / 16 papers shown
Title
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
74
74
0
17 Aug 2020
Confidence Interval for Off-Policy Evaluation from Dependent Samples via Bandit Algorithm: Approach from Standardized Martingales
Masahiro Kato
OffRL
17
2
0
12 Jun 2020
Off-Policy Evaluation and Learning for External Validity under a Covariate Shift
Masahiro Kato
Masatoshi Uehara
Shota Yasui
OffRL
35
52
0
26 Feb 2020
More Efficient Off-Policy Evaluation through Regularized Targeted Learning
Aurélien F. Bibaut
Ivana Malenica
N. Vlassis
Mark van der Laan
OOD
OffRL
22
40
0
13 Dec 2019
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
42
54
0
09 Jun 2019
Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models
Michael Oberst
David Sontag
CML
OffRL
38
169
0
14 May 2019
Efficient Counterfactual Learning from Bandit Feedback
Yusuke Narita
Shota Yasui
Kohei Yata
OffRL
52
47
0
10 Sep 2018
More Robust Doubly Robust Off-policy Evaluation
Mehrdad Farajtabar
Yinlam Chow
Mohammad Ghavamzadeh
OffRL
41
266
0
10 Feb 2018
Offline A/B testing for Recommender Systems
Alexandre Gilotte
Clément Calauzènes
Thomas Nedelec
A. Abraham
Simon Dollé
OffRL
53
220
0
22 Jan 2018
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Yu Wang
Alekh Agarwal
Miroslav Dudík
OffRL
47
220
0
04 Dec 2016
Batched bandit problems
Vianney Perchet
Philippe Rigollet
Sylvain Chassang
E. Snowberg
OffRL
79
200
0
02 May 2015
Collaborative Filtering Bandits
Shuai Li
Alexandros Karatzoglou
Claudio Gentile
44
315
0
11 Feb 2015
Doubly Robust Policy Evaluation and Learning
Miroslav Dudík
John Langford
Lihong Li
OffRL
97
694
0
23 Mar 2011
Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms
Lihong Li
Wei Chu
John Langford
Xuanhui Wang
OffRL
111
574
0
31 Mar 2010
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
200
2,935
0
28 Feb 2010
The Offset Tree for Learning with Partial Labels
A. Beygelzimer
John Langford
71
184
0
21 Dec 2008