ResearchTrend.AI
A Practical Guide of Off-Policy Evaluation for Bandit Problems


23 October 2020
Masahiro Kato
Kenshi Abe
Kaito Ariu
Shota Yasui
    OffRL

Papers citing "A Practical Guide of Off-Policy Evaluation for Bandit Problems"

16 / 16 papers shown
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
108
75
0
17 Aug 2020
Confidence Interval for Off-Policy Evaluation from Dependent Samples via Bandit Algorithm: Approach from Standardized Martingales
Masahiro Kato
OffRL
22
2
0
12 Jun 2020
Off-Policy Evaluation and Learning for External Validity under a Covariate Shift
Masahiro Kato
Masatoshi Uehara
Shota Yasui
OffRL
41
53
0
26 Feb 2020
More Efficient Off-Policy Evaluation through Regularized Targeted Learning
Aurélien F. Bibaut
Ivana Malenica
N. Vlassis
Mark van der Laan
OOD
OffRL
27
40
0
13 Dec 2019
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
55
54
0
09 Jun 2019
Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models
Michael Oberst
David Sontag
CML
OffRL
43
169
0
14 May 2019
Efficient Counterfactual Learning from Bandit Feedback
Yusuke Narita
Shota Yasui
Kohei Yata
OffRL
52
47
0
10 Sep 2018
More Robust Doubly Robust Off-policy Evaluation
Mehrdad Farajtabar
Yinlam Chow
Mohammad Ghavamzadeh
OffRL
51
267
0
10 Feb 2018
Offline A/B testing for Recommender Systems
Alexandre Gilotte
Clément Calauzènes
Thomas Nedelec
A. Abraham
Simon Dollé
OffRL
59
220
0
22 Jan 2018
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Yu Wang
Alekh Agarwal
Miroslav Dudík
OffRL
56
220
0
04 Dec 2016
Batched bandit problems
Vianney Perchet
Philippe Rigollet
Sylvain Chassang
E. Snowberg
OffRL
106
200
0
02 May 2015
Collaborative Filtering Bandits
Shuai Li
Alexandros Karatzoglou
Claudio Gentile
58
315
0
11 Feb 2015
Doubly Robust Policy Evaluation and Learning
Miroslav Dudík
John Langford
Lihong Li
OffRL
151
694
0
23 Mar 2011
Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms
Lihong Li
Wei Chu
John Langford
Xuanhui Wang
OffRL
150
574
0
31 Mar 2010
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
267
2,935
0
28 Feb 2010
The Offset Tree for Learning with Partial Labels
A. Beygelzimer
John Langford
99
184
0
21 Dec 2008