1906.03323
Empirical Likelihood for Contextual Bandits
7 June 2019
Nikos Karampatziakis
John Langford
Paul Mineiro
OffRL
Papers citing
"Empirical Likelihood for Contextual Bandits"
14 / 14 papers shown
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
42
54
0
09 Jun 2019
Off-Policy Policy Gradient with State Distribution Correction
Yao Liu
Adith Swaminathan
Alekh Agarwal
Emma Brunskill
OffRL
61
67
0
17 Apr 2019
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
L. Espeholt
Hubert Soyer
Rémi Munos
Karen Simonyan
Volodymyr Mnih
...
Vlad Firoiu
Tim Harley
Iain Dunning
Shane Legg
Koray Kavukcuoglu
127
1,584
0
05 Feb 2018
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Yu Wang
Alekh Agarwal
Miroslav Dudík
OffRL
47
220
0
04 Dec 2016
Statistics of Robust Optimization: A Generalized Empirical Likelihood Approach
John C. Duchi
Peter Glynn
Hongseok Namkoong
82
321
0
11 Oct 2016
Off-policy evaluation for slate recommendation
Adith Swaminathan
A. Krishnamurthy
Alekh Agarwal
Miroslav Dudík
John Langford
Damien Jose
I. Zitouni
CML
OffRL
28
225
0
16 May 2016
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
221
6,722
0
19 Feb 2015
Multinomial and empirical likelihood under convex constraints: directions of recession, Fenchel duality, perturbations
Marian Grendár
Vladimír Špitalský
37
3
0
24 Aug 2014
OpenML: networked science in machine learning
Joaquin Vanschoren
Jan N. van Rijn
B. Bischl
Luís Torgo
FedML
AI4CE
62
1,310
0
29 Jul 2014
Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
Alekh Agarwal
Daniel J. Hsu
Satyen Kale
John Langford
Lihong Li
Robert Schapire
OffRL
111
504
0
04 Feb 2014
Kullback-Leibler upper confidence bounds for optimal sequential allocation
Olivier Cappé
Aurélien Garivier
Odalric-Ambrym Maillard
Rémi Munos
Gilles Stoltz
58
394
0
03 Oct 2012
Counterfactual Reasoning and Learning Systems
Léon Bottou
J. Peters
J. Q. Candela
Denis Xavier Charles
D. M. Chickering
Elon Portugaly
Dipankar Ray
Patrice Y. Simard
Edward Snelson
CML
OffRL
119
781
0
11 Sep 2012
Doubly Robust Policy Evaluation and Learning
Miroslav Dudík
John Langford
Lihong Li
OffRL
95
694
0
23 Mar 2011
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
200
2,935
0
28 Feb 2010