ResearchTrend.AI

Sublinear Optimal Policy Value Estimation in Contextual Bandits
12 December 2019 · Weihao Kong, Gregory Valiant, Emma Brunskill · OffRL

Papers citing "Sublinear Optimal Policy Value Estimation in Contextual Bandits"

11 / 11 papers shown

| Title | Authors | Tags | Metrics | Date |
|---|---|---|---|---|
| Value Driven Representation for Human-in-the-Loop Reinforcement Learning | Ramtin Keramati, Emma Brunskill | OffRL | 20 · 3 · 0 | 02 Apr 2020 |
| Off-Policy Policy Gradient with State Distribution Correction | Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill | OffRL | 157 · 67 · 0 | 17 Apr 2019 |
| Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift | Carles Gelada, Marc G. Bellemare | OffRL | 57 · 99 · 0 | 27 Jan 2019 |
| Estimating Learnability in the Sublinear Data Regime | Weihao Kong, Gregory Valiant | | 69 · 30 · 0 | 04 May 2018 |
| Fully adaptive algorithm for pure exploration in linear bandits | Liyuan Xu, Junya Honda, Masashi Sugiyama | | 53 · 85 · 0 | 16 Oct 2017 |
| Policy Learning with Observational Data | Susan Athey, Stefan Wager | CML, OffRL | 447 · 183 · 0 | 09 Feb 2017 |
| Latent Contextual Bandits and their Application to Personalized Recommendations for New Users | Li Zhou, Emma Brunskill | | 41 · 62 · 0 | 22 Apr 2016 |
| Best-Arm Identification in Linear Bandits | Marta Soare, A. Lazaric, Rémi Munos | | 65 · 178 · 0 | 22 Sep 2014 |
| Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits | Alekh Agarwal, Daniel J. Hsu, Satyen Kale, John Langford, Lihong Li, Robert Schapire | OffRL | 391 · 508 · 0 | 04 Feb 2014 |
| lil' UCB : An Optimal Exploration Algorithm for Multi-Armed Bandits | Kevin Jamieson, Matthew Malloy, Robert D. Nowak, Sébastien Bubeck | | 84 · 415 · 0 | 27 Dec 2013 |
| A Contextual-Bandit Approach to Personalized News Article Recommendation | Lihong Li, Wei Chu, John Langford, Robert Schapire | | 459 · 2,949 · 0 | 28 Feb 2010 |