1402.0555
Cited By
Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
4 February 2014
Alekh Agarwal
Daniel J. Hsu
Satyen Kale
John Langford
Lihong Li
Robert Schapire
OffRL
Papers citing
"Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits"
50 / 202 papers shown
Title
Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles
Dylan J. Foster
Alexander Rakhlin
371
213
0
12 Feb 2020
Online Preselection with Context Information under the Plackett-Luce Model
Adil El Mesaoudi-Paul
Viktor Bengs
Eyke Hüllermeier
51
4
0
11 Feb 2020
Combinatorial Semi-Bandit in the Non-Stationary Environment
Wei Chen
Liwei Wang
Haoyu Zhao
Kai Zheng
86
18
0
10 Feb 2020
Fair Contextual Multi-Armed Bandits: Theory and Experiments
Yifang Chen
Alex Cuellar
Haipeng Luo
Jignesh Modi
Heramb Nemlekar
Stefanos Nikolaidis
FaML
91
61
0
13 Dec 2019
Sublinear Optimal Policy Value Estimation in Contextual Bandits
Weihao Kong
Gregory Valiant
Emma Brunskill
OffRL
62
13
0
12 Dec 2019
Online Pricing with Reserve Price Constraint for Personal Data Markets
Chaoyue Niu
Zhenzhe Zheng
Fan Wu
Shaojie Tang
Guihai Chen
53
34
0
28 Nov 2019
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Dipendra Kumar Misra
Mikael Henaff
A. Krishnamurthy
John Langford
85
151
0
13 Nov 2019
Neural Contextual Bandits with UCB-based Exploration
Dongruo Zhou
Lihong Li
Quanquan Gu
135
15
0
11 Nov 2019
Multi-Armed Bandits with Correlated Arms
Samarth Gupta
Shreyas Chaudhari
Gauri Joshi
Osman Yağan
177
51
0
06 Nov 2019
Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes
Yichun Hu
Nathan Kallus
Xiaojie Mao
97
34
0
05 Sep 2019
$\sqrt{n}$-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank
Kefan Dong
Jian-wei Peng
Yining Wang
Yuanshuo Zhou
OffRL
81
36
0
05 Sep 2019
Adaptive Robot-Assisted Feeding: An Online Learning Framework for Acquiring Previously Unseen Food Items
E. Gordon
Xiang Meng
Matt Barnes
Tapomayukh Bhattacharjee
S. Srinivasa
OffRL
OnRL
155
47
0
19 Aug 2019
Off-policy Learning for Multiple Loggers
Li He
Long Xia
Wei Zeng
Zhi-Ming Ma
Yue Zhao
Dawei Yin
OffRL
57
10
0
23 Jul 2019
Exploiting Relevance for Online Decision-Making in High-Dimensions
E. Turğay
Cem Bulucu
Cem Tekin
64
4
0
01 Jul 2019
Adaptive Sequential Experiments with Unknown Information Arrival Processes
Y. Gur
Ahmadreza Momeni
92
3
0
28 Jun 2019
ASAC: Active Sensing using Actor-Critic models
Jinsung Yoon
James Jordon
M. Schaar
CML
59
16
0
16 Jun 2019
Distributionally Robust Counterfactual Risk Minimization
Louis Faury
Ugo Tanielian
Flavian Vasile
E. Smirnova
Elvis Dohmatob
78
45
0
14 Jun 2019
Empirical Likelihood for Contextual Bandits
Nikos Karampatziakis
John Langford
Paul Mineiro
OffRL
134
9
0
07 Jun 2019
Stochastic Bandits with Context Distributions
Johannes Kirschner
Andreas Krause
70
30
0
06 Jun 2019
Model selection for contextual bandits
Dylan J. Foster
A. Krishnamurthy
Haipeng Luo
OffRL
216
90
0
03 Jun 2019
Multi-Objective Generalized Linear Bandits
Shiyin Lu
G. Wang
Yao Hu
Lijun Zhang
25
22
0
30 May 2019
On the Generalization Gap in Reparameterizable Reinforcement Learning
Huan Wang
Stephan Zheng
Caiming Xiong
R. Socher
117
41
0
29 May 2019
Provably Efficient Imitation Learning from Observation Alone
Wen Sun
Anirudh Vemula
Byron Boots
J. Andrew Bagnell
167
107
0
27 May 2019
OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits
Niladri S. Chatterji
Vidya Muthukumar
Peter L. Bartlett
85
46
0
24 May 2019
Introduction to Multi-Armed Bandits
Aleksandrs Slivkins
687
1,025
0
15 Apr 2019
Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits
Yingkai Li
Yining Wang
Yuanshuo Zhou
193
61
0
30 Mar 2019
Modeling and Optimization of Human-machine Interaction Processes via the Maximum Entropy Principle
Jiaxiao Zheng
G. Veciana
26
1
0
17 Mar 2019
Cost-Effective Incentive Allocation via Structured Counterfactual Inference
Romain Lopez
Chenchen Li
X. Yan
Junwu Xiong
Michael I. Jordan
Yuan Qi
Le Song
OffRL
96
17
0
07 Feb 2019
Equal Opportunity in Online Classification with Partial Feedback
Yahav Bechavod
Katrina Ligett
Aaron Roth
Bo Waggoner
Zhiwei Steven Wu
FaML
78
60
0
06 Feb 2019
Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting
A. Krishnamurthy
John Langford
Aleksandrs Slivkins
Chicheng Zhang
OffRL
177
65
0
05 Feb 2019
A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free
Yifang Chen
Chung-Wei Lee
Haipeng Luo
Chen-Yu Wei
166
134
0
03 Feb 2019
Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective
Anirudh Vemula
Wen Sun
J. Andrew Bagnell
73
40
0
31 Jan 2019
The Assistive Multi-Armed Bandit
Lawrence Chan
Dylan Hadfield-Menell
S. Srinivasa
Anca Dragan
61
36
0
24 Jan 2019
Online Learning with Diverse User Preferences
Chao Gan
Jing Yang
Ruida Zhou
Cong Shen
39
2
0
23 Jan 2019
Online Learning for Measuring Incentive Compatibility in Ad Auctions
Zhe Feng
Okke Schrijvers
Eric Sodomka
38
22
0
21 Jan 2019
Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback
Chicheng Zhang
Alekh Agarwal
Hal Daumé
John Langford
S. Negahban
85
34
0
02 Jan 2019
Top-K Off-Policy Correction for a REINFORCE Recommender System
Minmin Chen
Alex Beutel
Paul Covington
Sagar Jain
Francois Belletti
Ed H. Chi
CML
OffRL
149
485
0
06 Dec 2018
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
Howard Chen
Alane Suhr
Dipendra Kumar Misra
Noah Snavely
Yoav Artzi
117
391
0
29 Nov 2018
Adversarial Bandits with Knapsacks
Nicole Immorlica
Karthik Abinav Sankararaman
Robert Schapire
Aleksandrs Slivkins
209
116
0
28 Nov 2018
Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits
Branislav Kveton
Csaba Szepesvári
Sharan Vaswani
Zheng Wen
Mohammad Ghavamzadeh
Tor Lattimore
180
70
0
13 Nov 2018
Adapting multi-armed bandits policies to contextual bandits scenarios
David Cortes
74
32
0
11 Nov 2018
CAB: Continuous Adaptive Blending Estimator for Policy Evaluation and Learning
Yi-Hsun Su
Lequn Wang
Michele Santacatterina
Mohsen Guizani
CML
OffRL
25
7
0
06 Nov 2018
Adversarial Attacks on Stochastic Bandits
Kwang-Sung Jun
Lihong Li
Yuzhe Ma
Xiaojin Zhu
AAML
373
124
0
29 Oct 2018
Contextual Bandits with Cross-learning
S. Balseiro
Negin Golrezaei
Mohammad Mahdian
Vahab Mirrokni
Jon Schneider
171
51
0
25 Sep 2018
Linear Bandits with Stochastic Delayed Feedback
Claire Vernade
Alexandra Carpentier
Tor Lattimore
Giovanni Zappella
Beyza Ermis
M. Brueckner
80
67
0
05 Jul 2018
Playing against Nature: causal discovery for decision making under uncertainty
Mauricio Gonzalez-Soto
L. Sucar
Hugo Jair Escalante
CML
20
9
0
03 Jul 2018
Contextual bandits with surrogate losses: Margin bounds and efficient algorithms
Dylan J. Foster
A. Krishnamurthy
167
18
0
28 Jun 2018
Causal Bandits with Propagating Inference
Akihiro Yabe
Daisuke Hatano
Hanna Sumita
Shinji Ito
Naonori Kakimura
Takuro Fukunaga
Ken-ichi Kawarabayashi
CML
62
33
0
06 Jun 2018
The Externalities of Exploration and How Data Diversity Helps Exploitation
Manish Raghavan
Aleksandrs Slivkins
Jennifer Wortman Vaughan
Zhiwei Steven Wu
241
53
0
01 Jun 2018
A Study on Overfitting in Deep Reinforcement Learning
Chiyuan Zhang
Oriol Vinyals
Rémi Munos
Samy Bengio
OffRL
OnRL
61
391
0
18 Apr 2018