1402.0555
Cited By
v1
v2 (latest)
Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
4 February 2014
Alekh Agarwal
Daniel J. Hsu
Satyen Kale
John Langford
Lihong Li
Robert Schapire
OffRL
Papers citing
"Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits"
50 / 202 papers shown
Title
A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits
Yasin Abbasi-Yadkori
András György
N. Lazić
63
22
0
17 Jan 2022
Top-K Ranking for Multi-Armed Bandit with Noisy Evaluations
Evrard Garcelon
Vashist Avadhanula
A. Lazaric
Matteo Pirotta
73
4
0
13 Dec 2021
Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization
Thanh Nguyen-Tang
Sunil R. Gupta
A. Nguyen
Svetha Venkatesh
OffRL
97
30
0
27 Nov 2021
Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability
Aadirupa Saha
A. Krishnamurthy
105
38
0
24 Nov 2021
Towards the D-Optimal Online Experiment Design for Recommender Selection
Madina Abdrakhmanova
Saniya Abushakimova
Evren Körpeoglu
H. A. Varol
Kannan Achan
98
3
0
23 Oct 2021
Provable RL with Exogenous Distractors via Multistep Inverse Dynamics
Yonathan Efroni
Dipendra Kumar Misra
A. Krishnamurthy
Alekh Agarwal
John Langford
OffRL
76
23
0
17 Oct 2021
Representation Learning for Online and Offline RL in Low-rank MDPs
Masatoshi Uehara
Xuezhou Zhang
Wen Sun
OffRL
140
129
0
09 Oct 2021
Distributionally Robust Learning
Ruidi Chen
I. Paschalidis
OOD
93
69
0
20 Aug 2021
Combining Online Learning and Offline Learning for Contextual Bandits with Deficient Support
Hung The Tran
Sunil R. Gupta
Thanh Nguyen-Tang
Santu Rana
Svetha Venkatesh
OffRL
65
5
0
24 Jul 2021
Adapting to Misspecification in Contextual Bandits
Dylan J. Foster
Claudio Gentile
M. Mohri
Julian Zimmert
117
87
0
12 Jul 2021
Model Selection for Generic Contextual Bandits
Avishek Ghosh
Abishek Sankararaman
Kannan Ramchandran
76
6
0
07 Jul 2021
On component interactions in two-stage recommender systems
Jiri Hron
K. Krauth
Michael I. Jordan
Niki Kilbertus
CML
LRM
74
31
0
28 Jun 2021
Dealing with Expert Bias in Collective Decision-Making
Axel Abels
Tom Lenaerts
V. Trianni
Ann Nowé
56
6
0
25 Jun 2021
Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity
Dhruv Malik
Aldo Pacchiano
Vishwak Srinivasan
Yuanzhi Li
57
6
0
15 Jun 2021
Efficient Online Learning for Dynamic k-Clustering
Dimitris Fotakis
Georgios Piliouras
Stratis Skoulakis
34
4
0
08 Jun 2021
Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning
Aurélien F. Bibaut
Antoine Chambaz
Maria Dimakopoulou
Nathan Kallus
Mark van der Laan
OffRL
91
15
0
03 Jun 2021
Information Directed Sampling for Sparse Linear Bandits
Botao Hao
Tor Lattimore
Wei Deng
62
19
0
29 May 2021
Statistical Testing under Distributional Shifts
Nikolaj Thams
Sorawit Saengkyongam
Niklas Pfister
J. Peters
OOD
124
10
0
22 May 2021
Adaptive ABAC Policy Learning: A Reinforcement Learning Approach
Leila Karimi
Mai Abdelhakim
J. Joshi
13
13
0
18 May 2021
An Efficient Algorithm for Deep Stochastic Contextual Bandits
Tan Zhu
Guannan Liang
Chunjiang Zhu
HaiNing Li
J. Bi
77
1
0
12 Apr 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
106
53
0
24 Mar 2021
Online Multi-Armed Bandits with Adaptive Inference
Maria Dimakopoulou
Zhimei Ren
Zhengyuan Zhou
87
36
0
25 Feb 2021
Logarithmic Regret in Feature-based Dynamic Pricing
Jianyu Xu
Yu Wang
73
27
0
20 Feb 2021
Boosting for Online Convex Optimization
Elad Hazan
Karan Singh
OffRL
66
9
0
18 Feb 2021
Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
Chung-Wei Lee
Haipeng Luo
Chen-Yu Wei
Mengxiao Zhang
Xiaojin Zhang
98
49
0
11 Feb 2021
Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-box Approach
Chen-Yu Wei
Haipeng Luo
OffRL
187
107
0
10 Feb 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Chi Jin
Qinghua Liu
Sobhan Miryoosefi
OffRL
123
220
0
01 Feb 2021
Relational Boosted Bandits
A. Kakadiya
S. Natarajan
Balaraman Ravindran
33
5
0
16 Dec 2020
Neural Contextual Bandits with Deep Representation and Shallow Exploration
Pan Xu
Zheng Wen
Handong Zhao
Quanquan Gu
OffRL
89
78
0
03 Dec 2020
Improving Offline Contextual Bandits with Distributional Robustness
Otmane Sakhi
Louis Faury
Flavian Vasile
OffRL
41
7
0
13 Nov 2020
Leveraging Post Hoc Context for Faster Learning in Bandit Settings with Applications in Robot-Assisted Feeding
E. Gordon
Sumegh Roychowdhury
Tapomayukh Bhattacharjee
Kevin Jamieson
S. Srinivasa
145
20
0
05 Nov 2020
Online Algorithm for Unsupervised Sequential Selection with Contextual Information
Arun Verma
M. Hanawal
Csaba Szepesvári
Venkatesh Saligrama
53
6
0
23 Oct 2020
Carousel Personalization in Music Streaming Apps with Contextual Bandits
Walid Bendada
Guillaume Salha-Galvan
Théo Bontempelli
70
57
0
14 Sep 2020
Unifying Clustered and Non-stationary Bandits
Chuanhao Li
Qingyun Wu
Hongning Wang
93
12
0
05 Sep 2020
Fast Distributed Bandits for Online Recommendation Systems
K. Mahadik
Qingyun Wu
Shuai Li
Amit Sabne
87
60
0
16 Jul 2020
Quantum exploration algorithms for multi-armed bandits
Daochen Wang
Xuchen You
Tongyang Li
Andrew M. Childs
109
29
0
14 Jul 2020
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
Alekh Agarwal
Sham Kakade
A. Krishnamurthy
Wen Sun
OffRL
203
227
0
18 Jun 2020
Off-policy Bandits with Deficient Support
Noveen Sachdeva
Yi-Hsun Su
Thorsten Joachims
OffRL
198
78
0
16 Jun 2020
Efficient Contextual Bandits with Continuous Actions
Maryam Majzoubi
Chicheng Zhang
Rajan Chari
A. Krishnamurthy
John Langford
Aleksandrs Slivkins
OffRL
80
32
0
10 Jun 2020
Meta-Learning Bandit Policies by Gradient Ascent
Branislav Kveton
Martin Mladenov
Chih-Wei Hsu
Manzil Zaheer
Csaba Szepesvári
Craig Boutilier
76
9
0
09 Jun 2020
Greedy Algorithm almost Dominates in Smoothed Contextual Bandits
Manish Raghavan
Aleksandrs Slivkins
Jennifer Wortman Vaughan
Zhiwei Steven Wu
388
18
0
19 May 2020
DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret
Yichun Hu
Nathan Kallus
OffRL
17
0
0
06 May 2020
Learning nonlinear dynamical systems from a single trajectory
Dylan J. Foster
Alexander Rakhlin
Tuhin Sarkar
52
74
0
30 Apr 2020
Counterfactual Learning of Stochastic Policies with Continuous Actions: from Models to Offline Evaluation
Houssam Zenati
A. Bietti
Matthieu Martin
Eustache Diemert
Pierre Gaillard
Julien Mairal
OffRL
CML
59
4
0
22 Apr 2020
Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability
D. Simchi-Levi
Yunzong Xu
OffRL
441
112
0
28 Mar 2020
Delay-Adaptive Learning in Generalized Linear Contextual Bandits
Jose H. Blanchet
Renyuan Xu
Zhengyuan Zhou
OffRL
38
6
0
11 Mar 2020
Contextual Blocking Bandits
Soumya Basu
Orestis Papadigenopoulos
Constantine Caramanis
Sanjay Shakkottai
83
21
0
06 Mar 2020
Generalized Policy Elimination: an efficient algorithm for Nonparametric Contextual Bandits
Aurélien F. Bibaut
Antoine Chambaz
Mark van der Laan
OffRL
116
3
0
05 Mar 2020
Stochastic Linear Contextual Bandits with Diverse Contexts
Weiqiang Wu
Jing Yang
Cong Shen
117
14
0
05 Mar 2020
Taking a hint: How to leverage loss predictors in contextual bandits?
Chen-Yu Wei
Haipeng Luo
Alekh Agarwal
170
27
0
04 Mar 2020