Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits

4 February 2014
Alekh Agarwal
Daniel J. Hsu
Satyen Kale
John Langford
Lihong Li
Robert Schapire
    OffRL
arXiv: 1402.0555

Papers citing "Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits"

50 / 202 papers shown
A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits
Yasin Abbasi-Yadkori
András Gyorgy
N. Lazić
63
22
0
17 Jan 2022
Top $K$ Ranking for Multi-Armed Bandit with Noisy Evaluations
Evrard Garcelon
Vashist Avadhanula
A. Lazaric
Matteo Pirotta
73
4
0
13 Dec 2021
Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization
Thanh Nguyen-Tang
Sunil R. Gupta
A. Nguyen
Svetha Venkatesh
OffRL
97
30
0
27 Nov 2021
Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability
Aadirupa Saha
A. Krishnamurthy
105
38
0
24 Nov 2021
Towards the D-Optimal Online Experiment Design for Recommender Selection
Madina Abdrakhmanova
Saniya Abushakimova
Evren Körpeoglu
H. A. Varol
Kannan Achan
98
3
0
23 Oct 2021
Provable RL with Exogenous Distractors via Multistep Inverse Dynamics
Yonathan Efroni
Dipendra Kumar Misra
A. Krishnamurthy
Alekh Agarwal
John Langford
OffRL
76
23
0
17 Oct 2021
Representation Learning for Online and Offline RL in Low-rank MDPs
Masatoshi Uehara
Xuezhou Zhang
Wen Sun
OffRL
140
129
0
09 Oct 2021
Distributionally Robust Learning
Ruidi Chen
I. Paschalidis
OOD
93
69
0
20 Aug 2021
Combining Online Learning and Offline Learning for Contextual Bandits with Deficient Support
Hung The Tran
Sunil R. Gupta
Thanh Nguyen-Tang
Santu Rana
Svetha Venkatesh
OffRL
65
5
0
24 Jul 2021
Adapting to Misspecification in Contextual Bandits
Dylan J. Foster
Claudio Gentile
M. Mohri
Julian Zimmert
117
87
0
12 Jul 2021
Model Selection for Generic Contextual Bandits
Avishek Ghosh
Abishek Sankararaman
Kannan Ramchandran
76
6
0
07 Jul 2021
On component interactions in two-stage recommender systems
Jiri Hron
K. Krauth
Michael I. Jordan
Niki Kilbertus
CML
LRM
74
31
0
28 Jun 2021
Dealing with Expert Bias in Collective Decision-Making
Axel Abels
Tom Lenaerts
V. Trianni
Ann Nowé
56
6
0
25 Jun 2021
Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity
Dhruv Malik
Aldo Pacchiano
Vishwak Srinivasan
Yuanzhi Li
57
6
0
15 Jun 2021
Efficient Online Learning for Dynamic k-Clustering
Dimitris Fotakis
Georgios Piliouras
Stratis Skoulakis
34
4
0
08 Jun 2021
Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning
Aurélien F. Bibaut
Antoine Chambaz
Maria Dimakopoulou
Nathan Kallus
Mark van der Laan
OffRL
91
15
0
03 Jun 2021
Information Directed Sampling for Sparse Linear Bandits
Botao Hao
Tor Lattimore
Wei Deng
62
19
0
29 May 2021
Statistical Testing under Distributional Shifts
Nikolaj Thams
Sorawit Saengkyongam
Niklas Pfister
J. Peters
OOD
124
10
0
22 May 2021
Adaptive ABAC Policy Learning: A Reinforcement Learning Approach
Leila Karimi
Mai Abdelhakim
J. Joshi
13
13
0
18 May 2021
An Efficient Algorithm for Deep Stochastic Contextual Bandits
Tan Zhu
Guannan Liang
Chunjiang Zhu
HaiNing Li
J. Bi
77
1
0
12 Apr 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
106
53
0
24 Mar 2021
Online Multi-Armed Bandits with Adaptive Inference
Maria Dimakopoulou
Zhimei Ren
Zhengyuan Zhou
87
36
0
25 Feb 2021
Logarithmic Regret in Feature-based Dynamic Pricing
Jianyu Xu
Yu Wang
73
27
0
20 Feb 2021
Boosting for Online Convex Optimization
Elad Hazan
Karan Singh
OffRL
66
9
0
18 Feb 2021
Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
Chung-Wei Lee
Haipeng Luo
Chen-Yu Wei
Mengxiao Zhang
Xiaojin Zhang
98
49
0
11 Feb 2021
Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-box Approach
Chen-Yu Wei
Haipeng Luo
OffRL
187
107
0
10 Feb 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Chi Jin
Qinghua Liu
Sobhan Miryoosefi
OffRL
123
220
0
01 Feb 2021
Relational Boosted Bandits
A. Kakadiya
S. Natarajan
Balaraman Ravindran
33
5
0
16 Dec 2020
Neural Contextual Bandits with Deep Representation and Shallow Exploration
Pan Xu
Zheng Wen
Handong Zhao
Quanquan Gu
OffRL
89
78
0
03 Dec 2020
Improving Offline Contextual Bandits with Distributional Robustness
Otmane Sakhi
Louis Faury
Flavian Vasile
OffRL
41
7
0
13 Nov 2020
Leveraging Post Hoc Context for Faster Learning in Bandit Settings with Applications in Robot-Assisted Feeding
E. Gordon
Sumegh Roychowdhury
Tapomayukh Bhattacharjee
Kevin Jamieson
S. Srinivasa
145
20
0
05 Nov 2020
Online Algorithm for Unsupervised Sequential Selection with Contextual Information
Arun Verma
M. Hanawal
Csaba Szepesvári
Venkatesh Saligrama
53
6
0
23 Oct 2020
Carousel Personalization in Music Streaming Apps with Contextual Bandits
Walid Bendada
Guillaume Salha-Galvan
Théo Bontempelli
70
57
0
14 Sep 2020
Unifying Clustered and Non-stationary Bandits
Chuanhao Li
Qingyun Wu
Hongning Wang
93
12
0
05 Sep 2020
Fast Distributed Bandits for Online Recommendation Systems
K. Mahadik
Qingyun Wu
Shuai Li
Amit Sabne
87
60
0
16 Jul 2020
Quantum exploration algorithms for multi-armed bandits
Daochen Wang
Xuchen You
Tongyang Li
Andrew M. Childs
109
29
0
14 Jul 2020
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
Alekh Agarwal
Sham Kakade
A. Krishnamurthy
Wen Sun
OffRL
203
227
0
18 Jun 2020
Off-policy Bandits with Deficient Support
Noveen Sachdeva
Yi-Hsun Su
Thorsten Joachims
OffRL
198
78
0
16 Jun 2020
Efficient Contextual Bandits with Continuous Actions
Maryam Majzoubi
Chicheng Zhang
Rajan Chari
A. Krishnamurthy
John Langford
Aleksandrs Slivkins
OffRL
80
32
0
10 Jun 2020
Meta-Learning Bandit Policies by Gradient Ascent
Branislav Kveton
Martin Mladenov
Chih-Wei Hsu
Manzil Zaheer
Csaba Szepesvári
Craig Boutilier
76
9
0
09 Jun 2020
Greedy Algorithm almost Dominates in Smoothed Contextual Bandits
Manish Raghavan
Aleksandrs Slivkins
Jennifer Wortman Vaughan
Zhiwei Steven Wu
388
18
0
19 May 2020
DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret
Yichun Hu
Nathan Kallus
OffRL
17
0
0
06 May 2020
Learning nonlinear dynamical systems from a single trajectory
Dylan J. Foster
Alexander Rakhlin
Tuhin Sarkar
52
74
0
30 Apr 2020
Counterfactual Learning of Stochastic Policies with Continuous Actions: from Models to Offline Evaluation
Houssam Zenati
A. Bietti
Matthieu Martin
Eustache Diemert
Pierre Gaillard
Julien Mairal
OffRL
CML
59
4
0
22 Apr 2020
Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability
D. Simchi-Levi
Yunzong Xu
OffRL
441
112
0
28 Mar 2020
Delay-Adaptive Learning in Generalized Linear Contextual Bandits
Jose H. Blanchet
Renyuan Xu
Zhengyuan Zhou
OffRL
38
6
0
11 Mar 2020
Contextual Blocking Bandits
Soumya Basu
Orestis Papadigenopoulos
Constantine Caramanis
Sanjay Shakkottai
83
21
0
06 Mar 2020
Generalized Policy Elimination: an efficient algorithm for Nonparametric Contextual Bandits
Aurélien F. Bibaut
Antoine Chambaz
Mark van der Laan
OffRL
116
3
0
05 Mar 2020
Stochastic Linear Contextual Bandits with Diverse Contexts
Weiqiang Wu
Jing Yang
Cong Shen
117
14
0
05 Mar 2020
Taking a hint: How to leverage loss predictors in contextual bandits?
Chen-Yu Wei
Haipeng Luo
Alekh Agarwal
170
27
0
04 Mar 2020