Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits

4 February 2014
Alekh Agarwal
Daniel J. Hsu
Satyen Kale
John Langford
Lihong Li
Robert Schapire
    OffRL

Papers citing "Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits"

50 / 202 papers shown
Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles
Dylan J. Foster
Alexander Rakhlin
371
213
0
12 Feb 2020
Online Preselection with Context Information under the Plackett-Luce Model
Adil El Mesaoudi-Paul
Viktor Bengs
Eyke Hüllermeier
51
4
0
11 Feb 2020
Combinatorial Semi-Bandit in the Non-Stationary Environment
Wei Chen
Liwei Wang
Haoyu Zhao
Kai Zheng
86
18
0
10 Feb 2020
Fair Contextual Multi-Armed Bandits: Theory and Experiments
Yifang Chen
Alex Cuellar
Haipeng Luo
Jignesh Modi
Heramb Nemlekar
Stefanos Nikolaidis
FaML
91
61
0
13 Dec 2019
Sublinear Optimal Policy Value Estimation in Contextual Bandits
Weihao Kong
Gregory Valiant
Emma Brunskill
OffRL
62
13
0
12 Dec 2019
Online Pricing with Reserve Price Constraint for Personal Data Markets
Chaoyue Niu
Zhenzhe Zheng
Fan Wu
Shaojie Tang
Guihai Chen
53
34
0
28 Nov 2019
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Dipendra Kumar Misra
Mikael Henaff
A. Krishnamurthy
John Langford
85
151
0
13 Nov 2019
Neural Contextual Bandits with UCB-based Exploration
Dongruo Zhou
Lihong Li
Quanquan Gu
135
15
0
11 Nov 2019
Multi-Armed Bandits with Correlated Arms
Samarth Gupta
Shreyas Chaudhari
Gauri Joshi
Osman Yağan
177
51
0
06 Nov 2019
Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes
Yichun Hu
Nathan Kallus
Xiaojie Mao
97
34
0
05 Sep 2019
$\sqrt{n}$-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank
Kefan Dong
Jian-wei Peng
Yining Wang
Yuanshuo Zhou
OffRL
81
36
0
05 Sep 2019
Adaptive Robot-Assisted Feeding: An Online Learning Framework for Acquiring Previously Unseen Food Items
E. Gordon
Xiang Meng
Matt Barnes
Tapomayukh Bhattacharjee
S. Srinivasa
OffRL, OnRL
155
47
0
19 Aug 2019
Off-policy Learning for Multiple Loggers
Li He
Long Xia
Wei Zeng
Zhi-Ming Ma
Yue Zhao
Dawei Yin
OffRL
57
10
0
23 Jul 2019
Exploiting Relevance for Online Decision-Making in High-Dimensions
E. Turğay
Cem Bulucu
Cem Tekin
64
4
0
01 Jul 2019
Adaptive Sequential Experiments with Unknown Information Arrival Processes
Y. Gur
Ahmadreza Momeni
92
3
0
28 Jun 2019
ASAC: Active Sensing using Actor-Critic models
Jinsung Yoon
James Jordon
M. Schaar
CML
59
16
0
16 Jun 2019
Distributionally Robust Counterfactual Risk Minimization
Louis Faury
Ugo Tanielian
Flavian Vasile
E. Smirnova
Elvis Dohmatob
78
45
0
14 Jun 2019
Empirical Likelihood for Contextual Bandits
Nikos Karampatziakis
John Langford
Paul Mineiro
OffRL
134
9
0
07 Jun 2019
Stochastic Bandits with Context Distributions
Johannes Kirschner
Andreas Krause
70
30
0
06 Jun 2019
Model selection for contextual bandits
Dylan J. Foster
A. Krishnamurthy
Haipeng Luo
OffRL
216
90
0
03 Jun 2019
Multi-Objective Generalized Linear Bandits
Shiyin Lu
G. Wang
Yao Hu
Lijun Zhang
25
22
0
30 May 2019
On the Generalization Gap in Reparameterizable Reinforcement Learning
Huan Wang
Stephan Zheng
Caiming Xiong
R. Socher
117
41
0
29 May 2019
Provably Efficient Imitation Learning from Observation Alone
Wen Sun
Anirudh Vemula
Byron Boots
J. Andrew Bagnell
167
107
0
27 May 2019
OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits
Niladri S. Chatterji
Vidya Muthukumar
Peter L. Bartlett
85
46
0
24 May 2019
Introduction to Multi-Armed Bandits
Aleksandrs Slivkins
687
1,025
0
15 Apr 2019
Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits
Yingkai Li
Yining Wang
Yuanshuo Zhou
193
61
0
30 Mar 2019
Modeling and Optimization of Human-machine Interaction Processes via the Maximum Entropy Principle
Jiaxiao Zheng
G. Veciana
26
1
0
17 Mar 2019
Cost-Effective Incentive Allocation via Structured Counterfactual Inference
Romain Lopez
Chenchen Li
X. Yan
Junwu Xiong
Michael I. Jordan
Yuan Qi
Le Song
OffRL
96
17
0
07 Feb 2019
Equal Opportunity in Online Classification with Partial Feedback
Yahav Bechavod
Katrina Ligett
Aaron Roth
Bo Waggoner
Zhiwei Steven Wu
FaML
78
60
0
06 Feb 2019
Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting
A. Krishnamurthy
John Langford
Aleksandrs Slivkins
Chicheng Zhang
OffRL
177
65
0
05 Feb 2019
A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free
Yifang Chen
Chung-Wei Lee
Haipeng Luo
Chen-Yu Wei
166
134
0
03 Feb 2019
Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective
Anirudh Vemula
Wen Sun
J. Andrew Bagnell
73
40
0
31 Jan 2019
The Assistive Multi-Armed Bandit
Lawrence Chan
Dylan Hadfield-Menell
S. Srinivasa
Anca Dragan
61
36
0
24 Jan 2019
Online Learning with Diverse User Preferences
Chao Gan
Jing Yang
Ruida Zhou
Cong Shen
39
2
0
23 Jan 2019
Online Learning for Measuring Incentive Compatibility in Ad Auctions
Zhe Feng
Okke Schrijvers
Eric Sodomka
38
22
0
21 Jan 2019
Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback
Chicheng Zhang
Alekh Agarwal
Hal Daumé
John Langford
S. Negahban
85
34
0
02 Jan 2019
Top-K Off-Policy Correction for a REINFORCE Recommender System
Minmin Chen
Alex Beutel
Paul Covington
Sagar Jain
Francois Belletti
Ed H. Chi
CML, OffRL
149
485
0
06 Dec 2018
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
Howard Chen
Alane Suhr
Dipendra Kumar Misra
Noah Snavely
Yoav Artzi
117
391
0
29 Nov 2018
Adversarial Bandits with Knapsacks
Nicole Immorlica
Karthik Abinav Sankararaman
Robert Schapire
Aleksandrs Slivkins
209
116
0
28 Nov 2018
Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits
Branislav Kveton
Csaba Szepesvári
Sharan Vaswani
Zheng Wen
Mohammad Ghavamzadeh
Tor Lattimore
180
70
0
13 Nov 2018
Adapting multi-armed bandits policies to contextual bandits scenarios
David Cortes
74
32
0
11 Nov 2018
CAB: Continuous Adaptive Blending Estimator for Policy Evaluation and Learning
Yi-Hsun Su
Lequn Wang
Michele Santacatterina
Mohsen Guizani
CML, OffRL
25
7
0
06 Nov 2018
Adversarial Attacks on Stochastic Bandits
Kwang-Sung Jun
Lihong Li
Yuzhe Ma
Xiaojin Zhu
AAML
373
124
0
29 Oct 2018
Contextual Bandits with Cross-learning
S. Balseiro
Negin Golrezaei
Mohammad Mahdian
Vahab Mirrokni
Jon Schneider
171
51
0
25 Sep 2018
Linear Bandits with Stochastic Delayed Feedback
Claire Vernade
Alexandra Carpentier
Tor Lattimore
Giovanni Zappella
Beyza Ermis
M. Brueckner
80
67
0
05 Jul 2018
Playing against Nature: causal discovery for decision making under uncertainty
Mauricio Gonzalez-Soto
L. Sucar
Hugo Jair Escalante
CML
20
9
0
03 Jul 2018
Contextual bandits with surrogate losses: Margin bounds and efficient algorithms
Dylan J. Foster
A. Krishnamurthy
167
18
0
28 Jun 2018
Causal Bandits with Propagating Inference
Akihiro Yabe
Daisuke Hatano
Hanna Sumita
Shinji Ito
Naonori Kakimura
Takuro Fukunaga
Ken-ichi Kawarabayashi
CML
62
33
0
06 Jun 2018
The Externalities of Exploration and How Data Diversity Helps Exploitation
Manish Raghavan
Aleksandrs Slivkins
Jennifer Wortman Vaughan
Zhiwei Steven Wu
241
53
0
01 Jun 2018
A Study on Overfitting in Deep Reinforcement Learning
Chiyuan Zhang
Oriol Vinyals
Rémi Munos
Samy Bengio
OffRL, OnRL
61
391
0
18 Apr 2018