Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits

4 February 2014
Alekh Agarwal
Daniel J. Hsu
Satyen Kale
John Langford
Lihong Li
Robert Schapire
    OffRL

Papers citing "Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits"

50 / 202 papers shown
Fair Algorithms with Probing for Multi-Agent Multi-Armed Bandits
Tianyi Xu
Jiaxin Liu
Zizhan Zheng
FaML
55
0
0
17 Jun 2025
No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need!
Francesco Emanuele Stradi
Matteo Castiglioni
A. Marchesi
N. Gatti
Christian Kroer
20
0
0
16 Jun 2025
The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability
Jiachen Hu
Rui Ai
Han Zhong
Xiaoyu Chen
L. Wang
Zhaoran Wang
Zhuoran Yang
62
0
0
11 Jun 2025
Greedy Algorithm for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure
Aleksandrs Slivkins
Yunzong Xu
Shiliang Zuo
539
1
0
06 Mar 2025
Minimax Optimal Reinforcement Learning with Quasi-Optimism
Harin Lee
Min-hwan Oh
OffRL
105
1
0
02 Mar 2025
A Complete Characterization of Learnability for Stochastic Noisy Bandits
Steve Hanneke
Kun Wang
176
1
0
20 Jan 2025
Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits
H. Bui
Enrique Mallada
Anqi Liu
509
1
0
08 Nov 2024
Second Order Bounds for Contextual Bandits with Function Approximation
Aldo Pacchiano
278
5
0
24 Sep 2024
Efficient Sequential Decision Making with Large Language Models
Dingyang Chen
Qi Zhang
Yinglun Zhu
LRM
98
4
0
17 Jun 2024
Towards Domain Adaptive Neural Contextual Bandits
Ziyan Wang
Hao Wang
Hao Wang
220
0
0
13 Jun 2024
Multiple-policy Evaluation via Density Estimation
Yilei Chen
Aldo Pacchiano
I. Paschalidis
OffRL
62
1
0
29 Mar 2024
Dynamic Reward Adjustment in Multi-Reward Reinforcement Learning for Counselor Reflection Generation
Do June Min
Verónica Pérez-Rosas
Kenneth Resnicow
Rada Mihalcea
OffRL
112
4
0
20 Mar 2024
Experiment Planning with Function Approximation
Aldo Pacchiano
Jonathan Lee
Emma Brunskill
OffRL
70
4
0
10 Jan 2024
Bayesian Design Principles for Frequentist Sequential Learning
Yunbei Xu
A. Zeevi
125
13
0
01 Oct 2023
A Unified Model and Dimension for Interactive Estimation
Nataly Brukhim
Miroslav Dudík
Aldo Pacchiano
Robert Schapire
40
1
0
09 Jun 2023
Online Learning for Equilibrium Pricing in Markets under Incomplete Information
Devansh Jalota
Haoyuan Sun
Navid Azizan
55
2
0
21 Mar 2023
Smoothed Analysis of Sequential Probability Assignment
Alankrita Bhatt
Nika Haghtalab
Abhishek Shetty
80
10
0
08 Mar 2023
Model-based Constrained MDP for Budget Allocation in Sequential Incentive Marketing
Shuai Xiao
Le Guo
Zaifan Jiang
Lei Lv
Yuanbo Chen
Jun Zhu
Shuang Yang
66
21
0
02 Mar 2023
Reinforcement Learning in Low-Rank MDPs with Density Features
Audrey Huang
Jinglin Chen
Nan Jiang
OffRL
84
14
0
04 Feb 2023
Multiplier Bootstrap-based Exploration
Runzhe Wan
Haoyu Wei
Branislav Kveton
R. Song
52
3
0
03 Feb 2023
Selective Uncertainty Propagation in Offline RL
Sanath Kumar Krishnamurthy
Shrey Modi
Tanmay Gangwani
S. Katariya
Branislav Kveton
A. Rangi
OffRL
220
0
0
01 Feb 2023
Learning to Generate All Feasible Actions
Mirco Theile
Daniele Bernardini
Raphael Trumpp
C. Piazza
Marco Caccamo
Alberto L. Sangiovanni-Vincentelli
60
2
0
26 Jan 2023
GBOSE: Generalized Bandit Orthogonalized Semiparametric Estimation
Mubarrat Chowdhury
Elkhan Ismayilzada
Khalequzzaman Sayem
Gi-Soo Kim
59
1
0
20 Jan 2023
On the Complexity of Representation Learning in Contextual Linear Bandits
Andrea Tirinzoni
Matteo Pirotta
A. Lazaric
61
1
0
19 Dec 2022
Contextual Bandits in a Survey Experiment on Charitable Giving: Within-Experiment Outcomes versus Policy Learning
Susan Athey
Undral Byambadalai
Vitor Hadad
Sanath Kumar Krishnamurthy
Weiwen Leung
Joseph Jay Williams
94
14
0
22 Nov 2022
Deploying a Steered Query Optimizer in Production at Microsoft
Wangda Zhang
Matteo Interlandi
Paul Mineiro
S. Qiao
Nasim Ghazanfari
Marc T. Friedman
Rafah Hosn
Hiren Patel
Alekh Jindal
52
24
0
24 Oct 2022
Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees
Andrea Tirinzoni
Matteo Papini
Ahmed Touati
A. Lazaric
Matteo Pirotta
70
4
0
24 Oct 2022
Optimal Contextual Bandits with Knapsacks under Realizability via Regression Oracles
Yuxuan Han
Jialin Zeng
Yang Wang
Yangzhen Xiang
Jiheng Zhang
103
9
0
21 Oct 2022
Adaptive Oracle-Efficient Online Learning
Guanghui Wang
Zihao Hu
Vidya Muthukumar
Jacob D. Abernethy
64
4
0
17 Oct 2022
The Role of Coverage in Online Reinforcement Learning
Tengyang Xie
Dylan J. Foster
Yu Bai
Nan Jiang
Sham Kakade
OffRL
85
60
0
09 Oct 2022
Making Decisions under Outcome Performativity
Michael P. Kim
Juan C. Perdomo
91
21
0
04 Oct 2022
A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning
Zixiang Chen
C. J. Li
An Yuan
Quanquan Gu
Michael I. Jordan
OffRL
151
27
0
30 Sep 2022
Advertising Media and Target Audience Optimization via High-dimensional Bandits
Wenjia Ba
J. Harrison
Harikesh S. Nair
59
0
0
17 Sep 2022
Sales Channel Optimization via Simulations Based on Observational Data with Delayed Rewards: A Case Study at LinkedIn
Diana M. Negoescu
Pasha Khosravi
Shadow Zhao
Nanyu Chen
P. Ahammad
H. González
31
0
0
16 Sep 2022
Feature selection with gradient descent on two-layer networks in low-rotation regimes
Matus Telgarsky
MLT
81
16
0
04 Aug 2022
Contextual Bandits with Smooth Regret: Efficient Learning in Continuous Action Spaces
Yinglun Zhu
Paul Mineiro
62
18
0
12 Jul 2022
Interaction-Grounded Learning with Action-inclusive Feedback
Tengyang Xie
Akanksha Saran
Dylan J. Foster
Lekan Molu
Ida Momennejad
Nan Jiang
Paul Mineiro
John Langford
69
10
0
16 Jun 2022
Efficient Heterogeneous Treatment Effect Estimation With Multiple Experiments and Multiple Outcomes
Leon Yao
Caroline Lo
Israel Nir
S. Tan
Ariel Evnine
Adam Lerer
A. Peysakhovich
CML
57
7
0
10 Jun 2022
Asymptotic Instance-Optimal Algorithms for Interactive Decision Making
Kefan Dong
Tengyu Ma
132
9
0
06 Jun 2022
Provable General Function Class Representation Learning in Multitask Bandits and MDPs
Rui Lu
Andrew Zhao
S. Du
Gao Huang
OffRL
104
10
0
31 May 2022
Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent
Yu Bai
Chi Jin
Song Mei
Ziang Song
Tiancheng Yu
OffRL
103
19
0
30 May 2022
Chain of Thought Imitation with Procedure Cloning
Mengjiao Yang
Dale Schuurmans
Pieter Abbeel
Ofir Nachum
OffRL
111
33
0
22 May 2022
Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits
Avishek Ghosh
Abishek Sankararaman
52
4
0
19 May 2022
Efficient Active Learning with Abstention
Yinglun Zhu
Robert D. Nowak
107
15
0
31 Mar 2022
Flexible and Efficient Contextual Bandits with Heterogeneous Treatment Effect Oracles
Aldo G. Carranza
Sanath Kumar Krishnamurthy
Susan Athey
50
1
0
30 Mar 2022
Stochastic linear optimization never overfits with quadratically-bounded losses on general data
Matus Telgarsky
90
12
0
14 Feb 2022
Near-Optimal Learning of Extensive-Form Games with Imperfect Information
Yunru Bai
Chi Jin
Song Mei
Tiancheng Yu
104
26
0
03 Feb 2022
Variance-Optimal Augmentation Logging for Counterfactual Evaluation in Contextual Bandits
Aaron David Tucker
Thorsten Joachims
OffRL
36
9
0
03 Feb 2022
Context Uncertainty in Contextual Bandits with Applications to Recommender Systems
Hao Wang
Yifei Ma
Hao Ding
Yuyang Wang
94
6
0
01 Feb 2022
Towards Agnostic Feature-based Dynamic Pricing: Linear Policies vs Linear Valuation with Unknown Noise
Jianyu Xu
Yu Wang
136
23
0
27 Jan 2022