Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits

4 February 2014

Papers citing "Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits"

50 / 202 papers shown

Fair Algorithms with Probing for Multi-Agent Multi-Armed Bandits Tianyi Xu Jiaxin Liu Zizhan Zheng FaML 55 0 0 17 Jun 2025
No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need! Francesco Emanuele Stradi Matteo Castiglioni A. Marchesi N. Gatti Christian Kroer 20 0 0 16 Jun 2025
The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability Jiachen Hu Rui Ai Han Zhong Xiaoyu Chen L. Wang Zhaoran Wang Zhuoran Yang 62 0 0 11 Jun 2025
Greedy Algorithm for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure Aleksandrs Slivkins Yunzong Xu Shiliang Zuo 539 1 0 06 Mar 2025
Minimax Optimal Reinforcement Learning with Quasi-Optimism Harin Lee Min-hwan Oh OffRL 105 1 0 02 Mar 2025
A Complete Characterization of Learnability for Stochastic Noisy Bandits Steve Hanneke Kun Wang 176 1 0 20 Jan 2025
Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits H. Bui Enrique Mallada Anqi Liu 509 1 0 08 Nov 2024
Second Order Bounds for Contextual Bandits with Function Approximation Aldo Pacchiano 278 5 0 24 Sep 2024
Efficient Sequential Decision Making with Large Language Models Dingyang Chen Qi Zhang Yinglun Zhu LRM 98 4 0 17 Jun 2024
Towards Domain Adaptive Neural Contextual Bandits Ziyan Wang Hao Wang Hao Wang 220 0 0 13 Jun 2024
Multiple-policy Evaluation via Density Estimation Yilei Chen Aldo Pacchiano I. Paschalidis OffRL 62 1 0 29 Mar 2024
Dynamic Reward Adjustment in Multi-Reward Reinforcement Learning for Counselor Reflection Generation Do June Min Verónica Pérez-Rosas Kenneth Resnicow Rada Mihalcea OffRL 112 4 0 20 Mar 2024
Experiment Planning with Function Approximation Aldo Pacchiano Jonathan Lee Emma Brunskill OffRL 70 4 0 10 Jan 2024
Bayesian Design Principles for Frequentist Sequential Learning Yunbei Xu A. Zeevi 125 13 0 01 Oct 2023
A Unified Model and Dimension for Interactive Estimation Nataly Brukhim Miroslav Dudík Aldo Pacchiano Robert Schapire 40 1 0 09 Jun 2023
Online Learning for Equilibrium Pricing in Markets under Incomplete Information Devansh Jalota Haoyuan Sun Navid Azizan 55 2 0 21 Mar 2023
Smoothed Analysis of Sequential Probability Assignment Alankrita Bhatt Nika Haghtalab Abhishek Shetty 80 10 0 08 Mar 2023
Model-based Constrained MDP for Budget Allocation in Sequential Incentive Marketing Shuai Xiao Le Guo Zaifan Jiang Lei Lv Yuanbo Chen Jun Zhu Shuang Yang 66 21 0 02 Mar 2023
Reinforcement Learning in Low-Rank MDPs with Density Features Audrey Huang Jinglin Chen Nan Jiang OffRL 84 14 0 04 Feb 2023
Multiplier Bootstrap-based Exploration Runzhe Wan Haoyu Wei Branislav Kveton R. Song 52 3 0 03 Feb 2023
Selective Uncertainty Propagation in Offline RL Sanath Kumar Krishnamurthy Shrey Modi Tanmay Gangwani S. Katariya Branislav Kveton A. Rangi OffRL 220 0 0 01 Feb 2023
Learning to Generate All Feasible Actions Mirco Theile Daniele Bernardini Raphael Trumpp C. Piazza Marco Caccamo Alberto L. Sangiovanni-Vincentelli 60 2 0 26 Jan 2023
GBOSE: Generalized Bandit Orthogonalized Semiparametric Estimation Mubarrat Chowdhury Elkhan Ismayilzada Khalequzzaman Sayem Gi-Soo Kim 59 1 0 20 Jan 2023
On the Complexity of Representation Learning in Contextual Linear Bandits Andrea Tirinzoni Matteo Pirotta A. Lazaric 61 1 0 19 Dec 2022
Contextual Bandits in a Survey Experiment on Charitable Giving: Within-Experiment Outcomes versus Policy Learning Susan Athey Undral Byambadalai Vitor Hadad Sanath Kumar Krishnamurthy Weiwen Leung Joseph Jay Williams 94 14 0 22 Nov 2022
Deploying a Steered Query Optimizer in Production at Microsoft Wangda Zhang Matteo Interlandi Paul Mineiro S. Qiao Nasim Ghazanfari Marc T. Friedman Rafah Hosn Hiren Patel Alekh Jindal 52 24 0 24 Oct 2022
Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees Andrea Tirinzoni Matteo Papini Ahmed Touati A. Lazaric Matteo Pirotta 70 4 0 24 Oct 2022
Optimal Contextual Bandits with Knapsacks under Realizability via Regression Oracles Yuxuan Han Jialin Zeng Yang Wang Yangzhen Xiang Jiheng Zhang 103 9 0 21 Oct 2022
Adaptive Oracle-Efficient Online Learning Guanghui Wang Zihao Hu Vidya Muthukumar Jacob D. Abernethy 64 4 0 17 Oct 2022
The Role of Coverage in Online Reinforcement Learning Tengyang Xie Dylan J. Foster Yu Bai Nan Jiang Sham Kakade OffRL 85 60 0 09 Oct 2022
Making Decisions under Outcome Performativity Michael P. Kim Juan C. Perdomo 91 21 0 04 Oct 2022
A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning Zixiang Chen C. J. Li An Yuan Quanquan Gu Michael I. Jordan OffRL 151 27 0 30 Sep 2022
Advertising Media and Target Audience Optimization via High-dimensional Bandits Wenjia Ba J. Harrison Harikesh S. Nair 59 0 0 17 Sep 2022
Sales Channel Optimization via Simulations Based on Observational Data with Delayed Rewards: A Case Study at LinkedIn Diana M. Negoescu Pasha Khosravi Shadow Zhao Nanyu Chen P. Ahammad H. González 31 0 0 16 Sep 2022
Feature selection with gradient descent on two-layer networks in low-rotation regimes Matus Telgarsky MLT 81 16 0 04 Aug 2022
Contextual Bandits with Smooth Regret: Efficient Learning in Continuous Action Spaces Yinglun Zhu Paul Mineiro 62 18 0 12 Jul 2022
Interaction-Grounded Learning with Action-inclusive Feedback Tengyang Xie Akanksha Saran Dylan J. Foster Lekan Molu Ida Momennejad Nan Jiang Paul Mineiro John Langford 69 10 0 16 Jun 2022
Efficient Heterogeneous Treatment Effect Estimation With Multiple Experiments and Multiple Outcomes Leon Yao Caroline Lo Israel Nir S. Tan Ariel Evnine Adam Lerer A. Peysakhovich CML 57 7 0 10 Jun 2022
Asymptotic Instance-Optimal Algorithms for Interactive Decision Making Kefan Dong Tengyu Ma 132 9 0 06 Jun 2022
Provable General Function Class Representation Learning in Multitask Bandits and MDPs Rui Lu Andrew Zhao S. Du Gao Huang OffRL 104 10 0 31 May 2022
Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent Yu Bai Chi Jin Song Mei Ziang Song Tiancheng Yu OffRL 103 19 0 30 May 2022
Chain of Thought Imitation with Procedure Cloning Mengjiao Yang Dale Schuurmans Pieter Abbeel Ofir Nachum OffRL 111 33 0 22 May 2022
Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits Avishek Ghosh Abishek Sankararaman 52 4 0 19 May 2022
Efficient Active Learning with Abstention Yinglun Zhu Robert D. Nowak 107 15 0 31 Mar 2022
Flexible and Efficient Contextual Bandits with Heterogeneous Treatment Effect Oracles Aldo G. Carranza Sanath Kumar Krishnamurthy Susan Athey 50 1 0 30 Mar 2022
Stochastic linear optimization never overfits with quadratically-bounded losses on general data Matus Telgarsky 90 12 0 14 Feb 2022
Near-Optimal Learning of Extensive-Form Games with Imperfect Information Yu Bai Chi Jin Song Mei Tiancheng Yu 104 26 0 03 Feb 2022
Variance-Optimal Augmentation Logging for Counterfactual Evaluation in Contextual Bandits Aaron David Tucker Thorsten Joachims OffRL 36 9 0 03 Feb 2022
Context Uncertainty in Contextual Bandits with Applications to Recommender Systems Hao Wang Yifei Ma Hao Ding Yuyang Wang 94 6 0 01 Feb 2022
Towards Agnostic Feature-based Dynamic Pricing: Linear Policies vs Linear Valuation with Unknown Noise Jianyu Xu Yu Wang 136 23 0 27 Jan 2022