Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits

4 February 2014

Papers citing "Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits"

50 / 202 papers shown

Title
Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles Dylan J. Foster Alexander Rakhlin 371 213 0 12 Feb 2020
Online Preselection with Context Information under the Plackett-Luce Model Adil El Mesaoudi-Paul Viktor Bengs Eyke Hüllermeier 51 4 0 11 Feb 2020
Combinatorial Semi-Bandit in the Non-Stationary Environment Wei Chen Liwei Wang Haoyu Zhao Kai Zheng 86 18 0 10 Feb 2020
Fair Contextual Multi-Armed Bandits: Theory and Experiments Yifang Chen Alex Cuellar Haipeng Luo Jignesh Modi Heramb Nemlekar Stefanos Nikolaidis FaML 91 61 0 13 Dec 2019
Sublinear Optimal Policy Value Estimation in Contextual Bandits Weihao Kong Gregory Valiant Emma Brunskill OffRL 62 13 0 12 Dec 2019
Online Pricing with Reserve Price Constraint for Personal Data Markets Chaoyue Niu Zhenzhe Zheng Fan Wu Shaojie Tang Guihai Chen 53 34 0 28 Nov 2019
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning Dipendra Kumar Misra Mikael Henaff A. Krishnamurthy John Langford 85 151 0 13 Nov 2019
Neural Contextual Bandits with UCB-based Exploration Dongruo Zhou Lihong Li Quanquan Gu 135 15 0 11 Nov 2019
Multi-Armed Bandits with Correlated Arms Samarth Gupta Shreyas Chaudhari Gauri Joshi Osman Yağan 177 51 0 06 Nov 2019
Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes Yichun Hu Nathan Kallus Xiaojie Mao 97 34 0 05 Sep 2019
$\sqrt{n}$-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank Kefan Dong Jian-wei Peng Yining Wang Yuanshuo Zhou OffRL 81 36 0 05 Sep 2019
Adaptive Robot-Assisted Feeding: An Online Learning Framework for Acquiring Previously Unseen Food Items E. Gordon Xiang Meng Matt Barnes Tapomayukh Bhattacharjee S. Srinivasa OffRL OnRL 155 47 0 19 Aug 2019
Off-policy Learning for Multiple Loggers Li He Long Xia Wei Zeng Zhi-Ming Ma Yue Zhao Dawei Yin OffRL 57 10 0 23 Jul 2019
Exploiting Relevance for Online Decision-Making in High-Dimensions E. Turğay Cem Bulucu Cem Tekin 64 4 0 01 Jul 2019
Adaptive Sequential Experiments with Unknown Information Arrival Processes Y. Gur Ahmadreza Momeni 92 3 0 28 Jun 2019
ASAC: Active Sensing using Actor-Critic models Jinsung Yoon James Jordon M. Schaar CML 59 16 0 16 Jun 2019
Distributionally Robust Counterfactual Risk Minimization Louis Faury Ugo Tanielian Flavian Vasile E. Smirnova Elvis Dohmatob 78 45 0 14 Jun 2019
Empirical Likelihood for Contextual Bandits Nikos Karampatziakis John Langford Paul Mineiro OffRL 134 9 0 07 Jun 2019
Stochastic Bandits with Context Distributions Johannes Kirschner Andreas Krause 70 30 0 06 Jun 2019
Model selection for contextual bandits Dylan J. Foster A. Krishnamurthy Haipeng Luo OffRL 216 90 0 03 Jun 2019
Multi-Objective Generalized Linear Bandits Shiyin Lu G. Wang Yao Hu Lijun Zhang 25 22 0 30 May 2019
On the Generalization Gap in Reparameterizable Reinforcement Learning Huan Wang Stephan Zheng Caiming Xiong R. Socher 117 41 0 29 May 2019
Provably Efficient Imitation Learning from Observation Alone Wen Sun Anirudh Vemula Byron Boots J. Andrew Bagnell 167 107 0 27 May 2019
OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits Niladri S. Chatterji Vidya Muthukumar Peter L. Bartlett 85 46 0 24 May 2019
Introduction to Multi-Armed Bandits Aleksandrs Slivkins 687 1,025 0 15 Apr 2019
Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits Yingkai Li Yining Wang Yuanshuo Zhou 193 61 0 30 Mar 2019
Modeling and Optimization of Human-machine Interaction Processes via the Maximum Entropy Principle Jiaxiao Zheng G. Veciana 26 1 0 17 Mar 2019
Cost-Effective Incentive Allocation via Structured Counterfactual Inference Romain Lopez Chenchen Li X. Yan Junwu Xiong Michael I. Jordan Yuan Qi Le Song OffRL 96 17 0 07 Feb 2019
Equal Opportunity in Online Classification with Partial Feedback Yahav Bechavod Katrina Ligett Aaron Roth Bo Waggoner Zhiwei Steven Wu FaML 78 60 0 06 Feb 2019
Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting A. Krishnamurthy John Langford Aleksandrs Slivkins Chicheng Zhang OffRL 177 65 0 05 Feb 2019
A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free Yifang Chen Chung-Wei Lee Haipeng Luo Chen-Yu Wei 166 134 0 03 Feb 2019
Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective Anirudh Vemula Wen Sun J. Andrew Bagnell 73 40 0 31 Jan 2019
The Assistive Multi-Armed Bandit Lawrence Chan Dylan Hadfield-Menell S. Srinivasa Anca Dragan 61 36 0 24 Jan 2019
Online Learning with Diverse User Preferences Chao Gan Jing Yang Ruida Zhou Cong Shen 42 2 0 23 Jan 2019
Online Learning for Measuring Incentive Compatibility in Ad Auctions Zhe Feng Okke Schrijvers Eric Sodomka 38 22 0 21 Jan 2019
Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback Chicheng Zhang Alekh Agarwal Hal Daumé John Langford S. Negahban 85 34 0 02 Jan 2019
Top-K Off-Policy Correction for a REINFORCE Recommender System Minmin Chen Alex Beutel Paul Covington Sagar Jain Francois Belletti Ed H. Chi CML OffRL 149 485 0 06 Dec 2018
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments Howard Chen Alane Suhr Dipendra Kumar Misra Noah Snavely Yoav Artzi 117 391 0 29 Nov 2018
Adversarial Bandits with Knapsacks Nicole Immorlica Karthik Abinav Sankararaman Robert Schapire Aleksandrs Slivkins 209 116 0 28 Nov 2018
Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits Branislav Kveton Csaba Szepesvári Sharan Vaswani Zheng Wen Mohammad Ghavamzadeh Tor Lattimore 180 70 0 13 Nov 2018
Adapting multi-armed bandits policies to contextual bandits scenarios David Cortes 74 32 0 11 Nov 2018
CAB: Continuous Adaptive Blending Estimator for Policy Evaluation and Learning Yi-Hsun Su Lequn Wang Michele Santacatterina Mohsen Guizani CML OffRL 25 7 0 06 Nov 2018
Adversarial Attacks on Stochastic Bandits Kwang-Sung Jun Lihong Li Yuzhe Ma Xiaojin Zhu AAML 373 124 0 29 Oct 2018
Contextual Bandits with Cross-learning S. Balseiro Negin Golrezaei Mohammad Mahdian Vahab Mirrokni Jon Schneider 171 51 0 25 Sep 2018
Linear Bandits with Stochastic Delayed Feedback Claire Vernade Alexandra Carpentier Tor Lattimore Giovanni Zappella Beyza Ermis M. Brueckner 80 67 0 05 Jul 2018
Playing against Nature: causal discovery for decision making under uncertainty Mauricio Gonzalez-Soto L. Sucar Hugo Jair Escalante CML 20 9 0 03 Jul 2018
Contextual bandits with surrogate losses: Margin bounds and efficient algorithms Dylan J. Foster A. Krishnamurthy 167 18 0 28 Jun 2018
Causal Bandits with Propagating Inference Akihiro Yabe Daisuke Hatano Hanna Sumita Shinji Ito Naonori Kakimura Takuro Fukunaga Ken-ichi Kawarabayashi CML 62 33 0 06 Jun 2018
The Externalities of Exploration and How Data Diversity Helps Exploitation Manish Raghavan Aleksandrs Slivkins Jennifer Wortman Vaughan Zhiwei Steven Wu 241 53 0 01 Jun 2018
A Study on Overfitting in Deep Reinforcement Learning Chiyuan Zhang Oriol Vinyals Rémi Munos Samy Bengio OffRL OnRL 61 391 0 18 Apr 2018