Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits

4 February 2014

Papers citing "Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits"

50 / 202 papers shown

A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits Yasin Abbasi-Yadkori András Gyorgy N. Lazić 63 22 0 17 Jan 2022
Top $K$ Ranking for Multi-Armed Bandit with Noisy Evaluations Evrard Garcelon Vashist Avadhanula A. Lazaric Matteo Pirotta 73 4 0 13 Dec 2021
Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization Thanh Nguyen-Tang Sunil R. Gupta A. Nguyen Svetha Venkatesh OffRL 97 30 0 27 Nov 2021
Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability Aadirupa Saha A. Krishnamurthy 105 38 0 24 Nov 2021
Towards the D-Optimal Online Experiment Design for Recommender Selection Da Xu Chuanwei Ruan Evren Körpeoglu Sushant Kumar Kannan Achan 98 3 0 23 Oct 2021
Provable RL with Exogenous Distractors via Multistep Inverse Dynamics Yonathan Efroni Dipendra Kumar Misra A. Krishnamurthy Alekh Agarwal John Langford OffRL 76 23 0 17 Oct 2021
Representation Learning for Online and Offline RL in Low-rank MDPs Masatoshi Uehara Xuezhou Zhang Wen Sun OffRL 140 129 0 09 Oct 2021
Distributionally Robust Learning Ruidi Chen I. Paschalidis OOD 93 69 0 20 Aug 2021
Combining Online Learning and Offline Learning for Contextual Bandits with Deficient Support Hung The Tran Sunil R. Gupta Thanh Nguyen-Tang Santu Rana Svetha Venkatesh OffRL 65 5 0 24 Jul 2021
Adapting to Misspecification in Contextual Bandits Dylan J. Foster Claudio Gentile M. Mohri Julian Zimmert 117 87 0 12 Jul 2021
Model Selection for Generic Contextual Bandits Avishek Ghosh Abishek Sankararaman Kannan Ramchandran 76 6 0 07 Jul 2021
On component interactions in two-stage recommender systems Jiri Hron K. Krauth Michael I. Jordan Niki Kilbertus CML LRM 74 31 0 28 Jun 2021
Dealing with Expert Bias in Collective Decision-Making Axel Abels Tom Lenaerts V. Trianni Ann Nowé 56 6 0 25 Jun 2021
Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity Dhruv Malik Aldo Pacchiano Vishwak Srinivasan Yuanzhi Li 57 6 0 15 Jun 2021
Efficient Online Learning for Dynamic k-Clustering Dimitris Fotakis Georgios Piliouras Stratis Skoulakis 34 4 0 08 Jun 2021
Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning Aurélien F. Bibaut Antoine Chambaz Maria Dimakopoulou Nathan Kallus Mark van der Laan OffRL 91 15 0 03 Jun 2021
Information Directed Sampling for Sparse Linear Bandits Botao Hao Tor Lattimore Wei Deng 62 19 0 29 May 2021
Statistical Testing under Distributional Shifts Nikolaj Thams Sorawit Saengkyongam Niklas Pfister J. Peters OOD 124 10 0 22 May 2021
Adaptive ABAC Policy Learning: A Reinforcement Learning Approach Leila Karimi Mai Abdelhakim J. Joshi 13 13 0 18 May 2021
An Efficient Algorithm for Deep Stochastic Contextual Bandits Tan Zhu Guannan Liang Chunjiang Zhu HaiNing Li J. Bi 77 1 0 12 Apr 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation Andrea Zanette Ching-An Cheng Alekh Agarwal 106 53 0 24 Mar 2021
Online Multi-Armed Bandits with Adaptive Inference Maria Dimakopoulou Zhimei Ren Zhengyuan Zhou 87 36 0 25 Feb 2021
Logarithmic Regret in Feature-based Dynamic Pricing Jianyu Xu Yu Wang 73 27 0 20 Feb 2021
Boosting for Online Convex Optimization Elad Hazan Karan Singh OffRL 66 9 0 18 Feb 2021
Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously Chung-Wei Lee Haipeng Luo Chen-Yu Wei Mengxiao Zhang Xiaojin Zhang 98 49 0 11 Feb 2021
Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-box Approach Chen-Yu Wei Haipeng Luo OffRL 187 107 0 10 Feb 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms Chi Jin Qinghua Liu Sobhan Miryoosefi OffRL 123 220 0 01 Feb 2021
Relational Boosted Bandits A. Kakadiya S. Natarajan Balaraman Ravindran 33 5 0 16 Dec 2020
Neural Contextual Bandits with Deep Representation and Shallow Exploration Pan Xu Zheng Wen Handong Zhao Quanquan Gu OffRL 89 78 0 03 Dec 2020
Improving Offline Contextual Bandits with Distributional Robustness Otmane Sakhi Louis Faury Flavian Vasile OffRL 41 7 0 13 Nov 2020
Leveraging Post Hoc Context for Faster Learning in Bandit Settings with Applications in Robot-Assisted Feeding E. Gordon Sumegh Roychowdhury Tapomayukh Bhattacharjee Kevin Jamieson S. Srinivasa 145 20 0 05 Nov 2020
Online Algorithm for Unsupervised Sequential Selection with Contextual Information Arun Verma M. Hanawal Csaba Szepesvári Venkatesh Saligrama 53 6 0 23 Oct 2020
Carousel Personalization in Music Streaming Apps with Contextual Bandits Walid Bendada Guillaume Salha-Galvan Théo Bontempelli 70 57 0 14 Sep 2020
Unifying Clustered and Non-stationary Bandits Chuanhao Li Qingyun Wu Hongning Wang 93 12 0 05 Sep 2020
Fast Distributed Bandits for Online Recommendation Systems K. Mahadik Qingyun Wu Shuai Li Amit Sabne 87 60 0 16 Jul 2020
Quantum exploration algorithms for multi-armed bandits Daochen Wang Xuchen You Tongyang Li Andrew M. Childs 109 29 0 14 Jul 2020
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs Alekh Agarwal Sham Kakade A. Krishnamurthy Wen Sun OffRL 205 227 0 18 Jun 2020
Off-policy Bandits with Deficient Support Noveen Sachdeva Yi-Hsun Su Thorsten Joachims OffRL 200 78 0 16 Jun 2020
Efficient Contextual Bandits with Continuous Actions Maryam Majzoubi Chicheng Zhang Rajan Chari A. Krishnamurthy John Langford Aleksandrs Slivkins OffRL 80 32 0 10 Jun 2020
Meta-Learning Bandit Policies by Gradient Ascent Branislav Kveton Martin Mladenov Chih-Wei Hsu Manzil Zaheer Csaba Szepesvári Craig Boutilier 76 9 0 09 Jun 2020
Greedy Algorithm almost Dominates in Smoothed Contextual Bandits Manish Raghavan Aleksandrs Slivkins Jennifer Wortman Vaughan Zhiwei Steven Wu 388 18 0 19 May 2020
DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret Yichun Hu Nathan Kallus OffRL 17 0 0 06 May 2020
Learning nonlinear dynamical systems from a single trajectory Dylan J. Foster Alexander Rakhlin Tuhin Sarkar 52 74 0 30 Apr 2020
Counterfactual Learning of Stochastic Policies with Continuous Actions: from Models to Offline Evaluation Houssam Zenati A. Bietti Matthieu Martin Eustache Diemert Pierre Gaillard Julien Mairal OffRL CML 59 4 0 22 Apr 2020
Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability D. Simchi-Levi Yunzong Xu OffRL 441 112 0 28 Mar 2020
Delay-Adaptive Learning in Generalized Linear Contextual Bandits Jose H. Blanchet Renyuan Xu Zhengyuan Zhou OffRL 38 6 0 11 Mar 2020
Contextual Blocking Bandits Soumya Basu Orestis Papadigenopoulos Constantine Caramanis Sanjay Shakkottai 83 21 0 06 Mar 2020
Generalized Policy Elimination: an efficient algorithm for Nonparametric Contextual Bandits Aurélien F. Bibaut Antoine Chambaz Mark van der Laan OffRL 116 3 0 05 Mar 2020
Stochastic Linear Contextual Bandits with Diverse Contexts Weiqiang Wu Jing Yang Cong Shen 117 14 0 05 Mar 2020
Taking a hint: How to leverage loss predictors in contextual bandits? Chen-Yu Wei Haipeng Luo Alekh Agarwal 170 27 0 04 Mar 2020