The multi-armed bandit problem with covariates

27 October 2011

Papers citing "The multi-armed bandit problem with covariates"

41 / 41 papers shown

Locally Private Nonparametric Contextual Multi-armed Bandits Yuheng Ma Feiyu Jiang Zifeng Zhao Hanfang Yang Y. Yu 53 0 0 11 Mar 2025
Contextual Bandits for Unbounded Context Distributions Puning Zhao Xiaogang Xu Zhe Liu Huiwen Wu Qin Zhang Zong Ke Tianhang Zheng 74 4 0 19 Aug 2024
Improved Algorithms for Contextual Dynamic Pricing Matilde Tullii Solenne Gaucher Nadav Merlis Vianney Perchet 61 1 0 17 Jun 2024
Batched Nonparametric Contextual Bandits Rong Jiang Cong Ma OffRL 39 1 0 27 Feb 2024
Allocating Divisible Resources on Arms with Unknown and Random Rewards Ningyuan Chen Wenhao Li 24 0 0 28 Jun 2023
Nearest Neighbour with Bandit Feedback Stephen Pasteris Chris Hicks V. Mavroudis 16 3 0 23 Jun 2023
Trading-off price for data quality to achieve fair online allocation M. Molina Nicolas Gast P. Loiseau Vianney Perchet 37 4 0 23 Jun 2023
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage Masatoshi Uehara Nathan Kallus Jason D. Lee Wen Sun OffRL 55 5 0 05 Feb 2023
Smooth Non-Stationary Bandits S. Jia Qian Xie Nathan Kallus P. Frazier 106 9 0 29 Jan 2023
Contextual Bandits and Optimistically Universal Learning Moise Blanchard Steve Hanneke Patrick Jaillet OffRL 28 1 0 31 Dec 2022
A survey on multi-player bandits Etienne Boursier Vianney Perchet 32 13 0 29 Nov 2022
Transfer Learning for Contextual Multi-armed Bandits Changxiao Cai T. Tony Cai Hongzhe Li 47 16 0 22 Nov 2022
Multiple-Play Stochastic Bandits with Shareable Finite-Capacity Arms Xuchuang Wang Hong Xie John C. S. Lui 30 6 0 17 Jun 2022
Truncated LinUCB for Stochastic Linear Bandits Yanglei Song Meng Zhou 52 0 0 23 Feb 2022
Fast Rates for the Regret of Offline Reinforcement Learning Yichun Hu Nathan Kallus Masatoshi Uehara OffRL 24 30 0 31 Jan 2021
Fast Rates for Contextual Linear Optimization Yichun Hu Nathan Kallus Xiaojie Mao OffRL 34 41 0 05 Nov 2020
Real-Time Optimisation for Online Learning in Auctions Lorenzo Croissant Marc Abeille Clément Calauzènes 21 4 0 20 Oct 2020
Statistical Inference for Online Decision-Making: In a Contextual Bandit Setting Haoyu Chen Wenbin Lu R. Song OffRL 29 29 0 14 Oct 2020
An Asymptotically Optimal Multi-Armed Bandit Algorithm and Hyperparameter Optimization Yimin Huang Yujun Li Hanrong Ye Zhenguo Li Zhihua Zhang 32 7 0 11 Jul 2020
Treatment recommendation with distributional targets Anders Bredahl Kock David Preinerstorfer Bezirgen Veliyev OffRL 8 7 0 19 May 2020
Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes Yichun Hu Nathan Kallus Xiaojie Mao 29 34 0 05 Sep 2019
A Dimension-free Algorithm for Contextual Continuum-armed Bandits Wenhao Li Ningyuan Chen L. Jeff Hong 20 2 0 15 Jul 2019
Knowledge Gradient for Selection with Covariates: Consistency and Computation Liang Ding L. Hong Haihui Shen Xiaowei Zhang BDL 16 27 0 12 Jun 2019
Batched Multi-armed Bandits Problem Zijun Gao Yanjun Han Zhimei Ren Zhengqing Zhou 16 138 0 03 Apr 2019
Contextual Bandits with Cross-learning S. Balseiro Negin Golrezaei Mohammad Mahdian Vahab Mirrokni Jon Schneider 21 50 0 25 Sep 2018
An adaptive multiclass nearest neighbor classifier Nikita Puchkin V. Spokoiny 18 7 0 08 Apr 2018
The K-Nearest Neighbour UCB algorithm for multi-armed bandits with covariates Henry W. J. Reeve J. Mellor Gavin Brown 32 27 0 01 Mar 2018
Bandit Learning with Positive Externalities Virag Shah Jose H. Blanchet Ramesh Johari 23 19 0 15 Feb 2018
Nonparametric Stochastic Contextual Bandits M. Guan Heinrich Jiang 18 36 0 05 Jan 2018
Estimation Considerations in Contextual Bandits Maria Dimakopoulou Zhengyuan Zhou Susan Athey Guido Imbens 32 69 0 19 Nov 2017
Ranking and Selection with Covariates for Personalized Decision Making Haihui Shen L. Hong Xiaowei Zhang CML 18 53 0 07 Oct 2017
Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe Quentin Berthet Vianney Perchet 36 31 0 22 Feb 2017
Policy Learning with Observational Data Susan Athey Stefan Wager CML OffRL 32 183 0 09 Feb 2017
When to Reset Your Keys: Optimal Timing of Security Updates via Learning Zizhan Zheng Ness B. Shroff P. Mohapatra AAML 19 7 0 01 Dec 2016
Dynamic Assortment Personalization in High Dimensions Nathan Kallus Madeleine Udell 31 66 0 18 Oct 2016
Dynamic Pricing with Demand Covariates Sheng Qiang Mohsen Bayati 8 116 0 25 Apr 2016
Regret Analysis of the Finite-Horizon Gittins Index Strategy for Multi-Armed Bandits Tor Lattimore 32 46 0 18 Nov 2015
A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit Giuseppe Burtini Jason L. Loeppky Ramon Lawrence 39 119 0 02 Oct 2015
Batched bandit problems Vianney Perchet Philippe Rigollet Sylvain Chassang E. Snowberg OffRL 42 199 0 02 May 2015
Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback N. Alon Nicolò Cesa-Bianchi Claudio Gentile Shie Mannor Yishay Mansour Ohad Shamir OffRL 44 130 0 30 Sep 2014
Bounded regret in stochastic multi-armed bandits Sébastien Bubeck Vianney Perchet Philippe Rigollet 73 91 0 06 Feb 2013