Adversarial Combinatorial Bandits with General Non-linear Reward Functions

5 January 2021

Papers citing "Adversarial Combinatorial Bandits with General Non-linear Reward Functions"

| Title | Authors | | | | Date |
|---|---|---|---|---|---|
| Tight Lower Bounds for Combinatorial Multi-Armed Bandits | Nadav Merlis, Shie Mannor | 31 | 17 | 0 | 13 Feb 2020 |
| Robust Dynamic Assortment Optimization in the Presence of Outlier Customers | Xi Chen, A. Krishnamurthy, Yining Wang | 43 | 17 | 0 | 09 Oct 2019 |
| Top-k Combinatorial Bandits with Full-Bandit Feedback | Idan Rejwan, Yishay Mansour | 41 | 51 | 0 | 28 May 2019 |
| Batch-Size Independent Regret Bounds for the Combinatorial Multi-Armed Bandit Problem | Nadav Merlis, Shie Mannor | 46 | 27 | 0 | 08 May 2019 |
| Polynomial-time Algorithms for Multiple-arm Identification with Full-bandit Feedback | Yuko Kuroki, Liyuan Xu, Atsushi Miyauchi, Junya Honda, Masashi Sugiyama | 55 | 18 | 0 | 27 Feb 2019 |
| Dynamic Assortment Selection under the Nested Logit Models | Xi Chen, Chao Shi, Yining Wang, Yuanshuo Zhou | 57 | 13 | 0 | 27 Jun 2018 |
| An Optimal Policy for Dynamic Assortment Planning Under Uncapacitated Multinomial Logit Models | Xi Chen, Yining Wang, Yuanshuo Zhou | 116 | 4 | 0 | 12 May 2018 |
| Thompson Sampling for Combinatorial Semi-Bandits | Siwei Wang, Wei Chen | 40 | 127 | 0 | 13 Mar 2018 |
| MNL-Bandit: A Dynamic Learning Approach to Assortment Selection | Shipra Agrawal, Vashist Avadhanula, Vineet Goyal, A. Zeevi | 111 | 157 | 0 | 13 Jun 2017 |
| Thompson Sampling for the MNL-Bandit | Shipra Agrawal, Vashist Avadhanula, Vineet Goyal, A. Zeevi | 116 | 98 | 0 | 03 Jun 2017 |
| Combinatorial Multi-Armed Bandit with General Reward Functions | Wei Chen, Wei Hu, Fu Li, Jiacheng Li, Yu Liu, Pinyan Lu | 53 | 133 | 0 | 20 Oct 2016 |
| Learning Mixtures of Gaussians in High Dimensions | Rong Ge, Qingqing Huang, Sham Kakade | 105 | 127 | 0 | 02 Mar 2015 |
| Combinatorial Multi-Armed Bandit and Its Extension to Probabilistically Triggered Arms | Wei Chen, Yajun Wang, Yang Yuan, Qinshi Wang | 80 | 283 | 0 | 31 Jul 2014 |
| A Spectral Algorithm for Latent Dirichlet Allocation | Anima Anandkumar, Dean Phillips Foster, Daniel J. Hsu, Sham Kakade, Yi-Kai Liu | 170 | 302 | 0 | 30 Apr 2012 |
| Towards minimax policies for online linear optimization with bandit feedback | Sébastien Bubeck, Nicolò Cesa-Bianchi, Sham Kakade (tag: OffRL) | 194 | 150 | 0 | 14 Feb 2012 |