Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously

25 January 2019

Papers citing "Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously"

Title | Authors | Tag | Counts | Date
Efficient Near-Optimal Algorithm for Online Shortest Paths in Directed Acyclic Graphs with Bandit Feedback Against Adaptive Adversaries | Arnab Maiti, Zhiyuan Fan, Kevin Jamieson, Lillian J. Ratliff, Gabriele Farina | - | 500 1 0 | 01 Apr 2025
A Model Selection Approach for Corruption Robust Reinforcement Learning | Chen-Yu Wei, Christoph Dann, Julian Zimmert | - | 176 45 0 | 31 Dec 2024
Optimism in the Face of Ambiguity Principle for Multi-Armed Bandits | Mengmeng Li, Daniel Kuhn, Bahar Taşkesen | - | 120 0 0 | 30 Sep 2024
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond | Xutong Liu, Siwei Wang, Jinhang Zuo, Han Zhong, Xuchuang Wang, Zhiyong Wang, Shuai Li, Mohammad Hajiesmaili, J. C. Lui, Wei Chen | - | 231 4 0 | 03 Jun 2024
LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits | Masahiro Kato, Shinji Ito | - | 155 0 0 | 05 Mar 2024
Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition | Tiancheng Jin, Haipeng Luo | - | 92 57 0 | 10 Jun 2020
Introduction to Online Convex Optimization | Elad Hazan | OffRL | 193 1,940 0 | 07 Sep 2019
Adaptation to Easy Data in Prediction with Limited Advice | Tobias Sommer Thune, Yevgeny Seldin | - | 40 13 0 | 02 Jul 2018
TopRank: A practical algorithm for online stochastic ranking | Tor Lattimore, Branislav Kveton, Shuai Li, Csaba Szepesvári | LRM | 46 71 0 | 06 Jun 2018
More Adaptive Algorithms for Adversarial Bandits | Chen-Yu Wei, Haipeng Luo | - | 152 185 0 | 10 Jan 2018
Sparsity, variance and curvature in multi-armed bandits | Sébastien Bubeck, Michael B. Cohen, Yuanzhi Li | - | 132 60 0 | 03 Nov 2017
Minimal Exploration in Structured Stochastic Bandits | Richard Combes, Stefan Magureanu, Alexandre Proutiere | - | 437 119 0 | 01 Nov 2017
An Improved Parametrization and Analysis of the EXP3++ Algorithm for Stochastic and Adversarial Bandits | Yevgeny Seldin, Gábor Lugosi | - | 84 93 0 | 20 Feb 2017
The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits | Tor Lattimore, Csaba Szepesvári | - | 156 105 0 | 14 Oct 2016
An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits | P. Auer, Chao-Kai Chiang | - | 84 112 0 | 27 May 2016
Combining Adversarial Guarantees and Stochastic Fast Rates in Online Learning | Wouter M. Koolen, Peter Grünwald, T. Erven | - | 69 38 0 | 20 May 2016
Fighting Bandits with a New Kind of Smoothness | Jacob D. Abernethy, Chansoo Lee, Ambuj Tewari | AAML | 98 79 0 | 14 Dec 2015
First-order regret bounds for combinatorial semi-bandits | Gergely Neu | - | 209 59 0 | 23 Feb 2015
A Second-order Bound with Excess Losses | Pierre Gaillard, Gilles Stoltz, T. Erven | - | 81 154 0 | 10 Feb 2014
Thompson Sampling for Complex Bandit Problems | Aditya Gopalan, Shie Mannor, Yishay Mansour | - | 158 203 0 | 03 Nov 2013
An efficient algorithm for learning with semi-bandit feedback | Gergely Neu, Gábor Bartók | - | 136 80 0 | 13 May 2013
A Generalized Online Mirror Descent with Applications to Classification and Regression | Francesco Orabona, K. Crammer, Nicolò Cesa-Bianchi | - | 202 79 0 | 10 Apr 2013
Bounded regret in stochastic multi-armed bandits | Sébastien Bubeck, Vianney Perchet, Philippe Rigollet | - | 233 92 0 | 06 Feb 2013
Regret in Online Combinatorial Optimization | Jean-Yves Audibert, Sébastien Bubeck, Gábor Lugosi | OffRL | 107 258 0 | 20 Apr 2012
Combinatorial Network Optimization with Unknown Variables: Multi-Armed Bandits with Linear Rewards | Yi Gai, Bhaskar Krishnamachari, Rahul Jain | - | 184 263 0 | 22 Nov 2010