More Adaptive Algorithms for Adversarial Bandits

10 January 2018

Papers citing "More Adaptive Algorithms for Adversarial Bandits"


Title
Online Episodic Convex Reinforcement Learning B. Moreno Khaled Eldowa Pierre Gaillard Margaux Brégère Nadia Oudjane OffRL 29 0 0 12 May 2025
Efficiently Solving Discounted MDPs with Predictions on Transition Matrices Lixing Lyu Jiashuo Jiang Wang Chi Cheung 42 1 0 24 Feb 2025
uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs Yu Chen Jiatai Huang Yan Dai Longbo Huang 34 0 0 04 Oct 2024
Beyond Minimax Rates in Group Distributionally Robust Optimization via a Novel Notion of Sparsity Quan Nguyen Nishant A. Mehta Cristóbal Guzmán 39 1 0 01 Oct 2024
Optimism in the Face of Ambiguity Principle for Multi-Armed Bandits Mengmeng Li Daniel Kuhn Bahar Taşkesen 44 0 0 30 Sep 2024
Online Optimization for Learning to Communicate over Time-Correlated Channels Zheshun Wu Junfan Li Zenglin Xu Sumei Sun Jie Liu 48 0 0 01 Sep 2024
Learnability in Online Kernel Selection with Memory Constraint via Data-dependent Regret Analysis Junfan Li Shizhong Liao 23 0 0 01 Jul 2024
A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of $Θ(T^{2/3})$ and its Application to Best-of-Both-Worlds Taira Tsuchiya Shinji Ito 26 0 0 30 May 2024
Beyond Primal-Dual Methods in Bandits with Stochastic and Adversarial Constraints Martino Bernasconi Matteo Castiglioni A. Celli Federico Fusco 31 2 0 25 May 2024
Distributed No-Regret Learning for Multi-Stage Systems with End-to-End Bandit Feedback I-Hong Hou OffRL 44 0 0 06 Apr 2024
LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits Masahiro Kato Shinji Ito 36 0 0 05 Mar 2024
Best-of-Both-Worlds Linear Contextual Bandits Masahiro Kato Shinji Ito 53 0 0 27 Dec 2023
Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards Hao Qin Kwang-Sung Jun Chicheng Zhang 41 0 0 28 Apr 2023
Near Optimal Memory-Regret Tradeoff for Online Learning Binghui Peng A. Rubinstein CLL 34 10 0 03 Mar 2023
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond Christoph Dann Chen-Yu Wei Julian Zimmert 24 22 0 20 Feb 2023
Refined Regret for Adversarial MDPs with Linear Function Approximation Yan Dai Haipeng Luo Chen-Yu Wei Julian Zimmert 31 12 0 30 Jan 2023
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning Jiatai Huang Yan Dai Longbo Huang 27 6 0 25 Jan 2023
Near-Optimal $Φ$-Regret Learning in Extensive-Form Games Ioannis Anagnostides Gabriele Farina T. Sandholm 34 7 0 20 Aug 2022
Regret Minimization and Convergence to Equilibria in General-sum Markov Games Liad Erez Tal Lancewicki Uri Sherman Tomer Koren Yishay Mansour 42 25 0 28 Jul 2022
Best of Both Worlds Model Selection Aldo Pacchiano Christoph Dann Claudio Gentile 34 10 0 29 Jun 2022
Adversarially Robust Multi-Armed Bandit Algorithm with Variance-Dependent Regret Bounds Shinji Ito Taira Tsuchiya Junya Honda AAML 23 16 0 14 Jun 2022
Decentralized, Communication- and Coordination-free Learning in Structured Matching Markets C. Maheshwari Eric Mazumdar S. Shankar Sastry 19 11 0 06 Jun 2022
Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs Shinji Ito Taira Tsuchiya Junya Honda 35 24 0 02 Jun 2022
A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs Chloé Rouyer Dirk van der Hoeven Nicolò Cesa-Bianchi Yevgeny Seldin 23 15 0 01 Jun 2022
Policy Optimization for Stochastic Shortest Path Liyu Chen Haipeng Luo Aviv A. Rosenberg 19 12 0 07 Feb 2022
Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits Jiatai Huang Yan Dai Longbo Huang 27 14 0 28 Jan 2022
On Optimal Robustness to Adversarial Corruption in Online Decision Problems Shinji Ito 42 22 0 22 Sep 2021
The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition Tiancheng Jin Longbo Huang Haipeng Luo 27 40 0 08 Jun 2021
Improved Analysis of the Tsallis-INF Algorithm in Stochastically Constrained Adversarial Bandits and Stochastic Bandits with Adversarial Corruptions Saeed Masoudian Yevgeny Seldin 22 14 0 23 Mar 2021
An Algorithm for Stochastic and Adversarial Bandits with Switching Costs Chloé Rouyer Yevgeny Seldin Nicolò Cesa-Bianchi AAML 21 24 0 19 Feb 2021
Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition Liyu Chen Haipeng Luo Chen-Yu Wei 29 32 0 07 Dec 2020
No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium A. Celli A. Marchesi Gabriele Farina N. Gatti 30 45 0 01 Apr 2020
Bandits with adversarial scaling Thodoris Lykouris Vahab Mirrokni R. Leme 11 14 0 04 Mar 2020
A Closer Look at Small-loss Bounds for Bandits with Graph Feedback Chung-Wei Lee Haipeng Luo Mengxiao Zhang 9 23 0 02 Feb 2020
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes Chen-Yu Wei Mehdi Jafarnia-Jahromi Haipeng Luo Hiteshi Sharma R. Jain 107 100 0 15 Oct 2019
Nearly Optimal Algorithms for Piecewise-Stationary Cascading Bandits Lingda Wang Huozhi Zhou Bingcong Li Lav Varshney Zhizhen Zhao 22 6 0 12 Sep 2019
Exploration by Optimisation in Partial Monitoring Tor Lattimore Csaba Szepesvári 33 38 0 12 Jul 2019
Equipping Experts/Bandits with Long-term Memory Kai Zheng Haipeng Luo Ilias Diakonikolas Liwei Wang OffRL 14 15 0 30 May 2019
Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously Julian Zimmert Haipeng Luo Chen-Yu Wei 11 79 0 25 Jan 2019
Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits Julian Zimmert Yevgeny Seldin AAML 24 175 0 19 Jul 2018
Stochastic bandits robust to adversarial corruptions Thodoris Lykouris Vahab Mirrokni R. Leme AAML 8 202 0 25 Mar 2018
Efficient Contextual Bandits in Non-stationary Worlds Haipeng Luo Chen-Yu Wei Alekh Agarwal John Langford 22 129 0 05 Aug 2017
Kernel-based methods for bandit convex optimization Sébastien Bubeck Ronen Eldan Y. Lee 84 164 0 11 Jul 2016