ResearchTrend.AI
More Adaptive Algorithms for Adversarial Bandits

10 January 2018
Chen-Yu Wei
Haipeng Luo

Papers citing "More Adaptive Algorithms for Adversarial Bandits"

43 / 43 papers shown
Online Episodic Convex Reinforcement Learning
B. Moreno
Khaled Eldowa
Pierre Gaillard
Margaux Brégère
Nadia Oudjane
OffRL
29
0
0
12 May 2025
Efficiently Solving Discounted MDPs with Predictions on Transition Matrices
Lixing Lyu
Jiashuo Jiang
Wang Chi Cheung
42
1
0
24 Feb 2025
uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs
Yu Chen
Jiatai Huang
Yan Dai
Longbo Huang
34
0
0
04 Oct 2024
Beyond Minimax Rates in Group Distributionally Robust Optimization via a Novel Notion of Sparsity
Quan Nguyen
Nishant A. Mehta
Cristóbal Guzmán
39
1
0
01 Oct 2024
Optimism in the Face of Ambiguity Principle for Multi-Armed Bandits
Mengmeng Li
Daniel Kuhn
Bahar Taşkesen
44
0
0
30 Sep 2024
Online Optimization for Learning to Communicate over Time-Correlated Channels
Zheshun Wu
Junfan Li
Zenglin Xu
Sumei Sun
Jie Liu
48
0
0
01 Sep 2024
Learnability in Online Kernel Selection with Memory Constraint via Data-dependent Regret Analysis
Junfan Li
Shizhong Liao
23
0
0
01 Jul 2024
A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of $\Theta(T^{2/3})$ and its Application to Best-of-Both-Worlds
Taira Tsuchiya
Shinji Ito
26
0
0
30 May 2024
Beyond Primal-Dual Methods in Bandits with Stochastic and Adversarial Constraints
Martino Bernasconi
Matteo Castiglioni
A. Celli
Federico Fusco
31
2
0
25 May 2024
Distributed No-Regret Learning for Multi-Stage Systems with End-to-End Bandit Feedback
I-Hong Hou
OffRL
44
0
0
06 Apr 2024
LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits
Masahiro Kato
Shinji Ito
36
0
0
05 Mar 2024
Best-of-Both-Worlds Linear Contextual Bandits
Masahiro Kato
Shinji Ito
53
0
0
27 Dec 2023
Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards
Hao Qin
Kwang-Sung Jun
Chicheng Zhang
41
0
0
28 Apr 2023
Near Optimal Memory-Regret Tradeoff for Online Learning
Binghui Peng
A. Rubinstein
CLL
34
10
0
03 Mar 2023
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond
Christoph Dann
Chen-Yu Wei
Julian Zimmert
24
22
0
20 Feb 2023
Refined Regret for Adversarial MDPs with Linear Function Approximation
Yan Dai
Haipeng Luo
Chen-Yu Wei
Julian Zimmert
31
12
0
30 Jan 2023
Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning
Jiatai Huang
Yan Dai
Longbo Huang
27
6
0
25 Jan 2023
Near-Optimal $\Phi$-Regret Learning in Extensive-Form Games
Ioannis Anagnostides
Gabriele Farina
T. Sandholm
34
7
0
20 Aug 2022
Regret Minimization and Convergence to Equilibria in General-sum Markov Games
Liad Erez
Tal Lancewicki
Uri Sherman
Tomer Koren
Yishay Mansour
42
25
0
28 Jul 2022
Best of Both Worlds Model Selection
Aldo Pacchiano
Christoph Dann
Claudio Gentile
34
10
0
29 Jun 2022
Adversarially Robust Multi-Armed Bandit Algorithm with Variance-Dependent Regret Bounds
Shinji Ito
Taira Tsuchiya
Junya Honda
AAML
23
16
0
14 Jun 2022
Decentralized, Communication- and Coordination-free Learning in Structured Matching Markets
C. Maheshwari
Eric Mazumdar
S. Shankar Sastry
19
11
0
06 Jun 2022
Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs
Shinji Ito
Taira Tsuchiya
Junya Honda
35
24
0
02 Jun 2022
A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs
Chloé Rouyer
Dirk van der Hoeven
Nicolò Cesa-Bianchi
Yevgeny Seldin
23
15
0
01 Jun 2022
Policy Optimization for Stochastic Shortest Path
Liyu Chen
Haipeng Luo
Aviv A. Rosenberg
19
12
0
07 Feb 2022
Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits
Jiatai Huang
Yan Dai
Longbo Huang
27
14
0
28 Jan 2022
On Optimal Robustness to Adversarial Corruption in Online Decision Problems
Shinji Ito
42
22
0
22 Sep 2021
The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
Tiancheng Jin
Longbo Huang
Haipeng Luo
27
40
0
08 Jun 2021
Improved Analysis of the Tsallis-INF Algorithm in Stochastically Constrained Adversarial Bandits and Stochastic Bandits with Adversarial Corruptions
Saeed Masoudian
Yevgeny Seldin
22
14
0
23 Mar 2021
An Algorithm for Stochastic and Adversarial Bandits with Switching Costs
Chloé Rouyer
Yevgeny Seldin
Nicolò Cesa-Bianchi
AAML
21
24
0
19 Feb 2021
Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition
Liyu Chen
Haipeng Luo
Chen-Yu Wei
29
32
0
07 Dec 2020
No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium
A. Celli
A. Marchesi
Gabriele Farina
N. Gatti
30
45
0
01 Apr 2020
Bandits with adversarial scaling
Thodoris Lykouris
Vahab Mirrokni
R. Leme
11
14
0
04 Mar 2020
A Closer Look at Small-loss Bounds for Bandits with Graph Feedback
Chung-Wei Lee
Haipeng Luo
Mengxiao Zhang
9
23
0
02 Feb 2020
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Hiteshi Sharma
R. Jain
107
100
0
15 Oct 2019
Nearly Optimal Algorithms for Piecewise-Stationary Cascading Bandits
Lingda Wang
Huozhi Zhou
Bingcong Li
Lav Varshney
Zhizhen Zhao
22
6
0
12 Sep 2019
Exploration by Optimisation in Partial Monitoring
Tor Lattimore
Csaba Szepesvári
33
38
0
12 Jul 2019
Equipping Experts/Bandits with Long-term Memory
Kai Zheng
Haipeng Luo
Ilias Diakonikolas
Liwei Wang
OffRL
14
15
0
30 May 2019
Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously
Julian Zimmert
Haipeng Luo
Chen-Yu Wei
11
79
0
25 Jan 2019
Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits
Julian Zimmert
Yevgeny Seldin
AAML
24
175
0
19 Jul 2018
Stochastic bandits robust to adversarial corruptions
Thodoris Lykouris
Vahab Mirrokni
R. Leme
AAML
8
202
0
25 Mar 2018
Efficient Contextual Bandits in Non-stationary Worlds
Haipeng Luo
Chen-Yu Wei
Alekh Agarwal
John Langford
22
129
0
05 Aug 2017
Kernel-based methods for bandit convex optimization
Sébastien Bubeck
Ronen Eldan
Y. Lee
84
164
0
11 Jul 2016