ResearchTrend.AI
Explore no more: Improved high-probability regret bounds for non-stochastic bandits

10 June 2015
Gergely Neu
arXiv:1506.03271

Papers citing "Explore no more: Improved high-probability regret bounds for non-stochastic bandits"

45 / 45 papers shown
Online Episodic Convex Reinforcement Learning
B. Moreno
Khaled Eldowa
Pierre Gaillard
Margaux Brégère
Nadia Oudjane
OffRL
31
0
0
12 May 2025
A New Benchmark for Online Learning with Budget-Balancing Constraints
M. Braverman
Jingyi Liu
Jieming Mao
Jon Schneider
Eric Xue
60
0
0
19 Mar 2025
Online Planning of Power Flows for Power Systems Against Bushfires Using Spatial Context
Jianyu Xu
Qiuzhuang Sun
Yang Yang
Huadong Mo
Daoyi Dong
83
0
0
24 Feb 2025
Beyond Minimax Rates in Group Distributionally Robust Optimization via a Novel Notion of Sparsity
Quan Nguyen
Nishant A. Mehta
Cristóbal Guzmán
39
1
0
01 Oct 2024
Beyond Primal-Dual Methods in Bandits with Stochastic and Adversarial Constraints
Martino Bernasconi
Matteo Castiglioni
A. Celli
Federico Fusco
31
2
0
25 May 2024
No-Regret Algorithms in non-Truthful Auctions with Budget and ROI Constraints
Gagan Aggarwal
Giannis Fikioris
Mingfei Zhao
40
5
0
15 Apr 2024
Stochastic Online Optimization for Cyber-Physical and Robotic Systems
Hao Ma
Melanie Zeilinger
Michael Muehlebach
62
0
0
08 Apr 2024
Distributed No-Regret Learning for Multi-Stage Systems with End-to-End Bandit Feedback
I-Hong Hou
OffRL
44
0
0
06 Apr 2024
Learning Adversarial MDPs with Stochastic Hard Constraints
Francesco Emanuele Stradi
Matteo Castiglioni
A. Marchesi
Nicola Gatti
39
4
0
06 Mar 2024
CRIMED: Lower and Upper Bounds on Regret for Bandits with Unbounded Stochastic Corruption
Shubhada Agrawal
Timothée Mathieu
D. Basu
Odalric-Ambrym Maillard
30
2
0
28 Sep 2023
A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with Robustness to Excessive Delays
Saeed Masoudian
Julian Zimmert
Yevgeny Seldin
47
3
0
21 Aug 2023
Anytime Model Selection in Linear Bandits
Parnian Kassraie
N. Emmenegger
Andreas Krause
Aldo Pacchiano
54
2
0
24 Jul 2023
Meta-Learning Adversarial Bandit Algorithms
M. Khodak
Ilya Osadchiy
Keegan Harris
Maria-Florina Balcan
Kfir Y. Levy
Ron Meir
Zhiwei Steven Wu
FedML
32
2
0
05 Jul 2023
Bandits with Replenishable Knapsacks: the Best of both Worlds
Martino Bernasconi
Matteo Castiglioni
A. Celli
Federico Fusco
41
4
0
14 Jun 2023
Bandits for Sponsored Search Auctions under Unknown Valuation Model: Case Study in E-Commerce Advertising
Danil Provodin
Jérémie Joudioux
E. Duryev
24
0
0
31 Mar 2023
Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback
Yang Cai
Haipeng Luo
Chen-Yu Wei
Weiqiang Zheng
34
18
0
05 Mar 2023
Estimating Optimal Policy Value in General Linear Contextual Bandits
Jonathan Lee
Weihao Kong
Aldo Pacchiano
Vidya Muthukumar
Emma Brunskill
32
0
0
19 Feb 2023
Contextual Bandits and Optimistically Universal Learning
Moise Blanchard
Steve Hanneke
Patrick Jaillet
OffRL
28
1
0
31 Dec 2022
SLOPT: Bandit Optimization Framework for Mutation-Based Fuzzing
Yuki Koike
H. Katsura
Hiromu Yakura
Yuma Kurogome
31
5
0
07 Nov 2022
On-Demand Sampling: Learning Optimally from Multiple Distributions
Nika Haghtalab
Michael I. Jordan
Eric Zhao
FedML
55
35
0
22 Oct 2022
Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs
Haipeng Luo
Hanghang Tong
Mengxiao Zhang
Yuheng Zhang
18
5
0
04 Oct 2022
Actor-Critic based Improper Reinforcement Learning
Mohammadi Zaki
Avinash Mohan
Aditya Gopalan
Shie Mannor
21
2
0
19 Jul 2022
Best of Both Worlds Model Selection
Aldo Pacchiano
Christoph Dann
Claudio Gentile
39
10
0
29 Jun 2022
The Complexity of Markov Equilibrium in Stochastic Games
C. Daskalakis
Noah Golowich
Kaiqing Zhang
41
56
0
08 Apr 2022
Generalized Bandit Regret Minimizer Framework in Imperfect Information Extensive-Form Game
Lin Meng
Yang Gao
52
1
0
11 Mar 2022
Near-Optimal Learning of Extensive-Form Games with Imperfect Information
Yu Bai
Chi Jin
Song Mei
Tiancheng Yu
26
26
0
03 Feb 2022
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
74
21
0
31 Jan 2022
Uncoupled Bandit Learning towards Rationalizability: Benchmarks, Barriers, and Algorithms
Jibang Wu
Haifeng Xu
Fan Yao
35
1
0
10 Nov 2021
Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure
Hsu Kao
Chen-Yu Wei
V. Subramanian
33
12
0
01 Nov 2021
On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning
Weichao Mao
Lin F. Yang
Kaiqing Zhang
Tamer Başar
46
57
0
12 Oct 2021
Provably Efficient Reinforcement Learning in Decentralized General-Sum Markov Games
Weichao Mao
Tamer Basar
36
66
0
12 Oct 2021
When Can We Learn General-Sum Markov Games with a Large Number of Players Sample-Efficiently?
Ziang Song
Song Mei
Yu Bai
74
67
0
08 Oct 2021
Bandit Algorithms for Precision Medicine
Yangyi Lu
Ziping Xu
Ambuj Tewari
66
11
0
10 Aug 2021
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
Haipeng Luo
Chen-Yu Wei
Chung-Wei Lee
38
44
0
18 Jul 2021
Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall
Tadashi Kozuno
Pierre Ménard
Rémi Munos
Michal Valko
30
18
0
11 Jun 2021
Leveraging Good Representations in Linear Contextual Bandits
Matteo Papini
Andrea Tirinzoni
Marcello Restelli
A. Lazaric
Matteo Pirotta
35
26
0
08 Apr 2021
A Simple Approach for Non-stationary Linear Bandits
Peng Zhao
Lijun Zhang
Yuan Jiang
Zhi-Hua Zhou
36
81
0
09 Mar 2021
Near-Optimal Reinforcement Learning with Self-Play
Yu Bai
Chi Jin
Tiancheng Yu
24
130
0
22 Jun 2020
Model selection for contextual bandits
Dylan J. Foster
A. Krishnamurthy
Haipeng Luo
OffRL
34
90
0
03 Jun 2019
Contextual Multi-armed Bandit Algorithm for Semiparametric Reward Model
Gi-Soo Kim
M. Paik
22
14
0
31 Jan 2019
Taming Non-stationary Bandits: A Bayesian Approach
Vishnu Raj
Sheetal Kalyani
35
76
0
31 Jul 2017
Online Learning with Abstention
Corinna Cortes
Giulia DeSalvo
Claudio Gentile
M. Mohri
Scott Yang
11
47
0
09 Mar 2017
Refined Lower Bounds for Adversarial Bandits
Sébastien Gerchinovitz
Tor Lattimore
AAML
25
58
0
24 May 2016
Delay and Cooperation in Nonstochastic Bandits
Nicolò Cesa-Bianchi
Claudio Gentile
Yishay Mansour
Alberto Minora
17
144
0
15 Feb 2016
Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback
N. Alon
Nicolò Cesa-Bianchi
Claudio Gentile
Shie Mannor
Yishay Mansour
Ohad Shamir
OffRL
38
130
0
30 Sep 2014