Minimax Regret Bounds for Reinforcement Learning


16 March 2017
M. G. Azar
Ian Osband
Rémi Munos
arXiv: 1703.05449

Papers citing "Minimax Regret Bounds for Reinforcement Learning"

50 / 241 papers shown
Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization
Fang-yuan Kong
Xiangcheng Zhang
Baoxiang Wang
Shuai Li
36
12
0
14 Feb 2023
Robust Knowledge Transfer in Tiered Reinforcement Learning
Jiawei Huang
Niao He
OffRL
43
1
0
10 Feb 2023
Extragradient-Type Methods with $\mathcal{O}(1/k)$ Last-Iterate Convergence Rates for Co-Hypomonotone Inclusions
Quoc Tran-Dinh
42
2
0
08 Feb 2023
Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation
Qiwen Cui
Kai Zhang
S. Du
60
23
0
07 Feb 2023
Online Reinforcement Learning with Uncertain Episode Lengths
Debmalya Mandal
Goran Radanović
Jiarui Gan
Adish Singla
R. Majumdar
OffRL
41
5
0
07 Feb 2023
Near-Minimax-Optimal Risk-Sensitive Reinforcement Learning with CVaR
Kaiwen Wang
Nathan Kallus
Wen Sun
114
18
0
07 Feb 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
50
8
0
03 Feb 2023
Sample Complexity of Kernel-Based Q-Learning
Sing-Yuan Yeh
Fu-Chieh Chang
Chang-Wei Yueh
Pei-Yuan Wu
A. Bernacchia
Sattar Vakili
OffRL
49
4
0
01 Feb 2023
Learning in POMDPs is Sample-Efficient with Hindsight Observability
Jonathan Lee
Alekh Agarwal
Christoph Dann
Tong Zhang
39
20
0
31 Jan 2023
An Efficient Solution to s-Rectangular Robust Markov Decision Processes
Navdeep Kumar
Kfir Y. Levy
Kaixin Wang
Shie Mannor
36
2
0
31 Jan 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
49
10
0
31 Jan 2023
Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation
Uri Sherman
Tomer Koren
Yishay Mansour
55
12
0
30 Jan 2023
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Qiyang Li
Yuexiang Zhai
Yi-An Ma
Sergey Levine
42
14
0
24 Dec 2022
Latent Variable Representation for Reinforcement Learning
Tongzheng Ren
Chenjun Xiao
Tianjun Zhang
Na Li
Zhaoran Wang
Sujay Sanghavi
Dale Schuurmans
Bo Dai
OffRL
38
10
0
17 Dec 2022
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Jiafan He
Heyang Zhao
Dongruo Zhou
Quanquan Gu
OffRL
74
55
0
12 Dec 2022
Data-pooling Reinforcement Learning for Personalized Healthcare Intervention
Xinyun Chen
P. Shi
Shanwen Pu
OffRL
35
4
0
16 Nov 2022
On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
42
4
0
30 Oct 2022
Opportunistic Episodic Reinforcement Learning
Xiaoxiao Wang
Nader Bouacida
Xueying Guo
Xin Liu
24
0
0
24 Oct 2022
On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness
Haotian Ye
Xiaoyu Chen
Liwei Wang
S. Du
OffRL
37
6
0
19 Oct 2022
A Unified Algorithm for Stochastic Path Problems
Christoph Dann
Chen-Yu Wei
Julian Zimmert
40
0
0
17 Oct 2022
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
Zihan Zhang
Yuhang Jiang
Yuanshuo Zhou
Xiangyang Ji
OffRL
31
9
0
15 Oct 2022
Regret Bounds for Risk-Sensitive Reinforcement Learning
Osbert Bastani
Y. Ma
E. Shen
Wei Xu
46
18
0
11 Oct 2022
Tractable Optimality in Episodic Latent MABs
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
60
3
0
05 Oct 2022
Reward-Mixing MDPs with a Few Latent Contexts are Learnable
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
38
5
0
05 Oct 2022
Square-root regret bounds for continuous-time episodic Markov decision processes
Xuefeng Gao
X. Zhou
63
6
0
03 Oct 2022
A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning
Zixiang Chen
C. J. Li
An Yuan
Quanquan Gu
Michael I. Jordan
OffRL
116
26
0
30 Sep 2022
Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms
Fan Chen
Yu Bai
Song Mei
53
22
0
29 Sep 2022
Multi-armed Bandit Learning on a Graph
Tianpeng Zhang
Kasper Johansson
Na Li
42
6
0
20 Sep 2022
On the Convergence of Monte Carlo UCB for Random-Length Episodic MDPs
Zixuan Dong
Che Wang
Keith Ross
38
3
0
07 Sep 2022
Socially Fair Reinforcement Learning
Debmalya Mandal
Jiarui Gan
OffRL
30
13
0
26 Aug 2022
Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments
Mengxin Yu
Zhuoran Yang
Jianqing Fan
OffRL
54
8
0
23 Aug 2022
Minimax-Optimal Multi-Agent RL in Markov Games With a Generative Model
Gen Li
Yuejie Chi
Yuting Wei
Yuxin Chen
40
18
0
22 Aug 2022
Spectral Decomposition Representation for Reinforcement Learning
Tongzheng Ren
Tianjun Zhang
Lisa Lee
Joseph E. Gonzalez
Dale Schuurmans
Bo Dai
OffRL
49
27
0
19 Aug 2022
Making Linear MDPs Practical via Contrastive Representation Learning
Tianjun Zhang
Tongzheng Ren
Mengjiao Yang
Joseph E. Gonzalez
Dale Schuurmans
Bo Dai
30
44
0
14 Jul 2022
PAC Reinforcement Learning for Predictive State Representations
Wenhao Zhan
Masatoshi Uehara
Wen Sun
Jason D. Lee
42
38
0
12 Jul 2022
Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity
Alekh Agarwal
Tong Zhang
59
22
0
15 Jun 2022
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm
Qinbo Bai
Amrit Singh Bedi
Vaneet Aggarwal
31
20
0
12 Jun 2022
Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality
Ming Yin
Wenjing Chen
Mengdi Wang
Yu Wang
OffRL
32
4
0
10 Jun 2022
Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information
Yonathan Efroni
Dylan J. Foster
Dipendra Kumar Misra
A. Krishnamurthy
John Langford
OffRL
36
25
0
09 Jun 2022
Provably Efficient Risk-Sensitive Reinforcement Learning: Iterated CVaR and Worst Path
Yihan Du
Siwei Wang
Longbo Huang
OOD
39
13
0
06 Jun 2022
Sample-Efficient Reinforcement Learning of Partially Observable Markov Games
Qinghua Liu
Csaba Szepesvári
Chi Jin
69
20
0
02 Jun 2022
Incrementality Bidding via Reinforcement Learning under Mixed and Delayed Rewards
Ashwinkumar Badanidiyuru
Zhe Feng
Tianxi Li
Haifeng Xu
OffRL
48
3
0
02 Jun 2022
Offline Reinforcement Learning with Differential Privacy
Dan Qiao
Yu Wang
OffRL
44
23
0
02 Jun 2022
Byzantine-Robust Online and Offline Distributed Reinforcement Learning
Yiding Chen
Xuezhou Zhang
Kai Zhang
Mengdi Wang
Xiaojin Zhu
OffRL
43
16
0
01 Jun 2022
Chain of Thought Imitation with Procedure Cloning
Mengjiao Yang
Dale Schuurmans
Pieter Abbeel
Ofir Nachum
OffRL
40
30
0
22 May 2022
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
D. Tiapkin
Denis Belomestny
Eric Moulines
A. Naumov
S. Samsonov
Yunhao Tang
Michal Valko
Pierre Menard
36
17
0
16 May 2022
Provably Efficient Kernelized Q-Learning
Shuang Liu
H. Su
MLT
46
4
0
21 Apr 2022
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
Qi Cai
Zhuoran Yang
Zhaoran Wang
43
14
0
20 Apr 2022
When Is Partially Observable Reinforcement Learning Not Scary?
Qinghua Liu
Alan Chung
Csaba Szepesvári
Chi Jin
27
94
0
19 Apr 2022
The Complexity of Markov Equilibrium in Stochastic Games
C. Daskalakis
Noah Golowich
Kai Zhang
41
56
0
08 Apr 2022