ResearchTrend.AI
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds

1 January 2019
Andrea Zanette
Emma Brunskill
    OffRL

Papers citing "Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds"

50 / 216 papers shown
First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach
Andrew Wagenmaker
Yifang Chen
Max Simchowitz
S. Du
Kevin G. Jamieson
73
37
0
07 Dec 2021
Dueling RL: Reinforcement Learning with Trajectory Preferences
Aldo Pacchiano
Aadirupa Saha
Jonathan Lee
38
82
0
08 Nov 2021
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
Yuanzhi Li
Ruosong Wang
Lin F. Yang
27
20
0
01 Nov 2021
Adaptive Discretization in Online Reinforcement Learning
Sean R. Sinclair
Siddhartha Banerjee
Chao Yu
OffRL
45
15
0
29 Oct 2021
Can Q-Learning be Improved with Advice?
Noah Golowich
Ankur Moitra
OffRL
19
12
0
25 Oct 2021
Provable Hierarchy-Based Meta-Reinforcement Learning
Kurtland Chua
Qi Lei
Jason D. Lee
22
5
0
18 Oct 2021
Towards Instance-Optimal Offline Reinforcement Learning with Pessimism
Ming Yin
Yu Wang
OffRL
29
82
0
17 Oct 2021
Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation
Weitong Zhang
Dongruo Zhou
Quanquan Gu
OffRL
30
27
0
12 Oct 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
49
51
0
09 Oct 2021
Theoretically Principled Deep RL Acceleration via Nearest Neighbor Function Approximation
Junhong Shen
Lin F. Yang
OffRL
19
15
0
09 Oct 2021
Reinforcement Learning in Reward-Mixing MDPs
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
32
15
0
07 Oct 2021
Bad-Policy Density: A Measure of Reinforcement Learning Hardness
David Abel
Cameron Allen
Dilip Arumugam
D Ellis Hershkowitz
Michael L. Littman
Lawson L. S. Wong
26
2
0
07 Oct 2021
A Bayesian Learning Algorithm for Unknown Zero-sum Stochastic Games with an Arbitrary Opponent
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
41
5
0
08 Sep 2021
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
33
12
0
11 Aug 2021
Beyond No Regret: Instance-Dependent PAC Reinforcement Learning
Andrew Wagenmaker
Max Simchowitz
Kevin G. Jamieson
28
34
0
05 Aug 2021
Towards General Function Approximation in Zero-Sum Markov Games
Baihe Huang
Jason D. Lee
Zhaoran Wang
Zhuoran Yang
33
47
0
30 Jul 2021
Provably Efficient Multi-Task Reinforcement Learning with Model Transfer
Chicheng Zhang
Zhi Wang
OffRL
27
19
0
19 Jul 2021
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
Haipeng Luo
Chen-Yu Wei
Chung-Wei Lee
38
44
0
18 Jul 2021
Going Beyond Linear RL: Sample Efficient Neural Function Approximation
Baihe Huang
Kaixuan Huang
Sham Kakade
Jason D. Lee
Qi Lei
Runzhe Wang
Jiaqi Yang
46
8
0
14 Jul 2021
Deep Learning for Embodied Vision Navigation: A Survey
Fengda Zhu
Yi Zhu
Vincent CS Lee
Xiaodan Liang
Xiaojun Chang
EgoV
LM&Ro
44
0
0
07 Jul 2021
Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning
Christoph Dann
T. V. Marinov
M. Mohri
Julian Zimmert
OffRL
11
29
0
02 Jul 2021
Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
OffRL
36
20
0
28 Jun 2021
A Fully Problem-Dependent Regret Lower Bound for Finite-Horizon MDPs
Andrea Tirinzoni
Matteo Pirotta
A. Lazaric
27
16
0
24 Jun 2021
A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning
Yunchang Yang
Tianhao Wu
Han Zhong
Evrard Garcelon
Matteo Pirotta
A. Lazaric
Liwei Wang
S. Du
OffRL
35
9
0
22 Jun 2021
Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation
Jiafan He
Dongruo Zhou
Quanquan Gu
27
12
0
22 Jun 2021
MADE: Exploration via Maximizing Deviation from Explored Regions
Tianjun Zhang
Paria Rashidinejad
Jiantao Jiao
Yuandong Tian
Joseph E. Gonzalez
Stuart J. Russell
OffRL
34
42
0
18 Jun 2021
Reinforcement Learning for Markovian Bandits: Is Posterior Sampling more Scalable than Optimism?
Nicolas Gast
B. Gaujal
K. Khun
28
2
0
16 Jun 2021
Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path
Liyu Chen
Mehdi Jafarnia-Jahromi
R. Jain
Haipeng Luo
24
25
0
15 Jun 2021
Online Sub-Sampling for Reinforcement Learning with General Function Approximation
Dingwen Kong
Ruslan Salakhutdinov
Ruosong Wang
Lin F. Yang
OffRL
38
1
0
14 Jun 2021
The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
Chi Jin
Qinghua Liu
Tiancheng Yu
26
50
0
07 Jun 2021
On the Theory of Reinforcement Learning with Once-per-Episode Feedback
Niladri S. Chatterji
Aldo Pacchiano
Peter L. Bartlett
Michael I. Jordan
OffRL
27
24
0
29 May 2021
Online Selection of Diverse Committees
Virginie Do
Jamal Atif
J. Lang
Nicolas Usunier
40
8
0
19 May 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Jean Tarbouriech
Runlong Zhou
S. Du
Matteo Pirotta
M. Valko
A. Lazaric
65
35
0
22 Apr 2021
Minimax Regret for Stochastic Shortest Path
Alon Cohen
Yonathan Efroni
Yishay Mansour
Aviv A. Rosenberg
33
28
0
24 Mar 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
34
53
0
24 Mar 2021
Reinforcement Learning with Algorithms from Probabilistic Structure Estimation
J. Epperlein
R. Overko
Sergiy Zhuk
Christopher K. King
Djallel Bouneffouf
Andrew Cullen
Robert Shorten
OffRL
24
7
0
15 Mar 2021
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Menard
O. D. Domingues
Xuedong Shang
Michal Valko
79
41
0
01 Mar 2021
Online Learning for Unknown Partially Observable MDPs
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
34
20
0
25 Feb 2021
Near-Optimal Randomized Exploration for Tabular Markov Decision Processes
Zhihan Xiong
Ruoqi Shen
Qiwen Cui
Maryam Fazel
S. Du
29
7
0
19 Feb 2021
Online Apprenticeship Learning
Lior Shani
Tom Zahavy
Shie Mannor
OffRL
29
25
0
13 Feb 2021
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
Yifang Chen
S. Du
Kevin G. Jamieson
24
22
0
13 Feb 2021
Finding the Stochastic Shortest Path with Low Regret: The Adversarial Cost and Unknown Transition Case
Liyu Chen
Haipeng Luo
33
30
0
10 Feb 2021
Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap
Haike Xu
Tengyu Ma
S. Du
19
42
0
09 Feb 2021
Confidence-Budget Matching for Sequential Budgeted Learning
Yonathan Efroni
Nadav Merlis
Aadirupa Saha
Shie Mannor
40
10
0
05 Feb 2021
Near-Optimal Offline Reinforcement Learning via Double Variance Reduction
Ming Yin
Yu Bai
Yu Wang
OffRL
20
65
0
02 Feb 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Chi Jin
Qinghua Liu
Sobhan Miryoosefi
OffRL
38
215
0
01 Feb 2021
Online Markov Decision Processes with Aggregate Bandit Feedback
Alon Cohen
Haim Kaplan
Tomer Koren
Yishay Mansour
OffRL
6
8
0
31 Jan 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
71
38
0
29 Jan 2021
A Provably Efficient Algorithm for Linear Markov Decision Process with Low Switching Cost
Minbo Gao
Tianle Xie
S. Du
Lin F. Yang
36
46
0
02 Jan 2021
Learning Adversarial Markov Decision Processes with Delayed Feedback
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
43
32
0
29 Dec 2020