1901.00210
Cited By
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
1 January 2019
Andrea Zanette
Emma Brunskill
OffRL
Papers citing
"Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds"
50 / 216 papers shown
Title
First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach
Andrew Wagenmaker
Yifang Chen
Max Simchowitz
S. Du
Kevin G. Jamieson
73
37
0
07 Dec 2021
Dueling RL: Reinforcement Learning with Trajectory Preferences
Aldo Pacchiano
Aadirupa Saha
Jonathan Lee
38
82
0
08 Nov 2021
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
Yuanzhi Li
Ruosong Wang
Lin F. Yang
27
20
0
01 Nov 2021
Adaptive Discretization in Online Reinforcement Learning
Sean R. Sinclair
Siddhartha Banerjee
Chao Yu
OffRL
45
15
0
29 Oct 2021
Can Q-Learning be Improved with Advice?
Noah Golowich
Ankur Moitra
OffRL
19
12
0
25 Oct 2021
Provable Hierarchy-Based Meta-Reinforcement Learning
Kurtland Chua
Qi Lei
Jason D. Lee
22
5
0
18 Oct 2021
Towards Instance-Optimal Offline Reinforcement Learning with Pessimism
Ming Yin
Yu Wang
OffRL
29
82
0
17 Oct 2021
Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation
Weitong Zhang
Dongruo Zhou
Quanquan Gu
OffRL
30
27
0
12 Oct 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
49
51
0
09 Oct 2021
Theoretically Principled Deep RL Acceleration via Nearest Neighbor Function Approximation
Junhong Shen
Lin F. Yang
OffRL
19
15
0
09 Oct 2021
Reinforcement Learning in Reward-Mixing MDPs
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
32
15
0
07 Oct 2021
Bad-Policy Density: A Measure of Reinforcement Learning Hardness
David Abel
Cameron Allen
Dilip Arumugam
D Ellis Hershkowitz
Michael L. Littman
Lawson L. S. Wong
26
2
0
07 Oct 2021
A Bayesian Learning Algorithm for Unknown Zero-sum Stochastic Games with an Arbitrary Opponent
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
41
5
0
08 Sep 2021
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
33
12
0
11 Aug 2021
Beyond No Regret: Instance-Dependent PAC Reinforcement Learning
Andrew Wagenmaker
Max Simchowitz
Kevin G. Jamieson
28
34
0
05 Aug 2021
Towards General Function Approximation in Zero-Sum Markov Games
Baihe Huang
Jason D. Lee
Zhaoran Wang
Zhuoran Yang
33
47
0
30 Jul 2021
Provably Efficient Multi-Task Reinforcement Learning with Model Transfer
Chicheng Zhang
Zhi Wang
OffRL
27
19
0
19 Jul 2021
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
Haipeng Luo
Chen-Yu Wei
Chung-Wei Lee
38
44
0
18 Jul 2021
Going Beyond Linear RL: Sample Efficient Neural Function Approximation
Baihe Huang
Kaixuan Huang
Sham Kakade
Jason D. Lee
Qi Lei
Runzhe Wang
Jiaqi Yang
46
8
0
14 Jul 2021
Deep Learning for Embodied Vision Navigation: A Survey
Fengda Zhu
Yi Zhu
Vincent CS Lee
Xiaodan Liang
Xiaojun Chang
EgoV
LM&Ro
44
0
0
07 Jul 2021
Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning
Christoph Dann
T. V. Marinov
M. Mohri
Julian Zimmert
OffRL
11
29
0
02 Jul 2021
Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
OffRL
36
20
0
28 Jun 2021
A Fully Problem-Dependent Regret Lower Bound for Finite-Horizon MDPs
Andrea Tirinzoni
Matteo Pirotta
A. Lazaric
27
16
0
24 Jun 2021
A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning
Yunchang Yang
Tianhao Wu
Han Zhong
Evrard Garcelon
Matteo Pirotta
A. Lazaric
Liwei Wang
S. Du
OffRL
35
9
0
22 Jun 2021
Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation
Jiafan He
Dongruo Zhou
Quanquan Gu
27
12
0
22 Jun 2021
MADE: Exploration via Maximizing Deviation from Explored Regions
Tianjun Zhang
Paria Rashidinejad
Jiantao Jiao
Yuandong Tian
Joseph E. Gonzalez
Stuart J. Russell
OffRL
34
42
0
18 Jun 2021
Reinforcement Learning for Markovian Bandits: Is Posterior Sampling more Scalable than Optimism?
Nicolas Gast
B. Gaujal
K. Khun
28
2
0
16 Jun 2021
Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path
Liyu Chen
Mehdi Jafarnia-Jahromi
R. Jain
Haipeng Luo
24
25
0
15 Jun 2021
Online Sub-Sampling for Reinforcement Learning with General Function Approximation
Dingwen Kong
Ruslan Salakhutdinov
Ruosong Wang
Lin F. Yang
OffRL
38
1
0
14 Jun 2021
The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
Chi Jin
Qinghua Liu
Tiancheng Yu
26
50
0
07 Jun 2021
On the Theory of Reinforcement Learning with Once-per-Episode Feedback
Niladri S. Chatterji
Aldo Pacchiano
Peter L. Bartlett
Michael I. Jordan
OffRL
27
24
0
29 May 2021
Online Selection of Diverse Committees
Virginie Do
Jamal Atif
J. Lang
Nicolas Usunier
40
8
0
19 May 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Jean Tarbouriech
Runlong Zhou
S. Du
Matteo Pirotta
M. Valko
A. Lazaric
65
35
0
22 Apr 2021
Minimax Regret for Stochastic Shortest Path
Alon Cohen
Yonathan Efroni
Yishay Mansour
Aviv A. Rosenberg
33
28
0
24 Mar 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
34
53
0
24 Mar 2021
Reinforcement Learning with Algorithms from Probabilistic Structure Estimation
J. Epperlein
R. Overko
Sergiy Zhuk
Christopher K. King
Djallel Bouneffouf
Andrew Cullen
Robert Shorten
OffRL
24
7
0
15 Mar 2021
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Menard
O. D. Domingues
Xuedong Shang
Michal Valko
79
41
0
01 Mar 2021
Online Learning for Unknown Partially Observable MDPs
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
34
20
0
25 Feb 2021
Near-Optimal Randomized Exploration for Tabular Markov Decision Processes
Zhihan Xiong
Ruoqi Shen
Qiwen Cui
Maryam Fazel
S. Du
29
7
0
19 Feb 2021
Online Apprenticeship Learning
Lior Shani
Tom Zahavy
Shie Mannor
OffRL
29
25
0
13 Feb 2021
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
Yifang Chen
S. Du
Kevin G. Jamieson
24
22
0
13 Feb 2021
Finding the Stochastic Shortest Path with Low Regret: The Adversarial Cost and Unknown Transition Case
Liyu Chen
Haipeng Luo
33
30
0
10 Feb 2021
Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap
Haike Xu
Tengyu Ma
S. Du
19
42
0
09 Feb 2021
Confidence-Budget Matching for Sequential Budgeted Learning
Yonathan Efroni
Nadav Merlis
Aadirupa Saha
Shie Mannor
40
10
0
05 Feb 2021
Near-Optimal Offline Reinforcement Learning via Double Variance Reduction
Ming Yin
Yu Bai
Yu Wang
OffRL
20
65
0
02 Feb 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Chi Jin
Qinghua Liu
Sobhan Miryoosefi
OffRL
38
215
0
01 Feb 2021
Online Markov Decision Processes with Aggregate Bandit Feedback
Alon Cohen
Haim Kaplan
Tomer Koren
Yishay Mansour
OffRL
6
8
0
31 Jan 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
71
38
0
29 Jan 2021
A Provably Efficient Algorithm for Linear Markov Decision Process with Low Switching Cost
Minbo Gao
Tianle Xie
S. Du
Lin F. Yang
36
46
0
02 Jan 2021
Learning Adversarial Markov Decision Processes with Delayed Feedback
Tal Lancewicki
Aviv A. Rosenberg
Yishay Mansour
43
32
0
29 Dec 2020