arXiv:1905.11527
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies
27 May 2019
Yonathan Efroni
Nadav Merlis
Mohammad Ghavamzadeh
Shie Mannor
OffRL
Papers citing
"Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies"
22 / 22 papers shown
Title
Ensuring Safety in an Uncertain Environment: Constrained MDPs via Stochastic Thresholds
Qian Zuo
Fengxiang He
97
0
0
07 Apr 2025
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
167
23
0
25 Jul 2023
Worst-Case Regret Bounds for Exploration via Randomized Value Functions
Daniel Russo
OffRL
43
85
0
07 Jun 2019
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Andrea Zanette
Emma Brunskill
OffRL
95
276
0
01 Jan 2019
Learning Latent Dynamics for Planning from Pixels
Danijar Hafner
Timothy Lillicrap
Ian S. Fischer
Ruben Villegas
David R Ha
Honglak Lee
James Davidson
BDL
84
1,435
0
12 Nov 2018
Policy Certificates: Towards Accountable Reinforcement Learning
Christoph Dann
Lihong Li
Wei Wei
Emma Brunskill
OffRL
108
144
0
07 Nov 2018
How to Combine Tree-Search Methods in Reinforcement Learning
Yonathan Efroni
Gal Dalal
B. Scherrer
Shie Mannor
51
31
0
06 Sep 2018
Is Q-learning Provably Efficient?
Chi Jin
Zeyuan Allen-Zhu
Sébastien Bubeck
Michael I. Jordan
OffRL
63
806
0
10 Jul 2018
Feedback-Based Tree Search for Reinforcement Learning
Daniel R. Jiang
E. Ekwedike
Han Liu
98
29
0
15 May 2018
Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs
M. S. Talebi
Odalric-Ambrym Maillard
56
72
0
05 Mar 2018
Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning
Ronan Fruit
Matteo Pirotta
A. Lazaric
R. Ortner
81
116
0
12 Feb 2018
Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning
Baolin Peng
Xiujun Li
Jianfeng Gao
Jingjing Liu
Kam-Fai Wong
Shang-Yu Su
OffRL
60
156
0
18 Jan 2018
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
Christoph Dann
Tor Lattimore
Emma Brunskill
72
309
0
22 Mar 2017
Minimax Regret Bounds for Reinforcement Learning
M. G. Azar
Ian Osband
Rémi Munos
80
774
0
16 Mar 2017
On Lower Bounds for Regret in Reinforcement Learning
Ian Osband
Benjamin Van Roy
79
101
0
09 Aug 2016
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
Ian Osband
Benjamin Van Roy
BDL
76
260
0
01 Jul 2016
Value Iteration Networks
Aviv Tamar
Yi Wu
G. Thomas
Sergey Levine
Pieter Abbeel
76
653
0
09 Feb 2016
(More) Efficient Reinforcement Learning via Posterior Sampling
Ian Osband
Daniel Russo
Benjamin Van Roy
116
533
0
04 Jun 2013
Planning by Prioritized Sweeping with Small Backups
H. V. Seijen
R. Sutton
60
34
0
10 Jan 2013
Incremental Model-based Learners With Formal Learning-Time Guarantees
Alexander L. Strehl
Lihong Li
Michael L. Littman
82
61
0
27 Jun 2012
REGAL: A Regularization based Algorithm for Reinforcement Learning in Weakly Communicating MDPs
Peter L. Bartlett
Ambuj Tewari
89
283
0
09 May 2012
Empirical Bernstein Bounds and Sample Variance Penalization
Andreas Maurer
Massimiliano Pontil
372
542
0
21 Jul 2009