2007.11849
Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation
23 July 2020
Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Rahul Jain
Papers citing
"Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation"
26 / 26 papers shown
Title
Improved Analysis of UCRL2 with Empirical Bernstein Inequality
Ronan Fruit
Matteo Pirotta
A. Lazaric
34
33
0
10 Jul 2020
Online learning in MDPs with linear function approximation and bandit feedback
Gergely Neu
Julia Olkhovskaya
36
32
0
03 Jul 2020
Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension
Ruosong Wang
Ruslan Salakhutdinov
Lin F. Yang
62
55
0
21 May 2020
Learning Near Optimal Policies with Low Inherent Bellman Error
Andrea Zanette
A. Lazaric
Mykel Kochenderfer
Emma Brunskill
OffRL
71
222
0
29 Feb 2020
Adaptive Approximate Policy Iteration
Botao Hao
N. Lazić
Yasin Abbasi-Yadkori
Pooria Joulani
Csaba Szepesvári
61
14
0
08 Feb 2020
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Hiteshi Sharma
R. Jain
132
106
0
15 Oct 2019
$\sqrt{n}$-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank
Kefan Dong
Jian-wei Peng
Yining Wang
Yuanshuo Zhou
OffRL
53
36
0
05 Sep 2019
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
Lingxiao Wang
Qi Cai
Zhuoran Yang
Zhaoran Wang
85
241
0
29 Aug 2019
Exploration-Enhanced POLITEX
Yasin Abbasi-Yadkori
N. Lazić
Csaba Szepesvári
Gellert Weisz
52
23
0
27 Aug 2019
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Alekh Agarwal
Sham Kakade
Jason D. Lee
G. Mahajan
69
321
0
01 Aug 2019
Provably Efficient Reinforcement Learning with Linear Function Approximation
Chi Jin
Zhuoran Yang
Zhaoran Wang
Michael I. Jordan
96
557
0
11 Jul 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
73
111
0
25 Jun 2019
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Zihan Zhang
Xiangyang Ji
60
72
0
12 Jun 2019
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Lin F. Yang
Mengdi Wang
OffRL
GP
62
286
0
24 May 2019
Regret Bounds for Reinforcement Learning via Markov Chain Concentration
R. Ortner
67
46
0
06 Aug 2018
Scalable Bilinear $\pi$ Learning Using State and Action Features
Yichen Chen
Lihong Li
Mengdi Wang
64
46
0
27 Apr 2018
Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs
M. S. Talebi
Odalric-Ambrym Maillard
56
72
0
05 Mar 2018
Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning
Ronan Fruit
Matteo Pirotta
A. Lazaric
R. Ortner
86
116
0
12 Feb 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
517
19,065
0
20 Jul 2017
A unified view of entropy-regularized Markov decision processes
Gergely Neu
Anders Jonsson
Vicenç Gómez
97
263
0
22 May 2017
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
199
8,859
0
04 Feb 2016
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
277
6,776
0
19 Feb 2015
Generalization and Exploration via Randomized Value Functions
Ian Osband
Benjamin Van Roy
Zheng Wen
79
314
0
04 Feb 2014
Volumetric Spanners: an Efficient Exploration Basis for Learning
Elad Hazan
Zohar Karnin
Raghu Meka
255
97
0
21 Dec 2013
REGAL: A Regularization based Algorithm for Reinforcement Learning in Weakly Communicating MDPs
Peter L. Bartlett
Ambuj Tewari
91
284
0
09 May 2012
Towards minimax policies for online linear optimization with bandit feedback
Sébastien Bubeck
Nicolò Cesa-Bianchi
Sham Kakade
OffRL
283
150
0
14 Feb 2012