2007.08459
PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning
16 July 2020
Alekh Agarwal
Mikael Henaff
Sham Kakade
Wen Sun
OffRL
Papers citing
"PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning"
14 / 14 papers shown
Ordering-based Conditions for Global Convergence of Policy Gradient Methods
Jincheng Mei
Bo Dai
Alekh Agarwal
Mohammad Ghavamzadeh
Csaba Szepesvári
Dale Schuurmans
98
4
0
02 Apr 2025
A Model Selection Approach for Corruption Robust Reinforcement Learning
Chen-Yu Wei
Christoph Dann
Julian Zimmert
99
44
0
31 Dec 2024
Random Latent Exploration for Deep Reinforcement Learning
Srinath Mahankali
Zhang-Wei Hong
Ayush Sekhari
Alexander Rakhlin
Pulkit Agrawal
139
3
0
18 Jul 2024
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
Dongruo Zhou
Jiafan He
Quanquan Gu
48
134
0
23 Jun 2020
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
Alekh Agarwal
Sham Kakade
A. Krishnamurthy
Wen Sun
OffRL
105
225
0
18 Jun 2020
Model-Based Reinforcement Learning with Value-Targeted Regression
Alex Ayoub
Zeyu Jia
Csaba Szepesvári
Mengdi Wang
Lin F. Yang
OffRL
78
301
0
01 Jun 2020
Explicit Explore-Exploit Algorithms in Continuous State Spaces
Mikael Henaff
OffRL
47
31
0
01 Nov 2019
Exploration-Enhanced POLITEX
Yasin Abbasi-Yadkori
N. Lazić
Csaba Szepesvári
Gellert Weisz
30
23
0
27 Aug 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
51
111
0
25 Jun 2019
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Lin F. Yang
Mengdi Wang
OffRL
GP
55
284
0
24 May 2019
A Theory of Regularized Markov Decision Processes
Matthieu Geist
B. Scherrer
Olivier Pietquin
92
317
0
31 Jan 2019
Curiosity-driven Exploration by Self-supervised Prediction
Deepak Pathak
Pulkit Agrawal
Alexei A. Efros
Trevor Darrell
LRM
SSL
106
2,423
0
15 May 2017
OpenAI Gym
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL
ODL
186
5,056
0
05 Jun 2016
Finite-Time Analysis of Kernelised Contextual Bandits
Michal Valko
N. Korda
Rémi Munos
I. Flaounas
N. Cristianini
133
271
0
26 Sep 2013