Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2102.08607
Cited By
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
17 February 2021
Junyu Zhang
Chengzhuo Ni
Zheng Yu
Csaba Szepesvári
Mengdi Wang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method"
21 / 21 papers shown
Title
Online Episodic Convex Reinforcement Learning
B. Moreno
Khaled Eldowa
Pierre Gaillard
Margaux Brégère
Nadia Oudjane
OffRL
29
0
0
12 May 2025
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
Jincheng Mei
Bo Dai
Alekh Agarwal
Sharan Vaswani
Anant Raj
Csaba Szepesvári
Dale Schuurmans
89
0
0
11 Feb 2025
From Gradient Clipping to Normalization for Heavy Tailed SGD
Florian Hübler
Ilyas Fatkhullin
Niao He
40
5
0
17 Oct 2024
MetaCURL: Non-stationary Concave Utility Reinforcement Learning
B. Moreno
Margaux Brégère
Pierre Gaillard
Nadia Oudjane
OffRL
39
0
0
30 May 2024
Order-Optimal Regret with Novel Policy Gradient Approaches in Infinite-Horizon Average Reward MDPs
Swetha Ganesh
Washim Uddin Mondal
Vaneet Aggarwal
49
3
0
02 Apr 2024
Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries
Swetha Ganesh
Jiayu Chen
Gugan Thoppe
Vaneet Aggarwal
FedML
66
1
0
15 Mar 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang
Haizhao Yang
14
1
0
23 Jan 2024
Efficiently Escaping Saddle Points for Non-Convex Policy Optimization
Sadegh Khorasani
Saber Salehkaleybar
Negar Kiyavash
Niao He
Matthias Grossglauser
29
1
0
15 Nov 2023
Regret Analysis of Policy Gradient Algorithm for Infinite Horizon Average Reward Markov Decision Processes
Qinbo Bai
Washim Uddin Mondal
Vaneet Aggarwal
34
9
0
05 Sep 2023
An Adaptive Optimization Approach to Personalized Financial Incentives in Mobile Behavioral Weight Loss Interventions
Qiaomei Li
Kara L. Gavin
Corrine L. Voils
Yonatan Dov Mintz
11
0
0
01 Jul 2023
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Xiang Ji
Gen Li
OffRL
32
7
0
24 May 2023
Scalable Multi-Agent Reinforcement Learning with General Utilities
Donghao Ying
Yuhao Ding
Alec Koppel
Javad Lavaei
38
1
0
15 Feb 2023
SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search
Gal Dalal
Assaf Hallak
Gugan Thoppe
Shie Mannor
Gal Chechik
29
3
0
30 Jan 2023
Stochastic Dimension-reduced Second-order Methods for Policy Optimization
Jinsong Liu
Chen Xie
Qinwen Deng
Dongdong Ge
Yi-Li Ye
29
1
0
28 Jan 2023
The Role of Baselines in Policy Gradient Optimization
Jincheng Mei
Wesley Chung
Valentin Thomas
Bo Dai
Csaba Szepesvári
Dale Schuurmans
29
15
0
16 Jan 2023
Variance-Reduced Conservative Policy Iteration
Naman Agarwal
Brian Bullins
Karan Singh
29
3
0
12 Dec 2022
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm
Qinbo Bai
Amrit Singh Bedi
Vaneet Aggarwal
26
20
0
12 Jun 2022
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
Matilde Gargiani
Andrea Zanelli
Andrea Martinelli
Tyler H. Summers
John Lygeros
33
14
0
01 Feb 2022
Concave Utility Reinforcement Learning with Zero-Constraint Violations
Mridul Agarwal
Qinbo Bai
Vaneet Aggarwal
36
12
0
12 Sep 2021
A general sample complexity analysis of vanilla policy gradient
Rui Yuan
Robert Mansel Gower
A. Lazaric
76
62
0
23 Jul 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
91
136
0
30 Jan 2021
1