2008.00483
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
2 August 2020
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
Papers citing "Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy"
30 / 30 papers shown
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality
Ruijia Zhang
Siliang Zeng
Chenliang Li
Alfredo García
Mingyi Hong
67
0
0
22 Mar 2025
On The Global Convergence Of Online RLHF With Neural Parametrization
Mudit Gaur
Amrit Singh Bedi
Raghu Pasupathy
Vaneet Aggarwal
28
0
0
21 Oct 2024
On the Second-Order Convergence of Biased Policy Gradient Algorithms
Siqiao Mu
Diego Klabjan
50
2
0
05 Nov 2023
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
Hang Wang
Sen Lin
Junshan Zhang
OffRL
OnRL
33
3
0
20 Jun 2023
On the Global Convergence of Natural Actor-Critic with Two-layer Neural Network Parametrization
Mudit Gaur
Amrit Singh Bedi
Di-di Wang
Vaneet Aggarwal
40
3
0
18 Jun 2023
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
Sharan Vaswani
A. Kazemi
Reza Babanezhad
Nicolas Le Roux
OffRL
32
3
0
24 May 2023
Policy Gradient Converges to the Globally Optimal Policy for Nearly Linear-Quadratic Regulators
Yin-Huan Han
Meisam Razaviyayn
Renyuan Xu
27
5
0
15 Mar 2023
Finite-time analysis of single-timescale actor-critic
Xu-yang Chen
Lin Zhao
OffRL
29
21
0
18 Oct 2022
Global Convergence of Two-timescale Actor-Critic for Solving Linear Quadratic Regulator
Xu-yang Chen
Jingliang Duan
Yingbin Liang
Lin Zhao
29
6
0
18 Aug 2022
Towards Global Optimality in Cooperative MARL with the Transformation And Distillation Framework
Jianing Ye
Chenghao Li
Jianhao Wang
Chongjie Zhang
45
2
0
12 Jul 2022
A Single-Timescale Analysis For Stochastic Approximation With Multiple Coupled Sequences
Han Shen
Tianyi Chen
45
15
0
21 Jun 2022
Finite-Time Analysis of Fully Decentralized Single-Timescale Actor-Critic
Qijun Luo
Xiao Li
32
1
0
12 Jun 2022
A Small Gain Analysis of Single Timescale Actor Critic
Alexander Olshevsky
Bahman Gharesifard
33
20
0
04 Mar 2022
Single Time-scale Actor-critic Method to Solve the Linear Quadratic Regulator with Convergence Guarantees
Mo Zhou
Jianfeng Lu
29
13
0
31 Jan 2022
STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence
Liang Xu
Daoming Lyu
Yangchen Pan
Aiwen Jiang
Bo Liu
33
0
0
24 Jan 2022
Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic
Yufeng Zhang
Siyu Chen
Zhuoran Yang
Michael I. Jordan
Zhaoran Wang
68
4
0
27 Dec 2021
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
29
167
0
08 Dec 2021
On the Global Optimum Convergence of Momentum-based Policy Gradient
Yuhao Ding
Junzi Zhang
Javad Lavaei
32
16
0
19 Oct 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
29
113
0
19 Aug 2021
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach
Haotian Gu
Xin Guo
Xiaoli Wei
Renyuan Xu
OOD
40
36
0
05 Aug 2021
Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems
Tianyi Chen
Yuejiao Sun
W. Yin
26
33
0
25 Jun 2021
On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
Jiawei Huang
Nan Jiang
16
5
0
02 Jun 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
47
24
0
23 Feb 2021
Finite Sample Analysis of Two-Time-Scale Natural Actor-Critic Algorithm
S. Khodadadian
Thinh T. Doan
Justin Romberg
S. T. Maguluri
35
42
0
26 Jan 2021
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
Han Shen
Kaipeng Zhang
Min-Fong Hong
Tianyi Chen
35
28
0
31 Dec 2020
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
27
346
0
30 Dec 2020
Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy
Han Zhong
Xun Deng
Ethan X. Fang
Zhuoran Yang
Zhaoran Wang
Runze Li
24
3
0
28 Dec 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
42
122
0
11 Nov 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
37
100
0
22 Oct 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
90
146
0
04 May 2020