Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy

2 August 2020
Zuyue Fu
Zhuoran Yang
Zhaoran Wang

Papers citing "Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy"

30 / 30 papers shown
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality
Ruijia Zhang
Siliang Zeng
Chenliang Li
Alfredo García
Mingyi Hong
67
0
0
22 Mar 2025
On The Global Convergence Of Online RLHF With Neural Parametrization
Mudit Gaur
Amrit Singh Bedi
Raghu Pasupathy
Vaneet Aggarwal
28
0
0
21 Oct 2024
On the Second-Order Convergence of Biased Policy Gradient Algorithms
Siqiao Mu
Diego Klabjan
50
2
0
05 Nov 2023
Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
Hang Wang
Sen Lin
Junshan Zhang
OffRL
OnRL
33
3
0
20 Jun 2023
On the Global Convergence of Natural Actor-Critic with Two-layer Neural Network Parametrization
Mudit Gaur
Amrit Singh Bedi
Di-di Wang
Vaneet Aggarwal
40
3
0
18 Jun 2023
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
Sharan Vaswani
A. Kazemi
Reza Babanezhad
Nicolas Le Roux
OffRL
32
3
0
24 May 2023
Policy Gradient Converges to the Globally Optimal Policy for Nearly Linear-Quadratic Regulators
Yinbin Han
Meisam Razaviyayn
Renyuan Xu
27
5
0
15 Mar 2023
Finite-time analysis of single-timescale actor-critic
Xuyang Chen
Lin Zhao
OffRL
29
21
0
18 Oct 2022
Global Convergence of Two-timescale Actor-Critic for Solving Linear Quadratic Regulator
Xuyang Chen
Jingliang Duan
Yingbin Liang
Lin Zhao
29
6
0
18 Aug 2022
Towards Global Optimality in Cooperative MARL with the Transformation And Distillation Framework
Jianing Ye
Chenghao Li
Jianhao Wang
Chongjie Zhang
45
2
0
12 Jul 2022
A Single-Timescale Analysis For Stochastic Approximation With Multiple Coupled Sequences
Han Shen
Tianyi Chen
45
15
0
21 Jun 2022
Finite-Time Analysis of Fully Decentralized Single-Timescale Actor-Critic
Qijun Luo
Xiao Li
32
1
0
12 Jun 2022
A Small Gain Analysis of Single Timescale Actor Critic
Alexander Olshevsky
Bahman Gharesifard
33
20
0
04 Mar 2022
Single Time-scale Actor-critic Method to Solve the Linear Quadratic Regulator with Convergence Guarantees
Mo Zhou
Jianfeng Lu
29
13
0
31 Jan 2022
STOPS: Short-Term-based Volatility-controlled Policy Search and its Global Convergence
Liang Xu
Daoming Lyu
Yangchen Pan
Aiwen Jiang
Bo Liu
33
0
0
24 Jan 2022
Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic
Yufeng Zhang
Siyu Chen
Zhuoran Yang
Michael I. Jordan
Zhaoran Wang
68
4
0
27 Dec 2021
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
29
167
0
08 Dec 2021
On the Global Optimum Convergence of Momentum-based Policy Gradient
Yuhao Ding
Junzi Zhang
Javad Lavaei
32
16
0
19 Oct 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
29
113
0
19 Aug 2021
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach
Haotian Gu
Xin Guo
Xiaoli Wei
Renyuan Xu
OOD
40
36
0
05 Aug 2021
Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems
Tianyi Chen
Yuejiao Sun
W. Yin
26
33
0
25 Jun 2021
On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
Jiawei Huang
Nan Jiang
16
5
0
02 Jun 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
47
24
0
23 Feb 2021
Finite Sample Analysis of Two-Time-Scale Natural Actor-Critic Algorithm
S. Khodadadian
Thinh T. Doan
Justin Romberg
S. T. Maguluri
35
42
0
26 Jan 2021
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup
Han Shen
Kaiqing Zhang
Mingyi Hong
Tianyi Chen
35
28
0
31 Dec 2020
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
27
346
0
30 Dec 2020
Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy
Han Zhong
Xun Deng
Ethan X. Fang
Zhuoran Yang
Zhaoran Wang
Runze Li
24
3
0
28 Dec 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
42
122
0
11 Nov 2020
Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang
Jongho Kim
Brendan O'Donoghue
Stephen P. Boyd
37
100
0
22 Oct 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods
Yue Wu
Weitong Zhang
Pan Xu
Quanquan Gu
90
146
0
04 May 2020