SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation

29 December 2017
Bo Dai
Albert Eaton Shaw
Lihong Li
Lin Xiao
Niao He
Zhen Liu
Jianshu Chen
Le Song

Papers citing "SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation"

10 / 10 papers shown
A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems
Jiawei Zhang
Peijun Xiao
Ruoyu Sun
Zhi-Quan Luo
33
97
0
29 Oct 2020
Parameterized MDPs and Reinforcement Learning Problems -- A Maximum Entropy Principle Based Framework
Amber Srivastava
S. Salapaka
6
11
0
17 Jun 2020
On Computation and Generalization of Generative Adversarial Imitation Learning
Minshuo Chen
Yizhou Wang
Tianyi Liu
Zhuoran Yang
Xingguo Li
Zhaoran Wang
T. Zhao
35
40
0
09 Jan 2020
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
OffRL
22
67
0
16 Oct 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
24
108
0
25 Jun 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
11
328
0
10 Jun 2019
Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
Hoi-To Wai
Zhuoran Yang
Zhaoran Wang
Mingyi Hong
30
169
0
03 Jun 2018
Path Consistency Learning in Tsallis Entropy Regularized MDPs
Ofir Nachum
Yinlam Chow
Mohammad Ghavamzadeh
13
45
0
10 Feb 2018
An Alternative Softmax Operator for Reinforcement Learning
Kavosh Asadi
Michael L. Littman
20
10
0
16 Dec 2016
Learning from Conditional Distributions via Dual Embeddings
Bo Dai
Niao He
Yunpeng Pan
Byron Boots
Le Song
35
21
0
15 Jul 2016