ResearchTrend.AI
Addressing Function Approximation Error in Actor-Critic Methods

26 February 2018
Scott Fujimoto
Herke van Hoof
David Meger
    OffRL

Papers citing "Addressing Function Approximation Error in Actor-Critic Methods"

50 / 2,180 papers shown
Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance
Jakob J. Hollenstein
Sayantan Auddy
Matteo Saveriano
Erwan Renaudo
J. Piater
86
21
0
08 Jun 2022
Meta-Learning Parameterized Skills
Haotian Fu
Shangqun Yu
Saket Tiwari
Michael Littman
George Konidaris
115
6
0
07 Jun 2022
On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning
Mandi Zhao
Pieter Abbeel
Stephen James
OffRL
149
34
0
07 Jun 2022
Introspective Experience Replay: Look Back When Surprised
Ramnath Kumar
Dheeraj M. Nagaraj
OffRL
65
2
0
07 Jun 2022
Robust Adversarial Attacks Detection based on Explainable Deep Reinforcement Learning For UAV Guidance and Planning
Tom Hickling
Nabil Aouf
P. Spencer
AAML
32
54
0
06 Jun 2022
Offline RL for Natural Language Generation with Implicit Language Q Learning
Charles Burton Snell
Ilya Kostrikov
Yi Su
Mengjiao Yang
Sergey Levine
OffRL
221
115
0
05 Jun 2022
ARC - Actor Residual Critic for Adversarial Imitation Learning
A. Deka
Changliu Liu
Katia Sycara
108
5
0
05 Jun 2022
Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Rameswar Panda
Marc G. Bellemare
OffRL OnRL
126
66
0
03 Jun 2022
HEX: Human-in-the-loop Explainability via Deep Reinforcement Learning
Michael T. Lash
86
0
0
02 Jun 2022
Equivariant Reinforcement Learning for Quadrotor UAV
Beomyeol Yu
Taeyoung Lee
95
8
0
02 Jun 2022
Deep Transformer Q-Networks for Partially Observable Reinforcement Learning
Kevin Esslinger
Robert Platt
Chris Amato
OffRL
82
38
0
02 Jun 2022
Model Generation with Provable Coverability for Offline Reinforcement Learning
Chengxing Jia
Hao Yin
Chenxiao Gao
Tian Xu
Lei Yuan
Zongzhang Zhang
Yang Yu
OffRL
63
0
0
01 Jun 2022
ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor
Wanqi Xue
Qingpeng Cai
Ruohan Zhan
Dong Zheng
Peng Jiang
Kun Gai
Bo An
OffRL
71
25
0
01 Jun 2022
Lessons Learned from Data-Driven Building Control Experiments: Contrasting Gaussian Process-based MPC, Bilevel DeePC, and Deep Reinforcement Learning
L. D. Natale
Yingzhao Lian
E. Maddalena
Jicheng Shi
Colin N. Jones
55
19
0
31 May 2022
Truly Deterministic Policy Optimization
Ehsan Saleh
Saba Ghaffari
Timothy Bretl
Matthew West
OffRL
57
3
0
30 May 2022
SEREN: Knowing When to Explore and When to Exploit
Changmin Yu
D. Mguni
Dong Li
Aivar Sootla
Jun Wang
Neil Burgess
48
1
0
30 May 2022
RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch
Y. Tan
Pihe Hu
L. Pan
Jiatai Huang
Longbo Huang
OffRL
69
24
0
30 May 2022
On the Robustness of Safe Reinforcement Learning under Observational Perturbations
Zuxin Liu
Zijian Guo
Zhepeng Cen
Huan Zhang
Jie Tan
Yue Liu
Ding Zhao
OOD OffRL
100
37
0
29 May 2022
Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning
Qiang He
Huangyuan Su
Jieyu Zhang
Xinwen Hou
OOD OffRL
61
7
0
29 May 2022
Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters
Seyed Kamyar Seyed Ghasemipour
S. Gu
Ofir Nachum
OffRL
90
72
0
27 May 2022
Physics-Guided Hierarchical Reward Mechanism for Learning-Based Robotic Grasping
Yunsik Jung
Lingfeng Tao
Michael Bowman
Jiucai Zhang
Xiaoli Zhang
11
0
0
26 May 2022
Constrained Reinforcement Learning for Short Video Recommendation
Qingpeng Cai
Ruohan Zhan
Chi Zhang
Jie Zheng
Guangwei Ding
Pinghua Gong
Dong Zheng
Peng Jiang
67
6
0
26 May 2022
Skill Machines: Temporal Logic Skill Composition in Reinforcement Learning
Geraud Nangue Tasse
Devon Jarvis
Steven D. James
Benjamin Rosman
87
5
0
25 May 2022
Concurrent Credit Assignment for Data-efficient Reinforcement Learning
Emmanuel Daucé
11
2
0
24 May 2022
Cooperative Reinforcement Learning on Traffic Signal Control
C. Chao
J. Hsieh
Bo Wang
15
0
0
23 May 2022
When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning
Jianxiong Li
Xianyuan Zhan
Haoran Xu
Xiangyu Zhu
Jingjing Liu
Ya Zhang
OffRL
80
26
0
23 May 2022
Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation
Qingfeng Lan
Yangchen Pan
Jun Luo
A. R. Mahmood
OffRL
113
8
0
22 May 2022
The Sufficiency of Off-Policyness and Soft Clipping: PPO is still Insufficient according to an Off-Policy Measure
Xing Chen
Dongcui Diao
Hechang Chen
Hengshuai Yao
Haiyin Piao
Zhixiao Sun
Zhiwei Yang
Randy Goebel
Bei Jiang
Yi-Ju Chang
OffRL
142
9
0
20 May 2022
On Jointly Optimizing Partial Offloading and SFC Mapping: A Cooperative Dual-agent Deep Reinforcement Learning Approach
Xinhan Wang
Huanlai Xing
Fuhong Song
Shouxi Luo
Penglin Dai
Bowen Zhao
43
11
0
20 May 2022
Dexterous Robotic Manipulation using Deep Reinforcement Learning and Knowledge Transfer for Complex Sparse Reward-based Tasks
Qiang Wang
Francisco Roldan Sanchez
Robert McCarthy
David Córdova Bulens
Kevin McGuinness
Noel E. O'Connor
M. Wuthrich
Felix Widmaier
Stefan Bauer
S. Redmond
107
15
0
19 May 2022
Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks
Ryan M Sander
Wilko Schwarting
Tim Seyde
Igor Gilitschenski
S. Karaman
Daniela Rus
69
2
0
18 May 2022
Policy Distillation with Selective Input Gradient Regularization for Efficient Interpretability
Jinwei Xing
Takashi Nagata
Xinyun Zou
Emre Neftci
J. Krichmar
AAML
61
4
0
18 May 2022
Qualitative Differences Between Evolutionary Strategies and Reinforcement Learning Methods for Control of Autonomous Agents
Nicola Milano
S. Nolfi
59
0
0
16 May 2022
Reachability Constrained Reinforcement Learning
Dongjie Yu
Haitong Ma
Sheng Li
Jianyu Chen
117
60
0
16 May 2022
Enforcing KL Regularization in General Tsallis Entropy Reinforcement Learning via Advantage Learning
Lingwei Zhu
Zheng Chen
E. Uchibe
Takamitsu Matsubara
29
1
0
16 May 2022
Provably Safe Reinforcement Learning: Conceptual Analysis, Survey, and Benchmarking
Hanna Krasowski
Jakob Thumm
Marlon Müller
Lukas Schäfer
Xiao Wang
Matthias Althoff
125
23
0
13 May 2022
Simultaneous Double Q-learning with Conservative Advantage Learning for Actor-Critic Methods
Qing Li
Wen-gang Zhou
Zhenbo Lu
Houqiang Li
OffRL
32
2
0
08 May 2022
Dynamically writing coupled memories using a reinforcement learning agent, meeting physical bounds
Théo Jules
Laura Michel
A. Douin
F. Lechenault
AI4CE
16
0
0
06 May 2022
Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics
Sizhe Li
Zhiao Huang
Tao Du
Hao Su
J. Tenenbaum
Chuang Gan
63
27
0
05 May 2022
Generative methods for sampling transition paths in molecular dynamics
T. Lelièvre
Geneviève Robin
Inass Sekkat
G. Stoltz
Gabriel Victorino Cardoso
GAN
46
9
0
05 May 2022
Using Deep Reinforcement Learning to solve Optimal Power Flow problem with generator failures
Muhammad Awais
36
0
0
04 May 2022
Real-time Cooperative Vehicle Coordination at Unsignalized Road Intersections
Jiping Luo
Tingting Zhang
Rui Hao
Donglin Li
Chunsheng Chen
Zhenyu Na
Qinyu Zhang
27
23
0
03 May 2022
Cost Effective MLaaS Federation: A Combinatorial Reinforcement Learning Approach
Shuzhao Xie
Yuan Xue
Yifei Zhu
Zhi Wang
FedML
53
12
0
29 Apr 2022
Bilinear value networks
Zhang-Wei Hong
Ge Yang
Pulkit Agrawal
OffRL
75
8
0
28 Apr 2022
A Computational Theory of Learning Flexible Reward-Seeking Behavior with Place Cells
Yuan Z Gao
AI4CE
36
0
0
22 Apr 2022
TASAC: a twin-actor reinforcement learning framework with stochastic policy for batch process control
Tanuja Joshi
H. Kodamana
Harikumar Kandath
N. Kaisare
OffRL
35
0
0
22 Apr 2022
Learning to Fold Real Garments with One Arm: A Case Study in Cloud-Based Robotics Research
Ryan Hoque
K. Shivakumar
Shrey Aeron
Gabriel Deza
Aditya Ganapathi
Adrian S. Wong
Johnny Lee
Andy Zeng
Vincent Vanhoucke
Ken Goldberg
95
23
0
21 Apr 2022
SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics
Yannis Flet-Berliac
D. Basu
AAML
81
8
0
20 Apr 2022
Non-Parallel Text Style Transfer with Self-Parallel Supervision
Ruibo Liu
Chongyang Gao
Chenyan Jia
Guangxuan Xu
Soroush Vosoughi
VLM
84
16
0
18 Apr 2022
Exploiting Embodied Simulation to Detect Novel Object Classes Through Interaction
Nikhil Krishnaswamy
Sadaf Ghaffari
50
4
0
17 Apr 2022