Sample Efficient Actor-Critic with Experience Replay

3 November 2016
Ziyu Wang
V. Bapst
N. Heess
Volodymyr Mnih
Rémi Munos
Koray Kavukcuoglu
Nando de Freitas

Papers citing "Sample Efficient Actor-Critic with Experience Replay"

50 / 136 papers shown
Divergence-Augmented Policy Optimization
Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
OffRL
47
16
0
28 Jan 2025
SAPG: Split and Aggregate Policy Gradients
Jayesh Singla
Ananye Agarwal
Deepak Pathak
OffRL
OnRL
42
3
0
29 Jul 2024
Mimicry and the Emergence of Cooperative Communication
Dylan R. Cope
Peter McBurney
35
0
0
26 May 2024
Multi-agent Reinforcement Learning: A Comprehensive Survey
Dom Huh
Prasant Mohapatra
AI4CE
36
8
0
15 Dec 2023
Efficient Off-Policy Safe Reinforcement Learning Using Trust Region Conditional Value at Risk
Dohyeong Kim
Songhwai Oh
OffRL
29
19
0
01 Dec 2023
All by Myself: Learning Individualized Competitive Behaviour with a Contrastive Reinforcement Learning optimization
Pablo V. A. Barros
A. Sciutti
SSL
33
3
0
02 Oct 2023
Symmetric Replay Training: Enhancing Sample Efficiency in Deep Reinforcement Learning for Combinatorial Optimization
Hyeon-Seob Kim
Minsu Kim
Sungsoo Ahn
Jinkyoo Park
OffRL
44
7
0
02 Jun 2023
VA-learning as a more efficient alternative to Q-learning
Yunhao Tang
Rémi Munos
Mark Rowland
Michal Valko
OffRL
21
6
0
29 May 2023
Utilizing Reinforcement Learning for de novo Drug Design
Hampus Gummesson Svensson
C. Tyrchan
Ola Engkvist
M. Chehreghani
43
17
0
30 Mar 2023
Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning
Brett Daley
Martha White
Chris Amato
Marlos C. Machado
OffRL
25
3
0
26 Jan 2023
DRL-VO: Learning to Navigate Through Crowded Dynamic Scenes Using Velocity Obstacles
Zhanteng Xie
P. Dames
46
61
0
16 Jan 2023
Actor-Director-Critic: A Novel Deep Reinforcement Learning Framework
Zongwei Liu
Yonghong Song
Yuanlin Zhang
OffRL
35
2
0
10 Jan 2023
Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees
Hsin-En Su
Yen-Ju Chen
Ping-Chun Hsieh
Xi Liu
OffRL
26
0
0
10 Dec 2022
Configurable Agent With Reward As Input: A Play-Style Continuum Generation
Pierre Le Pelletier de Woillemont
Rémi Labory
Vincent Corruble
27
10
0
29 Nov 2022
Probing Transfer in Deep Reinforcement Learning without Task Engineering
Andrei A. Rusu
Sebastian Flennerhag
Dushyant Rao
Razvan Pascanu
R. Hadsell
39
6
0
22 Oct 2022
Time-Varying Propensity Score to Bridge the Gap between the Past and Present
Rasool Fakoor
Jonas W. Mueller
Zachary Chase Lipton
Pratik Chaudhari
Alexander J. Smola
OOD
AI4TS
32
3
0
04 Oct 2022
Reinforcement Learning Algorithms: An Overview and Classification
Fadi AlMahamid
Katarina Grolinger
21
40
0
29 Sep 2022
General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States
Francesco Faccio
Aditya A. Ramesh
Vincent Herrmann
J. Harb
Jürgen Schmidhuber
OffRL
44
8
0
04 Jul 2022
Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse
James Queeney
I. Paschalidis
Christos G. Cassandras
OffRL
32
2
0
28 Jun 2022
Analysis of Stochastic Processes through Replay Buffers
Shirli Di-Castro Shashua
Shie Mannor
Dotan Di-Castro
36
6
0
26 Jun 2022
Universally Expressive Communication in Multi-Agent Reinforcement Learning
Matthew Morris
Thomas D. Barrett
Arnu Pretorius
24
4
0
14 Jun 2022
The Sufficiency of Off-Policyness and Soft Clipping: PPO is still Insufficient according to an Off-Policy Measure
Xing Chen
Dongcui Diao
Hechang Chen
Hengshuai Yao
Haiyin Piao
Zhixiao Sun
Zhiwei Yang
Randy Goebel
Bei Jiang
Yi-Ju Chang
OffRL
43
8
0
20 May 2022
Towards biologically plausible Dreaming and Planning in recurrent spiking networks
C. Capone
P. Paolucci
CLL
31
7
0
20 May 2022
Learning to Constrain Policy Optimization with Virtual Trust Region
Hung Le
Thommen Karimpanal George
Majid Abdolshah
D. Nguyen
Kien Do
Sunil R. Gupta
Svetha Venkatesh
36
3
0
20 Apr 2022
Remember and Forget Experience Replay for Multi-Agent Reinforcement Learning
Pascal Weber
Daniel Wälchli
Mustafa Zeqiri
Petros Koumoutsakos
CLL
OffRL
21
7
0
24 Mar 2022
Measuring CLEVRness: Blackbox testing of Visual Reasoning Models
Spyridon Mouselinos
Henryk Michalewski
Mateusz Malinowski
23
3
0
24 Feb 2022
Multi-Modal Legged Locomotion Framework with Automated Residual Reinforcement Learning
Chenxiao Yu
A. Rosendo
29
15
0
24 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
48
2
0
15 Feb 2022
Sequential Bayesian experimental designs via reinforcement learning
Hikaru Asano
OffRL
18
0
0
14 Feb 2022
Autonomous Drone Swarm Navigation and Multi-target Tracking in 3D Environments with Dynamic Obstacles
Suleman Qamar
Dr. Saddam Hussain Khan
Muhammad Arif Arshad
Maryam Qamar
Asifullah Khan
29
16
0
13 Feb 2022
Safe Reinforcement Learning with Chance-constrained Model Predictive Control
Samuel Pfrommer
Tanmay Gautam
Alec Zhou
Somayeh Sojoudi
21
24
0
27 Dec 2021
Improving the Efficiency of Off-Policy Reinforcement Learning by Accounting for Past Decisions
Brett Daley
Chris Amato
OffRL
23
1
0
23 Dec 2021
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
29
168
0
08 Dec 2021
Replay For Safety
Liran Szlak
Ohad Shamir
OffRL
16
0
0
08 Dec 2021
Convergence Results For Q-Learning With Experience Replay
Liran Szlak
Ohad Shamir
OffRL
29
5
0
08 Dec 2021
Learning Emergent Random Access Protocol for LEO Satellite Networks
Ju-Hyung Lee
Hyowoon Seo
Jihong Park
M. Bennis
Young-Chai Ko
30
17
0
03 Dec 2021
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning
Nicolai Dorka
Tim Welschehold
Joschka Boedecker
Wolfram Burgard
OffRL
32
9
0
24 Nov 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
35
10
0
04 Nov 2021
Proximal Policy Optimization with Continuous Bounded Action Space via the Beta Distribution
Irving G. B. Petrazzini
Eric A. Antonelo
OffRL
20
12
0
03 Nov 2021
Off-Policy Correction for Deep Deterministic Policy Gradient Algorithms via Batch Prioritized Experience Replay
Dogan C. Cicek
Enes Duran
Baturay Saglam
Furkan B. Mutlu
Suleyman Serdar Kozat
OffRL
33
11
0
02 Nov 2021
Generalized Proximal Policy Optimization with Sample Reuse
James Queeney
I. Paschalidis
Christos G. Cassandras
OffRL
42
47
0
29 Oct 2021
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm
Raghuram Bharadwaj Diddigi
Prateek Jain
P. J
S. Bhatnagar
CML
OffRL
19
3
0
19 Oct 2021
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
41
2
0
17 Oct 2021
Improving the sample-efficiency of neural architecture search with reinforcement learning
A. Nagy
Ábel Boros
33
3
0
13 Oct 2021
On The Transferability of Deep-Q Networks
M. Sabatelli
Pierre Geurts
37
2
0
06 Oct 2021
Adaptive control of a mechatronic system using constrained residual reinforcement learning
Tom Staessens
Tom Lefebvre
Guillaume Crevecoeur
22
16
0
06 Oct 2021
Dr Jekyll and Mr Hyde: the Strange Case of Off-Policy Policy Updates
Romain Laroche
Rémi Tachet des Combes
46
8
0
29 Sep 2021
Improved Soft Actor-Critic: Mixing Prioritized Off-Policy Samples with On-Policy Experience
C. Banerjee
Zhiyong Chen
N. Noman
19
30
0
24 Sep 2021
Optimal Actor-Critic Policy with Optimized Training Datasets
C. Banerjee
Zhiyong Chen
N. Noman
M. Zamani
OffRL
33
7
0
16 Aug 2021
Deep Reinforcement Learning for Demand Driven Services in Logistics and Transportation Systems: A Survey
Zefang Zong
Tao Feng
Tong Xia
Depeng Jin
Yong Li
27
3
0
10 Aug 2021