1611.01224
Sample Efficient Actor-Critic with Experience Replay
3 November 2016
Ziyu Wang
V. Bapst
N. Heess
Volodymyr Mnih
Rémi Munos
Koray Kavukcuoglu
Nando de Freitas
Papers citing
"Sample Efficient Actor-Critic with Experience Replay"
50 / 136 papers shown
Divergence-Augmented Policy Optimization
Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
OffRL
47
16
0
28 Jan 2025
SAPG: Split and Aggregate Policy Gradients
Jayesh Singla
Ananye Agarwal
Deepak Pathak
OffRL
OnRL
42
3
0
29 Jul 2024
Mimicry and the Emergence of Cooperative Communication
Dylan R. Cope
Peter McBurney
35
0
0
26 May 2024
Multi-agent Reinforcement Learning: A Comprehensive Survey
Dom Huh
Prasant Mohapatra
AI4CE
36
8
0
15 Dec 2023
Efficient Off-Policy Safe Reinforcement Learning Using Trust Region Conditional Value at Risk
Dohyeong Kim
Songhwai Oh
OffRL
29
19
0
01 Dec 2023
All by Myself: Learning Individualized Competitive Behaviour with a Contrastive Reinforcement Learning optimization
Pablo V. A. Barros
A. Sciutti
SSL
33
3
0
02 Oct 2023
Symmetric Replay Training: Enhancing Sample Efficiency in Deep Reinforcement Learning for Combinatorial Optimization
Hyeon-Seob Kim
Minsu Kim
SungSoo Ahn
Jinkyoo Park
OffRL
44
7
0
02 Jun 2023
VA-learning as a more efficient alternative to Q-learning
Yunhao Tang
Rémi Munos
Mark Rowland
Michal Valko
OffRL
21
6
0
29 May 2023
Utilizing Reinforcement Learning for de novo Drug Design
Hampus Gummesson Svensson
C. Tyrchan
Ola Engkvist
M. Chehreghani
43
17
0
30 Mar 2023
Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning
Brett Daley
Martha White
Chris Amato
Marlos C. Machado
OffRL
25
3
0
26 Jan 2023
DRL-VO: Learning to Navigate Through Crowded Dynamic Scenes Using Velocity Obstacles
Zhanteng Xie
P. Dames
46
61
0
16 Jan 2023
Actor-Director-Critic: A Novel Deep Reinforcement Learning Framework
Zongwei Liu
Yonghong Song
Yuanlin Zhang
OffRL
37
2
0
10 Jan 2023
Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees
Hsin-En Su
Yen-Ju Chen
Ping-Chun Hsieh
Xi Liu
OffRL
28
0
0
10 Dec 2022
Configurable Agent With Reward As Input: A Play-Style Continuum Generation
Pierre Le Pelletier de Woillemont
Rémi Labory
Vincent Corruble
30
10
0
29 Nov 2022
Probing Transfer in Deep Reinforcement Learning without Task Engineering
Andrei A. Rusu
Sebastian Flennerhag
Dushyant Rao
Razvan Pascanu
R. Hadsell
39
6
0
22 Oct 2022
Time-Varying Propensity Score to Bridge the Gap between the Past and Present
Rasool Fakoor
Jonas W. Mueller
Zachary Chase Lipton
Pratik Chaudhari
Alexander J. Smola
OOD
AI4TS
34
3
0
04 Oct 2022
Reinforcement Learning Algorithms: An Overview and Classification
Fadi AlMahamid
Katarina Grolinger
21
40
0
29 Sep 2022
General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States
Francesco Faccio
Aditya A. Ramesh
Vincent Herrmann
J. Harb
Jürgen Schmidhuber
OffRL
44
8
0
04 Jul 2022
Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse
James Queeney
I. Paschalidis
Christos G. Cassandras
OffRL
32
2
0
28 Jun 2022
Analysis of Stochastic Processes through Replay Buffers
Shirli Di-Castro Shashua
Shie Mannor
Dotan Di-Castro
36
6
0
26 Jun 2022
Universally Expressive Communication in Multi-Agent Reinforcement Learning
Matthew Morris
Thomas D. Barrett
Arnu Pretorius
24
4
0
14 Jun 2022
The Sufficiency of Off-Policyness and Soft Clipping: PPO is still Insufficient according to an Off-Policy Measure
Xing Chen
Dongcui Diao
Hechang Chen
Hengshuai Yao
Haiyin Piao
Zhixiao Sun
Zhiwei Yang
Randy Goebel
Bei Jiang
Yi-Ju Chang
OffRL
43
8
0
20 May 2022
Towards biologically plausible Dreaming and Planning in recurrent spiking networks
C. Capone
P. Paolucci
CLL
31
7
0
20 May 2022
Learning to Constrain Policy Optimization with Virtual Trust Region
Hung Le
Thommen Karimpanal George
Majid Abdolshah
D. Nguyen
Kien Do
Sunil R. Gupta
Svetha Venkatesh
36
3
0
20 Apr 2022
Remember and Forget Experience Replay for Multi-Agent Reinforcement Learning
Pascal Weber
Daniel Wälchli
Mustafa Zeqiri
Petros Koumoutsakos
CLL
OffRL
21
7
0
24 Mar 2022
Measuring CLEVRness: Blackbox testing of Visual Reasoning Models
Spyridon Mouselinos
Henryk Michalewski
Mateusz Malinowski
23
3
0
24 Feb 2022
Multi-Modal Legged Locomotion Framework with Automated Residual Reinforcement Learning
Chenxiao Yu
A. Rosendo
29
15
0
24 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
48
2
0
15 Feb 2022
Sequential Bayesian experimental designs via reinforcement learning
Hikaru Asano
OffRL
18
0
0
14 Feb 2022
Autonomous Drone Swarm Navigation and Multi-target Tracking in 3D Environments with Dynamic Obstacles
Suleman Qamar
Dr. Saddam Hussain Khan
Muhammad Arif Arshad
Maryam Qamar
Asifullah Khan
29
16
0
13 Feb 2022
Safe Reinforcement Learning with Chance-constrained Model Predictive Control
Samuel Pfrommer
Tanmay Gautam
Alec Zhou
Somayeh Sojoudi
21
24
0
27 Dec 2021
Improving the Efficiency of Off-Policy Reinforcement Learning by Accounting for Past Decisions
Brett Daley
Chris Amato
OffRL
23
1
0
23 Dec 2021
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
29
168
0
08 Dec 2021
Replay For Safety
Liran Szlak
Ohad Shamir
OffRL
16
0
0
08 Dec 2021
Convergence Results For Q-Learning With Experience Replay
Liran Szlak
Ohad Shamir
OffRL
31
5
0
08 Dec 2021
Learning Emergent Random Access Protocol for LEO Satellite Networks
Ju-Hyung Lee
Hyowoon Seo
Jihong Park
M. Bennis
Young-Chai Ko
30
17
0
03 Dec 2021
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning
Nicolai Dorka
Tim Welschehold
Joschka Boedecker
Wolfram Burgard
OffRL
32
9
0
24 Nov 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
35
10
0
04 Nov 2021
Proximal Policy Optimization with Continuous Bounded Action Space via the Beta Distribution
Irving G. B. Petrazzini
Eric A. Antonelo
OffRL
20
12
0
03 Nov 2021
Off-Policy Correction for Deep Deterministic Policy Gradient Algorithms via Batch Prioritized Experience Replay
Dogan C. Cicek
Enes Duran
Baturay Saglam
Furkan B. Mutlu
Suleyman Serdar Kozat
OffRL
33
11
0
02 Nov 2021
Generalized Proximal Policy Optimization with Sample Reuse
James Queeney
I. Paschalidis
Christos G. Cassandras
OffRL
42
47
0
29 Oct 2021
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm
Raghuram Bharadwaj Diddigi
Prateek Jain
P. J
S. Bhatnagar
CML
OffRL
19
3
0
19 Oct 2021
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
41
2
0
17 Oct 2021
Improving the sample-efficiency of neural architecture search with reinforcement learning
A. Nagy
Ábel Boros
33
3
0
13 Oct 2021
On The Transferability of Deep-Q Networks
M. Sabatelli
Pierre Geurts
37
2
0
06 Oct 2021
Adaptive control of a mechatronic system using constrained residual reinforcement learning
Tom Staessens
Tom Lefebvre
Guillaume Crevecoeur
22
16
0
06 Oct 2021
Dr Jekyll and Mr Hyde: the Strange Case of Off-Policy Policy Updates
Romain Laroche
Rémi Tachet des Combes
46
8
0
29 Sep 2021
Improved Soft Actor-Critic: Mixing Prioritized Off-Policy Samples with On-Policy Experience
C. Banerjee
Zhiyong Chen
N. Noman
19
30
0
24 Sep 2021
Optimal Actor-Critic Policy with Optimized Training Datasets
C. Banerjee
Zhiyong Chen
N. Noman
M. Zamani
OffRL
35
7
0
16 Aug 2021
Deep Reinforcement Learning for Demand Driven Services in Logistics and Transportation Systems: A Survey
Zefang Zong
Tao Feng
Tong Xia
Depeng Jin
Yong Li
27
3
0
10 Aug 2021