1602.01783
Cited By
v1
v2 (latest)
Asynchronous Methods for Deep Reinforcement Learning
4 February 2016
Volodymyr Mnih
Adria Puigdomenech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
Papers citing "Asynchronous Methods for Deep Reinforcement Learning"
50 / 3,591 papers shown
Title
Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL
71
9
0
23 Feb 2022
Coordinate-Aligned Multi-Camera Collaboration for Active Multi-Object Tracking
Zeyu Fang
Jian Zhao
Mingyu Yang
Wen-gang Zhou
Zhenbo Lu
Houqiang Li
80
10
0
22 Feb 2022
A Globally Convergent Evolutionary Strategy for Stochastic Constrained Optimization with Applications to Reinforcement Learning
Youssef Diouane
Aurelien Lucchi
Vihang Patil
84
3
0
21 Feb 2022
Cyber-Physical Defense in the Quantum Era
Michel Barbeau
Joaquín García
59
10
0
21 Feb 2022
Black-box Node Injection Attack for Graph Neural Networks
Mingxuan Ju
Yujie Fan
Yanfang Ye
Liang Zhao
AAML
121
2
0
18 Feb 2022
Open-Ended Reinforcement Learning with Neural Reward Functions
Robert Meier
Asier Mujika
101
7
0
16 Feb 2022
Policy Learning and Evaluation with Randomized Quasi-Monte Carlo
Sébastien M. R. Arnold
P. L'Écuyer
Liyu Chen
Yi-fan Chen
Fei Sha
OffRL
84
4
0
16 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
94
2
0
15 Feb 2022
One Step at a Time: Long-Horizon Vision-and-Language Navigation with Milestones
Chan Hee Song
Jihyung Kil
Tai-Yu Pan
Brian M. Sadler
Wei-Lun Chao
Yu-Chuan Su
LRM
80
33
0
14 Feb 2022
QuadSim: A Quadcopter Rotational Dynamics Simulation Framework For Reinforcement Learning Algorithms
Burak Han Demirbilek
24
0
0
14 Feb 2022
On the Convergence of SARSA with Linear Function Approximation
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
74
11
0
14 Feb 2022
Sequential Bayesian experimental designs via reinforcement learning
Hikaru Asano
OffRL
80
0
0
14 Feb 2022
Autonomous Drone Swarm Navigation and Multi-target Tracking in 3D Environments with Dynamic Obstacles
Suleman Qamar
Dr. Saddam Hussain Khan
Muhammad Arif Arshad
Maryam Qamar
Asifullah Khan
62
16
0
13 Feb 2022
A Unified Perspective on Value Backup and Exploration in Monte-Carlo Tree Search
Tuan Dam
Carlo D'Eramo
Jan Peters
Joni Pajarinen
45
1
0
11 Feb 2022
Online Decision Transformer
Qinqing Zheng
Amy Zhang
Aditya Grover
OffRL
93
209
0
11 Feb 2022
Robots Learn Increasingly Complex Tasks with Intrinsic Motivation and Automatic Curriculum Learning
S. Nguyen
Nicolas Duminy
A. Manoury
D. Duhaut
Cédric Buche
59
8
0
11 Feb 2022
Uncovering Instabilities in Variational-Quantum Deep Q-Networks
Maja Franz
Lucas Wolf
Maniraman Periyasamy
Christian Ufrecht
Daniel D. Scherer
Axel Plinge
Christopher Mutschler
Wolfgang Mauerer
131
30
0
10 Feb 2022
Group-Agent Reinforcement Learning
Kaiyue Wu
Xiaoming Zeng
OOD
OffRL
37
3
0
10 Feb 2022
Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory
Ruiqi Zhang
Xuezhou Zhang
Chengzhuo Ni
Mengdi Wang
OffRL
90
16
0
10 Feb 2022
skrl: Modular and Flexible Library for Reinforcement Learning
Antonio Serrano-Muñoz
D. Chrysostomou
Simon Boegh
N. Arana-Arexolaleiba
89
31
0
08 Feb 2022
What's Cracking? A Review and Analysis of Deep Learning Methods for Structural Crack Segmentation, Detection and Quantification
Jacob König
M. Jenkins
M. Mannion
P. Barrie
Gordon Morison
110
14
0
08 Feb 2022
Multi-Agent Path Finding with Prioritized Communication Learning
Wenhao Li
Hongjun Chen
Bo Jin
Wenzhe Tan
Hong Zha
Xiangfeng Wang
AI4CE
63
32
0
08 Feb 2022
Attacking c-MARL More Effectively: A Data Driven Approach
Nhan H. Pham
Lam M. Nguyen
Jie Chen
Hoang Thanh Lam
Subhro Das
Tsui-Wei Weng
AAML
99
2
0
07 Feb 2022
Red Teaming Language Models with Language Models
Ethan Perez
Saffron Huang
Francis Song
Trevor Cai
Roman Ring
John Aslanides
Amelia Glaese
Nat McAleese
G. Irving
AAML
226
672
0
07 Feb 2022
Exploration with Multi-Sample Target Values for Distributional Reinforcement Learning
Michael Teng
M. van de Panne
Frank Wood
OOD
OffRL
39
1
0
06 Feb 2022
A Temporal-Difference Approach to Policy Gradient Estimation
Samuele Tosatto
Andrew Patterson
Martha White
A. R. Mahmood
OffRL
118
2
0
04 Feb 2022
Meta-Reinforcement Learning with Self-Modifying Networks
Mathieu Chalvidal
Thomas Serre
Rufin VanRullen
KELM
87
5
0
04 Feb 2022
A Survey on Safety-Critical Driving Scenario Generation -- A Methodological Perspective
Wenhao Ding
Chejian Xu
Mansur Arief
Hao-ming Lin
Yue Liu
Ding Zhao
119
163
0
04 Feb 2022
Pre-Trained Language Models for Interactive Decision-Making
Shuang Li
Xavier Puig
Chris Paxton
Yilun Du
Clinton Jia Wang
...
Anima Anandkumar
Jacob Andreas
Igor Mordatch
Antonio Torralba
Yuke Zhu
LM&Ro
133
264
0
03 Feb 2022
ExPoSe: Combining State-Based Exploration with Gradient-Based Online Search
Dixant Mittal
Siddharth Aravindan
W. Lee
OnRL
48
3
0
03 Feb 2022
Reinforcement learning of optimal active particle navigation
Mahdi Nasiri
B. Liebchen
80
26
0
01 Feb 2022
Accelerating Deep Reinforcement Learning for Digital Twin Network Optimization with Evolutionary Strategies
Carlos Güemes-Palau
Paul Almasan
Shihan Xiao
Xiangle Cheng
Xiang Shi
Pere Barlet-Ros
A. Cabellos-Aparicio
56
9
0
01 Feb 2022
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
Matilde Gargiani
Andrea Zanelli
Andrea Martinelli
Tyler H. Summers
John Lygeros
76
14
0
01 Feb 2022
Warmth and competence in human-agent cooperation
Kevin R. McKee
Xuechunzi Bai
S. Fiske
116
27
0
31 Jan 2022
Trajectory balance: Improved credit assignment in GFlowNets
Nikolay Malkin
Moksh Jain
Emmanuel Bengio
Chen Sun
Yoshua Bengio
272
186
0
31 Jan 2022
Zeroth-Order Actor-Critic: An Evolutionary Framework for Sequential Decision Problems
Yuheng Lei
Jianyu Chen
Guojian Zhan
Tao Zhang
Jiangtao Li
Jianyu Chen
Shengbo Eben Li
Sifa Zheng
OffRL
82
3
0
29 Jan 2022
Explaining Reinforcement Learning Policies through Counterfactual Trajectories
Julius Frost
Olivia Watkins
Eric Weiner
Pieter Abbeel
Trevor Darrell
Bryan A. Plummer
Kate Saenko
OffRL
84
6
0
29 Jan 2022
Do You Need the Entropy Reward (in Practice)?
Haonan Yu
Haichao Zhang
Wei Xu
85
8
0
28 Jan 2022
Discovering Exfiltration Paths Using Reinforcement Learning with Attack Graphs
Tyler Cody
Abdul Rahman
Christopher Redino
Lanxiao Huang
Ryan Clark
Akshay Kakkar
Deepak Kushwaha
Paul Park
Peter A. Beling
Edward Bowen
68
14
0
28 Jan 2022
A Regret Minimization Approach to Multi-Agent Control
Udaya Ghai
Udari Madhushani
Naomi Ehrich Leonard
Elad Hazan
62
5
0
28 Jan 2022
Leveraging class abstraction for commonsense reinforcement learning via residual policy gradient methods
Niklas Höpner
Ilaria Tiddi
H. V. Hoof
61
3
0
28 Jan 2022
Modeling Human Exploration Through Resource-Rational Reinforcement Learning
Marcel Binz
Eric Schulz
84
15
0
27 Jan 2022
Generative Adversarial Exploration for Reinforcement Learning
Weijun Hong
Menghui Zhu
Minghuan Liu
Weinan Zhang
Ming Zhou
Yong Yu
Peng Sun
OnRL
68
7
0
27 Jan 2022
Reinforcement Learning-Empowered Mobile Edge Computing for 6G Edge Intelligence
Pengjin Wei
Kun Guo
Ye Li
Jue Wang
W. Feng
Shi Jin
Ning Ge
Ying-Chang Liang
104
46
0
27 Jan 2022
DNNFuser: Generative Pre-Trained Transformer as a Generalized Mapper for Layer Fusion in DNN Accelerators
Sheng-Chun Kao
Xiaoyu Huang
T. Krishna
AI4CE
98
9
0
26 Jan 2022
Learning Invariable Semantical Representation from Language for Extensible Policy Generalization
Yihan Li
Jinsheng Ren
Tianrun Xu
Tianren Zhang
Haichuan Gao
Feng Chen
53
1
0
26 Jan 2022
Online Attentive Kernel-Based Temporal Difference Learning
Guang Yang
Xingguo Chen
Shangdong Yang
Huihui Wang
Shaokang Dong
Yang Gao
OffRL
28
3
0
22 Jan 2022
Environment Generation for Zero-Shot Compositional Reinforcement Learning
Izzeddin Gur
Natasha Jaques
Yingjie Miao
Jongwook Choi
Manoj Kumar Tiwari
Honglak Lee
Aleksandra Faust
91
43
0
21 Jan 2022
Reinforcement Learning for Personalized Drug Discovery and Design for Complex Diseases: A Systems Pharmacology Perspective
Ryan K. Tan
Yang Liu
Lei Xie
77
2
0
21 Jan 2022
Instance-Dependent Confidence and Early Stopping for Reinforcement Learning
K. Khamaru
Eric Xia
Martin J. Wainwright
Michael I. Jordan
90
6
0
21 Jan 2022