Asynchronous Methods for Deep Reinforcement Learning
v1, v2 (latest)

4 February 2016
Volodymyr Mnih
Adria Puigdomenech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu

Papers citing "Asynchronous Methods for Deep Reinforcement Learning"

50 / 3,591 papers shown
Title
LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions
Yu Wang
Jiayi Liu
Yuxiang Liu
Jun Hao
Yang He
Jinghe Hu
Weipeng P. Yan
Mantian Li
74
19
0
18 Aug 2017
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger C. Grosse
Jimmy Ba
OffRL
152
631
0
17 Aug 2017
StarCraft II: A New Challenge for Reinforcement Learning
Oriol Vinyals
T. Ewalds
Sergey Bartunov
Petko Georgiev
A. Vezhnevets
...
Anthony Brunasso
David Lawrence
Anders Ekermo
J. Repp
Rodney Tsing
119
876
0
16 Aug 2017
Deep Object-Centric Representations for Generalizable Robot Learning
Coline Devin
Pieter Abbeel
Trevor Darrell
Sergey Levine
SSL, OCL
113
108
0
14 Aug 2017
Deep Reinforcement Learning for High Precision Assembly Tasks
Tadanobu Inoue
Giovanni De Magistris
Asim Munawar
T. Yokoya
Ryuki Tachibana
108
268
0
14 Aug 2017
Belief Tree Search for Active Object Recognition
Mohsen Malmir
G. Cottrell
32
7
0
13 Aug 2017
A Machine Learning Approach to Routing
Asaf Valadarsky
Michael Schapira
Dafna Shahaf
Aviv Tamar
71
38
0
10 Aug 2017
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning
Anusha Nagabandi
G. Kahn
R. Fearing
Sergey Levine
150
977
0
08 Aug 2017
An Information-Theoretic Optimality Principle for Deep Reinforcement Learning
Felix Leibfried
Jordi Grau-Moya
Haitham Bou-Ammar
101
24
0
06 Aug 2017
The UMD Neural Machine Translation Systems at WMT17 Bandit Learning Task
Amr Sharaf
Shi Feng
Khanh Nguyen
Kianté Brantley
Hal Daumé
31
4
0
03 Aug 2017
Grounding Language for Transfer in Deep Reinforcement Learning
Karthik Narasimhan
Regina Barzilay
Tommi Jaakkola
LM&Ro, OffRL
108
25
0
01 Aug 2017
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
I. Higgins
Arka Pal
Andrei A. Rusu
Loic Matthey
Christopher P. Burgess
Alexander Pritzel
M. Botvinick
Charles Blundell
Alexander Lerchner
DRL
173
417
0
26 Jul 2017
Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback
Khanh Nguyen
Hal Daumé
Jordan L. Boyd-Graber
100
138
0
24 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
698
19,363
0
20 Jul 2017
Imagination-Augmented Agents for Deep Reinforcement Learning
T. Weber
S. Racanière
David P. Reichert
Lars Buesing
A. Guez
...
Razvan Pascanu
Peter W. Battaglia
Demis Hassabis
David Silver
Daan Wierstra
LM&Ro
124
557
0
19 Jul 2017
On-line Building Energy Optimization using Deep Reinforcement Learning
Elena Mocanu
Decebal Constantin Mocanu
Phuong H. Nguyen
A. Liotta
M. Webber
M. Gibescu
J. Slootweg
OffRL
71
475
0
18 Jul 2017
Trial without Error: Towards Safe Reinforcement Learning via Human Intervention
William Saunders
Girish Sastry
Andreas Stuhlmuller
Owain Evans
OffRL
79
231
0
17 Jul 2017
Distral: Robust Multitask Reinforcement Learning
Yee Whye Teh
V. Bapst
Wojciech M. Czarnecki
John Quan
J. Kirkpatrick
R. Hadsell
N. Heess
Razvan Pascanu
219
554
0
13 Jul 2017
Learning Macromanagement in StarCraft from Replays using Deep Learning
Niels Justesen
S. Risi
103
68
0
12 Jul 2017
Value Prediction Network
Junhyuk Oh
Satinder Singh
Honglak Lee
100
335
0
11 Jul 2017
SCAN: Learning Hierarchical Compositional Visual Concepts
I. Higgins
Nicolas Sonnerat
Loic Matthey
Arka Pal
Christopher P. Burgess
Matko Bosnjak
Murray Shanahan
M. Botvinick
Demis Hassabis
Alexander Lerchner
OCL, DRL, CoGe
93
51
0
11 Jul 2017
The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously
Serkan Cabi
Sergio Gomez Colmenarejo
Matthew W. Hoffman
Misha Denil
Ziyun Wang
Nando de Freitas
83
31
0
11 Jul 2017
Emergence of Locomotion Behaviours in Rich Environments
N. Heess
TB Dhruva
S. Sriram
Jay Lemmon
J. Merel
...
Tom Erez
Ziyun Wang
S. M. Ali Eslami
Martin Riedmiller
David Silver
239
939
0
07 Jul 2017
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
91
107
0
06 Jul 2017
Optimal Vehicle Dispatching Schemes via Dynamic Pricing
Mengjing Chen
Weiran Shen
Pingzhong Tang
Song Zuo
36
8
0
06 Jul 2017
Maintaining cooperation in complex social dilemmas using deep reinforcement learning
Adam Lerer
A. Peysakhovich
129
160
0
04 Jul 2017
ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games
Yuandong Tian
Qucheng Gong
Wenling Shang
Yuxin Wu
C. L. Zitnick
OffRL
74
126
0
04 Jul 2017
Hashing over Predicted Future Frames for Informed Exploration of Deep Reinforcement Learning
Haiyan Yin
Jianda Chen
Sinno Jialin Pan
51
5
0
03 Jul 2017
Noisy Networks for Exploration
Meire Fortunato
M. G. Azar
Bilal Piot
Jacob Menick
Ian Osband
...
Rémi Munos
Demis Hassabis
Olivier Pietquin
Charles Blundell
Shane Legg
116
898
0
30 Jun 2017
A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem
Zhengyao Jiang
Dixing Xu
Jinjun Liang
OOD
92
350
0
30 Jun 2017
Neural Sequence Model Training via α-divergence Minimization
Sotetsu Koyamada
Yuta Kikuchi
Atsunori Kanemura
S. Maeda
S. Ishii
104
0
0
30 Jun 2017
Actor-Critic Sequence Training for Image Captioning
Li Zhang
Flood Sung
Feng Liu
Tao Xiang
S. Gong
Yongxin Yang
Timothy M. Hospedales
86
111
0
29 Jun 2017
Path Integral Networks: End-to-End Differentiable Optimal Control
Masashi Okada
Luca Rigazio
T. Aoshima
PINN
66
56
0
29 Jun 2017
Learning to Learn: Meta-Critic Networks for Sample Efficient Learning
Flood Sung
Li Zhang
Tao Xiang
Timothy M. Hospedales
Yongxin Yang
OffRL
81
129
0
29 Jun 2017
Neural SLAM: Learning to Explore with External Memory
Jingwei Zhang
L. Tai
Ming-Yuan Liu
Joschka Boedecker
Wolfram Burgard
106
71
0
29 Jun 2017
Count-Based Exploration in Feature Space for Reinforcement Learning
Jarryd Martin
S. N. Sasikumar
Tom Everitt
Marcus Hutter
76
124
0
25 Jun 2017
Gated-Attention Architectures for Task-Oriented Language Grounding
Devendra Singh Chaplot
Kanthashree Mysore Sathyendra
Rama Kumar Pasumarthi
Dheeraj Rajagopal
Ruslan Salakhutdinov
LM&Ro
86
280
0
22 Jun 2017
Observational Learning by Reinforcement Learning
Diana Borsa
Bilal Piot
Rémi Munos
Olivier Pietquin
OffRL
54
45
0
20 Jun 2017
Grounded Language Learning in a Simulated 3D World
Karl Moritz Hermann
Felix Hill
Simon Green
Fumin Wang
Ryan Faulkner
...
Denis Teplyashin
Marcus Wainwright
C. Apps
Demis Hassabis
Phil Blunsom
LM&Ro
100
306
0
20 Jun 2017
Dex: Incremental Learning for Complex Environments in Deep Reinforcement Learning
Nick Erickson
Qi Zhao
CLL, OffRL
422
2
0
19 Jun 2017
Value-Decomposition Networks For Cooperative Multi-Agent Learning
P. Sunehag
Guy Lever
A. Gruslys
Wojciech M. Czarnecki
V. Zambaldi
...
Marc Lanctot
Nicolas Sonnerat
Joel Z Leibo
K. Tuyls
T. Graepel
132
1,014
0
16 Jun 2017
Expected Policy Gradients
K. Ciosek
Shimon Whiteson
131
58
0
15 Jun 2017
Sobolev Training for Neural Networks
Wojciech M. Czarnecki
Simon Osindero
Max Jaderberg
G. Swirszcz
Razvan Pascanu
91
248
0
15 Jun 2017
Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics
Ken Kansky
Tom Silver
David A. Mély
Mohamed Eldawy
Miguel Lazaro-Gredilla
Xinghua Lou
N. Dorfman
Szymon Sidor
Scott Phoenix
Dileep George
AI4CE
125
236
0
14 Jun 2017
Hybrid Reward Architecture for Reinforcement Learning
H. V. Seijen
Mehdi Fatemi
Joshua Romoff
Romain Laroche
Tavian Barnes
Jeffrey Tsang
100
253
0
13 Jun 2017
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
230
3,388
0
12 Jun 2017
ACCNet: Actor-Coordinator-Critic Net for "Learning-to-Communicate" with Deep Multi-agent Reinforcement Learning
Hangyu Mao
Zhibo Gong
Yan Ni
Zhen Xiao
87
45
0
10 Jun 2017
Generalized Value Iteration Networks: Life Beyond Lattices
Sufeng Niu
Siheng Chen
Hanyu Guo
Colin Targonski
M. C. Smith
J. Kovacevic
GNN
78
56
0
08 Jun 2017
The Atari Grand Challenge Dataset
Vitaly Kurin
Sebastian Nowozin
Katja Hofmann
Lucas Beyer
Bastian Leibe
OffRL
86
45
0
31 May 2017
Non-Markovian Control with Gated End-to-End Memory Policy Networks
J. Perez
T. Silander
OffRL
67
6
0
31 May 2017