IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

5 February 2018
L. Espeholt
Hubert Soyer
Rémi Munos
Karen Simonyan
Volodymyr Mnih
Tom Ward
Yotam Doron
Vlad Firoiu
Tim Harley
Iain Dunning
Shane Legg
Koray Kavukcuoglu

Papers citing "IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures"

50 / 1,000 papers shown
The Phenomenon of Policy Churn
Tom Schaul
André Barreto
John Quan
Georg Ostrovski
89
28
0
01 Jun 2022
Efficient Scheduling of Data Augmentation for Deep Reinforcement Learning
Byungchan Ko
Jungseul Ok
OnRL
110
5
0
01 Jun 2022
Byzantine-Robust Online and Offline Distributed Reinforcement Learning
Yiding Chen
Xuezhou Zhang
Kai Zhang
Mengdi Wang
Xiaojin Zhu
OffRL
135
18
0
01 Jun 2022
BRExIt: On Opponent Modelling in Expert Iteration
Daniel Hernández
Hendrik Baier
Michael Kaisers
65
2
0
31 May 2022
Reinforcement Learning with a Terminator
Guy Tennenholtz
Nadav Merlis
Lior Shani
Shie Mannor
Uri Shalit
Gal Chechik
Assaf Hallak
Gal Dalal
65
5
0
30 May 2022
Off-Beat Multi-Agent Reinforcement Learning
Wei Qiu
Weixun Wang
Rongpin Wang
Bo An
Yujing Hu
S. Obraztsova
Zinovi Rabinovich
Jianye Hao
Yingfeng Chen
Changjie Fan
OffRL
58
2
0
27 May 2022
History Compression via Language Models in Reinforcement Learning
Fabian Paischer
Thomas Adler
Vihang Patil
Angela Bitto-Nemling
Markus Holzleitner
Sebastian Lehner
Hamid Eghbalzadeh
Sepp Hochreiter
OffRL
AI4TS
123
46
0
24 May 2022
An Evaluation Study of Intrinsic Motivation Techniques applied to Reinforcement Learning over Hard Exploration Environments
Alain Andres
Esther Villar-Rodriguez
Javier Del Ser
69
9
0
23 May 2022
Learning Task-relevant Representations for Generalization via Characteristic Functions of Reward Sequence Distributions
Rui Yang
Jie Wang
Zijie Geng
Mingxuan Ye
Shuiwang Ji
Bin Li
Fengli Wu
OOD
82
22
0
20 May 2022
The Sufficiency of Off-Policyness and Soft Clipping: PPO is still Insufficient according to an Off-Policy Measure
Xing Chen
Dongcui Diao
Hechang Chen
Hengshuai Yao
Haiyin Piao
Zhixiao Sun
Zhiwei Yang
Randy Goebel
Bei Jiang
Yi-Ju Chang
OffRL
144
9
0
20 May 2022
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
217
827
0
12 May 2022
Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning
Shuhan Qi
Shuhao Zhang
Xiaohan Hou
Jia-jia Zhang
Xinyu Wang
Jing Xiao
54
0
0
11 May 2022
Interactive Grounded Language Understanding in a Collaborative Environment: IGLU 2021
Julia Kiseleva
Ziming Li
Mohammad Aliannejadi
Shrestha Mohanty
Maartje ter Hoeve
...
I. Churin
Putra Manggala
Kata Naszádi
Michiel van der Meer
Taewoon Kim
LLMAG
94
30
0
05 May 2022
Collaborative Target Search with a Visual Drone Swarm: An Adaptive Curriculum Embedded Multistage Reinforcement Learning Approach
Jiaping Xiao
Phumrapee Pisutsin
Mir Feroskhan
110
23
0
26 Apr 2022
Graph Neural Network based Agent in Google Research Football
Yizhan Niu
Jinglong Liu
Yuhao Shi
Jiren Zhu
GNN
82
2
0
23 Apr 2022
Local Feature Swapping for Generalization in Reinforcement Learning
David Bertoin
Emmanuel Rachelson
OOD
81
15
0
13 Apr 2022
Dynamic Dialogue Policy for Continual Reinforcement Learning
Christian Geishauser
Carel van Niekerk
Nurul Lubis
Michael Heck
Hsien-chin Lin
Shutong Feng
Milica Gašić
CLL
OffRL
84
14
0
12 Apr 2022
When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?
Aviral Kumar
Joey Hong
Anika Singh
Sergey Levine
OffRL
117
84
0
12 Apr 2022
Semantic Exploration from Language Abstractions and Pretrained Representations
Allison C. Tam
Neil C. Rabinowitz
Andrew Kyle Lampinen
Nicholas A. Roy
Stephanie C. Y. Chan
D. Strouse
Jane X. Wang
Andrea Banino
Felix Hill
LM&Ro
134
70
0
08 Apr 2022
Federated Reinforcement Learning with Environment Heterogeneity
Hao Jin
Yang Peng
Wenhao Yang
Shusen Wang
Zhihua Zhang
102
79
0
06 Apr 2022
Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors
Steven Bohez
S. Tunyasuvunakool
Philemon Brakel
Fereshteh Sadeghi
Leonard Hasenclever
...
Nathan Batchelor
Federico Casarini
J. Merel
R. Hadsell
N. Heess
98
51
0
31 Mar 2022
PerfectDou: Dominating DouDizhu with Perfect Information Distillation
Yang Guan
Minghuan Liu
Weijun Hong
Weinan Zhang
Fei Fang
Guangjun Zeng
Yue Lin
119
28
0
30 Mar 2022
Marginalized Operators for Off-policy Reinforcement Learning
Yunhao Tang
Mark Rowland
Rémi Munos
Michal Valko
OffRL
70
0
0
30 Mar 2022
Asynchronous Reinforcement Learning for Real-Time Control of Physical Robots
Yufeng Yuan
Rupam Mahmood
OffRL
110
19
0
23 Mar 2022
Insights From the NeurIPS 2021 NetHack Challenge
Eric Hambro
Sharada Mohanty
Dmitrii Babaev
Mi-Ra Byeon
Dipam Chakraborty
...
Dan Rothermel
Mikayel Samvelyan
Dmitry Sorokin
Maciej Sypetkowski
Michal Sypetkowski
75
19
0
22 Mar 2022
Tactile Pose Estimation and Policy Learning for Unknown Object Manipulation
Tarik Kelestemur
Robert Platt
T. Padır
71
32
0
21 Mar 2022
Symmetry-Based Representations for Artificial and Biological General Intelligence
I. Higgins
S. Racanière
Danilo Jimenez Rezende
AI4CE
97
46
0
17 Mar 2022
Zipfian environments for Reinforcement Learning
Stephanie C. Y. Chan
Andrew Kyle Lampinen
Pierre Harvey Richemond
Felix Hill
OffRL
125
15
0
15 Mar 2022
Switch Trajectory Transformer with Distributional Value Approximation for Multi-Task Reinforcement Learning
Qinjie Lin
Han Liu
B. Sengupta
OffRL
74
12
0
14 Mar 2022
Temporal Difference Learning for Model Predictive Control
Nicklas Hansen
Xiaolong Wang
H. Su
PINN
MU
103
256
0
09 Mar 2022
The Unsurprising Effectiveness of Pre-Trained Vision Models for Control
Simone Parisi
Aravind Rajeswaran
Senthil Purushwalkam
Abhinav Gupta
LM&Ro
132
198
0
07 Mar 2022
Hierarchically Structured Scheduling and Execution of Tasks in a Multi-Agent Environment
Diogo S. Carvalho
B. Sengupta
63
2
0
06 Mar 2022
AutoDIME: Automatic Design of Interesting Multi-Agent Environments
I. Kanitscheider
Harrison Edwards
58
0
0
04 Mar 2022
Avalanche RL: a Continual Reinforcement Learning Library
Nicolo Lucchesi
Antonio Carta
Vincenzo Lomonaco
Davide Bacciu
82
6
0
28 Feb 2022
Collaborative Training of Heterogeneous Reinforcement Learning Agents in Environments with Sparse Rewards: What and When to Share?
Alain Andres
Esther Villar-Rodriguez
Javier Del Ser
84
9
0
24 Feb 2022
Improving Intrinsic Exploration with Language Abstractions
Jesse Mu
Victor Zhong
Roberta Raileanu
Minqi Jiang
Noah D. Goodman
Tim Rocktäschel
Edward Grefenstette
177
66
0
17 Feb 2022
MineRL Diamond 2021 Competition: Overview, Results, and Lessons Learned
Anssi Kanervisto
Stephanie Milani
Karolis Ramanauskas
Nicholay Topin
Zichuan Lin
...
François Fleuret
Alexander Nikulin
Yury Belousov
Oleg Svidchenko
A. Shpilman
OffRL
127
33
0
17 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
94
2
0
15 Feb 2022
Compute Trends Across Three Eras of Machine Learning
J. Sevilla
Lennart Heim
A. Ho
T. Besiroglu
Marius Hobbhahn
Pablo Villalobos
116
280
0
11 Feb 2022
A Modern Self-Referential Weight Matrix That Learns to Modify Itself
Kazuki Irie
Imanol Schlag
Róbert Csordás
Jürgen Schmidhuber
50
28
0
11 Feb 2022
PRIMA: Planner-Reasoner Inside a Multi-task Reasoning Agent
Daoming Lyu
Bo Liu
Jianshu Chen
LRM
78
1
0
01 Feb 2022
Accelerating Deep Reinforcement Learning for Digital Twin Network Optimization with Evolutionary Strategies
Carlos Güemes-Palau
Paul Almasan
Shihan Xiao
Xiangle Cheng
Xiang Shi
Pere Barlet-Ros
A. Cabellos-Aparicio
58
9
0
01 Feb 2022
You May Not Need Ratio Clipping in PPO
Mingfei Sun
Vitaly Kurin
Guoqing Liu
Sam Devlin
Tao Qin
Katja Hofmann
Shimon Whiteson
62
16
0
31 Jan 2022
DeepRNG: Towards Deep Reinforcement Learning-Assisted Generative Testing of Software
Chuan-Yung Tsai
Graham W. Taylor
28
3
0
29 Jan 2022
Efficient Embedding of Semantic Similarity in Control Policies via Entangled Bisimulation
Martín Bertrán
Walter A. Talbott
Nitish Srivastava
J. Susskind
93
3
0
28 Jan 2022
Leveraging class abstraction for commonsense reinforcement learning via residual policy gradient methods
Niklas Höpner
Ilaria Tiddi
H. V. Hoof
61
3
0
28 Jan 2022
Chaining Value Functions for Off-Policy Learning
Simon Schmitt
John Shawe-Taylor
Hado van Hasselt
OffRL
54
3
0
17 Jan 2022
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning
Emanuel Metzenthin
Christian Bartz
Christoph Meinel
OffRL
66
2
0
13 Jan 2022
Automated Reinforcement Learning (AutoRL): A Survey and Open Problems
Jack Parker-Holder
Raghunandan Rajan
Xingyou Song
André Biedenkapp
Yingjie Miao
...
Vu-Linh Nguyen
Roberto Calandra
Aleksandra Faust
Frank Hutter
Marius Lindauer
AI4CE
116
107
0
11 Jan 2022
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models
Alexander Pan
Kush S. Bhatia
Jacob Steinhardt
128
184
0
10 Jan 2022