2003.13350
Cited By
Agent57: Outperforming the Atari Human Benchmark
30 March 2020
Adria Puigdomenech Badia
Bilal Piot
Steven Kapturowski
Pablo Sprechmann
Alex Vitvitskyi
Daniel Guo
Charles Blundell
OffRL
Papers citing
"Agent57: Outperforming the Atari Human Benchmark"
50 / 105 papers shown
Title
Adventurer: Exploration with BiGAN for Deep Reinforcement Learning
Yongshuai Liu
Xin Liu
GAN
103
2
0
24 Mar 2025
Adaptive Data Exploitation in Deep Reinforcement Learning
Mingqi Yuan
Bo Li
Xin Jin
Wenjun Zeng
OffRL
192
0
0
22 Jan 2025
Multi-Agent Quantum Reinforcement Learning using Evolutionary Optimization
Michael Kolle
Felix Topp
Thomy Phan
Philipp Altmann
Jonas Nusslein
Claudia Linnhoff-Popien
AI4CE
59
5
0
03 Jan 2025
BlendRL: A Framework for Merging Symbolic and Neural Policy Learning
Hikaru Shindo
Quentin Delfosse
Devendra Singh Dhami
Kristian Kersting
43
3
0
15 Oct 2024
Anytime Multi-Agent Path Finding with an Adaptive Delay-Based Heuristic
Thomy Phan
Benran Zhang
Shao-Hung Chan
Sven Koenig
AI4CE
28
0
0
06 Aug 2024
ARCLE: The Abstraction and Reasoning Corpus Learning Environment for Reinforcement Learning
Hosung Lee
Sejin Kim
Seungpil Lee
Sanha Hwang
Jihwan Lee
Byung-Jun Lee
Sundong Kim
LRM
37
8
0
30 Jul 2024
Proximal Policy Distillation
Giacomo Spigler
OffRL
28
1
0
21 Jul 2024
Random Latent Exploration for Deep Reinforcement Learning
Srinath Mahankali
Zhang-Wei Hong
Ayush Sekhari
Alexander Rakhlin
Pulkit Agrawal
33
3
0
18 Jul 2024
Frequency and Generalisation of Periodic Activation Functions in Reinforcement Learning
Augustine N. Mavor-Parker
Matthew J. Sargent
Caswell Barry
Lewis D. Griffin
Clare Lyle
47
2
0
09 Jul 2024
Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization
Liam Schramm
Abdeslam Boularias
25
1
0
07 Jul 2024
Simplifying Deep Temporal Difference Learning
Matteo Gallici
Mattie Fellows
Benjamin Ellis
B. Pou
Ivan Masmitja
Jakob Foerster
Mario Martin
OffRL
62
15
0
05 Jul 2024
When Do Skills Help Reinforcement Learning? A Theoretical Analysis of Temporal Abstractions
Zhening Li
Gabriel Poesia
Armando Solar-Lezama
OffRL
42
1
0
12 Jun 2024
Offline Imitation of Badminton Player Behavior via Experiential Contexts and Brownian Motion
Kuang-Da Wang
Wei-Yao Wang
Ping-Chun Hsieh
Wenjie Peng
OffRL
34
0
0
19 Mar 2024
Multi-agent Reinforcement Learning: A Comprehensive Survey
Dom Huh
Prasant Mohapatra
AI4CE
36
8
0
15 Dec 2023
An Invitation to Deep Reinforcement Learning
Bernhard Jaeger
Andreas Geiger
OffRL
OOD
78
5
0
13 Dec 2023
Neuro-Inspired Fragmentation and Recall to Overcome Catastrophic Forgetting in Curiosity
Jaedong Hwang
Zhang-Wei Hong
Eric Chen
Akhilan Boopathy
Pulkit Agrawal
Ila Fiete
CLL
35
5
0
26 Oct 2023
LESSON: Learning to Integrate Exploration Strategies for Reinforcement Learning via an Option Framework
Woojun Kim
Jeonghye Kim
Young-Jin Sung
22
5
0
05 Oct 2023
MIMEx: Intrinsic Rewards from Masked Input Modeling
Toru Lin
Allan Jabri
OffRL
28
6
0
15 May 2023
Supplementing Gradient-Based Reinforcement Learning with Simple Evolutionary Ideas
H. Khadilkar
27
0
0
10 May 2023
Dynamic Update-to-Data Ratio: Minimizing World Model Overfitting
Nicolai Dorka
Tim Welschehold
Wolfram Burgard
16
3
0
17 Mar 2023
Mastering Strategy Card Game (Legends of Code and Magic) via End-to-End Policy and Optimistic Smooth Fictitious Play
Wei Xi
Yongxin Zhang
Changnan Xiao
Xuefeng Huang
Shihong Deng
Haowei Liang
Jie Chen
Peng Sun
OffRL
50
8
0
07 Mar 2023
The Ladder in Chaos: A Simple and Effective Improvement to General DRL Algorithms by Policy Path Trimming and Boosting
Hongyao Tang
Hao Fei
Jianye Hao
23
1
0
02 Mar 2023
Backstepping Temporal Difference Learning
Han-Dong Lim
Dong-hwan Lee
OffRL
23
2
0
20 Feb 2023
TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play
Fanqing Lin
Shiyu Huang
Tim Pearce
Wenze Chen
Weijuan Tu
26
17
0
15 Feb 2023
Diversity Through Exclusion (DTE): Niche Identification for Reinforcement Learning through Value-Decomposition
P. Sunehag
A. Vezhnevets
Edgar A. Duéñez-Guzmán
Igor Mordatch
Joel Z Leibo
26
2
0
02 Feb 2023
Anti-Exploration by Random Network Distillation
Alexander Nikulin
Vladislav Kurenkov
Denis Tarasov
Sergey Kolesnikov
38
24
0
31 Jan 2023
Sample Efficient Deep Reinforcement Learning via Local Planning
Dong Yin
S. Thiagarajan
N. Lazić
Nived Rajaraman
Botao Hao
Csaba Szepesvári
25
4
0
29 Jan 2023
Self-Motivated Multi-Agent Exploration
Shaowei Zhang
Jiahan Cao
Lei Yuan
Yang Yu
De-Chuan Zhan
47
5
0
05 Jan 2023
On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective
Ying Wen
Bo Liu
M. Zhou
Shufang Hou
Zhe Cao
Chenyang Le
Jingxiao Chen
Zheng Tian
Weinan Zhang
Jun Wang
AI4CE
23
10
0
24 Dec 2022
Convergence Analysis for Training Stochastic Neural Networks via Stochastic Gradient Descent
Richard Archibald
F. Bao
Yanzhao Cao
Hui‐Jie Sun
52
2
0
17 Dec 2022
Generalizing LTL Instructions via Future Dependent Options
Duo Xu
Faramarz Fekri
OffRL
AI4CE
24
1
0
08 Dec 2022
Actively Learning Costly Reward Functions for Reinforcement Learning
André Eberhard
Houssam Metni
G. Fahland
A. Stroh
Pascal Friederich
OffRL
35
0
0
23 Nov 2022
Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function
Clément Bonnet
Laurence Midgley
Alexandre Laterre
24
1
0
19 Nov 2022
Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments
Daniel Jarrett
Corentin Tallec
Florent Altché
Thomas Mesnard
Rémi Munos
Michal Valko
48
5
0
18 Nov 2022
Reservoir Computing via Quantum Recurrent Neural Networks
Samuel Yen-Chi Chen
D. Fry
Amol Deshmukh
V. Rastunkov
Charlee Stefanski
21
16
0
04 Nov 2022
Teacher-student curriculum learning for reinforcement learning
Yanick Schraner
OffRL
37
2
0
31 Oct 2022
On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning
Yifan Xu
Nicklas Hansen
Zirui Wang
Yung-Chieh Chan
H. Su
Z. Tu
OffRL
31
15
0
19 Oct 2022
Finite-Time Analysis of Asynchronous Q-learning under Diminishing Step-Size from Control-Theoretic View
Han-Dong Lim
Dong-hwan Lee
21
1
0
25 Jul 2022
Bayesian Generational Population-Based Training
Xingchen Wan
Cong Lu
Jack Parker-Holder
Philip J. Ball
Vu-Linh Nguyen
Binxin Ru
Michael A. Osborne
OffRL
31
15
0
19 Jul 2022
A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games
Zihan Ding
DiJia Su
Qinghua Liu
Chi Jin
33
3
0
18 Jul 2022
Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models
Alex Lamb
Riashat Islam
Yonathan Efroni
Aniket Didolkar
Dipendra Kumar Misra
Dylan J. Foster
Lekan Molu
Rajan Chari
A. Krishnamurthy
John Langford
41
24
0
17 Jul 2022
Associative Memory Based Experience Replay for Deep Reinforcement Learning
Mengyuan Li
Arman Kazemi
Ann Franchesca Laguna
Sharon Hu
VLM
16
8
0
16 Jul 2022
Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs
Benjamin Fuhrer
Yuval Shpigelman
Chen Tessler
Shie Mannor
Gal Chechik
E. Zahavi
Gal Dalal
25
4
0
05 Jul 2022
BYOL-Explore: Exploration by Bootstrapped Prediction
Z. Guo
S. Thakoor
Miruna Pislar
Bernardo Avila-Pires
Florent Altché
...
Yunhao Tang
Michal Valko
Rémi Munos
M. G. Azar
Bilal Piot
22
68
0
16 Jun 2022
Uniqueness and Complexity of Inverse MDP Models
Marcus Hutter
Steven Hansen
22
4
0
02 Jun 2022
Image Augmentation Based Momentum Memory Intrinsic Reward for Sparse Reward Visual Scenes
Zheng Fang
Biao Zhao
Guizhong Liu
16
2
0
19 May 2022
Automatically Learning Fallback Strategies with Model-Free Reinforcement Learning in Safety-Critical Driving Scenarios
Ugo Lecerf
Christelle Yemdji Tchassi
S. Aubert
Pietro Michiardi
21
0
0
11 Apr 2022
Semantic Exploration from Language Abstractions and Pretrained Representations
Allison C. Tam
Neil C. Rabinowitz
Andrew Kyle Lampinen
Nicholas A. Roy
Stephanie C. Y. Chan
D. Strouse
Jane X. Wang
Andrea Banino
Felix Hill
LM&Ro
36
67
0
08 Apr 2022
Robust Meta-Reinforcement Learning with Curriculum-Based Task Sampling
Morio Matsumoto
Hiroya Matsuba
Toshihiro Kujirai
16
2
0
31 Mar 2022
Fast and Data Efficient Reinforcement Learning from Pixels via Non-Parametric Value Approximation
Alex Long
Alan Blair
H. V. Hoof
23
3
0
07 Mar 2022