Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1805.11593
Cited By
Observe and Look Further: Achieving Consistent Performance on Atari
29 May 2018
Tobias Pohlen
Bilal Piot
Todd Hester
M. G. Azar
Dan Horgan
David Budden
Gabriel Barth-Maron
H. V. Hasselt
John Quan
Mel Vecerík
Matteo Hessel
Rémi Munos
Olivier Pietquin
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Observe and Look Further: Achieving Consistent Performance on Atari"
38 / 38 papers shown
Title
ReZero: Boosting MCTS-based Algorithms by Backward-view and Entire-buffer Reanalyze
Chunyu Xuan
Yazhe Niu
Yuan Pu
Shuai Hu
Yu Liu
Jing Yang
73
0
0
03 Jan 2025
Enhancing Reinforcement Learning Agents with Local Guides
Paul Daoudi
Bogdan Robu
Christophe Prieur
Ludovic Dos Santos
M. Barlier
OnRL
31
3
0
21 Feb 2024
Policy-Based Self-Competition for Planning Problems
Jonathan Pirnay
Q. Göttl
Jakob Burger
D. G. Grimm
46
3
0
07 Jun 2023
Robust Scheduling with GFlowNets
David W. Zhang
Corrado Rainone
M. Peschl
Roberto Bondesan
34
51
0
17 Jan 2023
MAN: Multi-Action Networks Learning
Keqin Wang
Alison Bartsch
A. Farimani
21
3
0
19 Sep 2022
Learning Dynamics and Generalization in Reinforcement Learning
Clare Lyle
Mark Rowland
Will Dabney
Marta Z. Kwiatkowska
Y. Gal
OOD
OffRL
30
12
0
05 Jun 2022
Deep Q-learning: a robust control approach
B. Varga
Balázs Kulcsár
M. Chehreghani
OOD
30
9
0
21 Jan 2022
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization
Aviral Kumar
Rishabh Agarwal
Tengyu Ma
Aaron Courville
George Tucker
Sergey Levine
OffRL
31
65
0
09 Dec 2021
Self-Consistent Models and Values
Roy Miles
Kate Baumli
Zita Marinho
Angelos Filos
Matteo Hessel
Hado van Hasselt
David Silver
38
8
0
25 Oct 2021
On Bonus-Based Exploration Methods in the Arcade Learning Environment
Adrien Ali Taïga
W. Fedus
Marlos C. Machado
Aaron Courville
Marc G. Bellemare
24
58
0
22 Sep 2021
Convergent and Efficient Deep Q Network Algorithm
Zhikang T. Wang
Masahito Ueda
27
12
0
29 Jun 2021
Taylor Expansion of Discount Factors
Yunhao Tang
Mark Rowland
Rémi Munos
Michal Valko
OffRL
32
5
0
11 Jun 2021
Regularized Behavior Value Estimation
Çağlar Gülçehre
Sergio Gomez Colmenarejo
Ziyun Wang
Jakub Sygnowski
T. Paine
Konrad Zolna
Yutian Chen
Matthew W. Hoffman
Razvan Pascanu
Nando de Freitas
OffRL
31
37
0
17 Mar 2021
A Survey on Deep Reinforcement Learning for Audio-Based Applications
S. Latif
Heriberto Cuayáhuitl
Farrukh Pervez
Fahad Shamshad
Hafiz Shehbaz Ali
Min Zhang
OffRL
60
73
0
01 Jan 2021
Offline Learning from Demonstrations and Unlabeled Experience
Konrad Zolna
Alexander Novikov
Ksenia Konyushkova
Çağlar Gülçehre
Ziyun Wang
Y. Aytar
Misha Denil
Nando de Freitas
Scott E. Reed
SSL
OffRL
32
67
0
27 Nov 2020
Learning Abstract Models for Strategic Exploration and Fast Reward Transfer
E. Liu
Ramtin Keramati
Sudarshan Seshadri
Kelvin Guu
Panupong Pasupat
Emma Brunskill
Percy Liang
OffRL
27
5
0
12 Jul 2020
Rinascimento: using event-value functions for playing Splendor
Ivan Bravi
Simon Lucas
27
2
0
10 Jun 2020
Acme: A Research Framework for Distributed Reinforcement Learning
Matthew W. Hoffman
Bobak Shahriari
John Aslanides
Gabriel Barth-Maron
Nikola Momchev
...
Srivatsan Srinivasan
A. Cowie
Ziyun Wang
Bilal Piot
Nando de Freitas
65
225
0
01 Jun 2020
Agent57: Outperforming the Atari Human Benchmark
Adria Puigdomenech Badia
Bilal Piot
Steven Kapturowski
Pablo Sprechmann
Alex Vitvitskyi
Daniel Guo
Charles Blundell
OffRL
29
510
0
30 Mar 2020
A Survey of Deep Reinforcement Learning in Video Games
Kun Shao
Zhentao Tang
Yuanheng Zhu
Nannan Li
Dongbin Zhao
OffRL
AI4TS
43
188
0
23 Dec 2019
Integrating Behavior Cloning and Reinforcement Learning for Improved Performance in Dense and Sparse Reward Environments
Vinicius G. Goecks
Gregory M. Gremillion
Vernon J. Lawhern
J. Valasek
Nicholas R. Waytowich
OffRL
19
31
0
09 Oct 2019
I'm sorry Dave, I'm afraid I can't do that, Deep Q-learning from forbidden action
Mathieu Seurin
Philippe Preux
Olivier Pietquin
18
12
0
04 Oct 2019
Task-Relevant Adversarial Imitation Learning
Konrad Zolna
Scott E. Reed
Alexander Novikov
Sergio Gomez Colmenarejo
David Budden
Serkan Cabi
Misha Denil
Nando de Freitas
Ziyun Wang
GAN
28
61
0
02 Oct 2019
Scaling data-driven robotics with reward sketching and batch reinforcement learning
Serkan Cabi
Sergio Gomez Colmenarejo
Alexander Novikov
Ksenia Konyushkova
Scott E. Reed
...
David Barker
Jonathan Scholz
Misha Denil
Nando de Freitas
Ziyun Wang
OffRL
28
29
0
26 Sep 2019
Is Deep Reinforcement Learning Really Superhuman on Atari? Leveling the playing field
Marin Toromanoff
É. Wirbel
Fabien Moutarde
OffRL
24
25
0
13 Aug 2019
Attentive Multi-Task Deep Reinforcement Learning
Timo Bram
Gino Brunner
Oliver Richter
Roger Wattenhofer
CLL
23
18
0
05 Jul 2019
From semantics to execution: Integrating action planning with reinforcement learning for robotic causal problem-solving
Manfred Eppe
Phuong D. H. Nguyen
S. Wermter
25
41
0
23 May 2019
Learning and Exploiting Multiple Subgoals for Fast Exploration in Hierarchical Reinforcement Learning
Libo Xing
16
4
0
13 May 2019
World Discovery Models
M. G. Azar
Bilal Piot
Bernardo Avila-Pires
Jean-Bastien Grill
Florent Altché
Rémi Munos
21
26
0
20 Feb 2019
Certified Reinforcement Learning with Logic Guidance
Mohammadhosein Hasanbeig
Daniel Kroening
Alessandro Abate
21
53
0
02 Feb 2019
Go-Explore: a New Approach for Hard-Exploration Problems
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
AI4TS
24
363
0
30 Jan 2019
Ablation Studies in Artificial Neural Networks
Richard Meyes
Melanie Lu
Constantin Waubert de Puiseau
Tobias Meisen
16
210
0
24 Jan 2019
Learning Montezuma's Revenge from a Single Demonstration
Tim Salimans
Richard J. Chen
42
136
0
08 Dec 2018
Sample Efficient Adaptive Text-to-Speech
Yutian Chen
Yannis Assael
Brendan Shillingford
David Budden
Scott E. Reed
...
Ben Laurie
Çağlar Gülçehre
Aaron van den Oord
Oriol Vinyals
Nando de Freitas
35
149
0
27 Sep 2018
Expert-augmented actor-critic for ViZDoom and Montezumas Revenge
Michal Garmulewicz
Henryk Michalewski
Piotr Milos
16
8
0
10 Sep 2018
RUDDER: Return Decomposition for Delayed Rewards
Jose A. Arjona-Medina
Michael Gillhofer
Michael Widrich
Thomas Unterthiner
Johannes Brandstetter
Sepp Hochreiter
30
213
0
20 Jun 2018
Playing hard exploration games by watching YouTube
Y. Aytar
Tobias Pfaff
David Budden
T. Paine
Ziyun Wang
Nando de Freitas
35
269
0
29 May 2018
Constrained Policy Improvement for Safe and Efficient Reinforcement Learning
Elad Sarafian
Aviv Tamar
Sarit Kraus
OffRL
32
11
0
20 May 2018
1