2011.04021
On the role of planning in model-based deep reinforcement learning
8 November 2020
Jessica B. Hamrick
A. Friesen
Feryal M. P. Behbahani
A. Guez
Fabio Viola
Sims Witherspoon
Thomas W. Anthony
Lars Buesing
Petar Veličković
T. Weber
OffRL
Papers citing
"On the role of planning in model-based deep reinforcement learning"
50 / 50 papers shown
Bootstrap Off-policy with World Model
Guojian Zhan
Likun Wang
Xiangteng Zhang
Jiaxin Gao
Masayoshi Tomizuka
Shengbo Eben Li
OffRL
OnRL
505
2
0
01 Nov 2025
Path Channels and Plan Extension Kernels: a Mechanistic Description of Planning in a Sokoban RNN
Mohammad Taufeeque
Aaron David Tucker
Adam Gleave
Adrià Garriga-Alonso
322
0
0
11 Jun 2025
Trust-Region Twisted Policy Improvement
Joery A. de Vries
Jinke He
Yaniv Oren
M. Spaan
OffRL
LRM
579
2
0
08 Apr 2025
Extendable Planning via Multiscale Diffusion
Chang Chen
Hany Hamed
Doojin Baek
Taegu Kang
Samyeul Noh
Yoshua Bengio
Sungjin Ahn
541
4
0
25 Mar 2025
On-line Policy Improvement using Monte-Carlo Search
Neural Information Processing Systems (NeurIPS), 1996
Gerald Tesauro
Gregory R. Galperin
475
276
0
09 Jan 2025
Demystifying MuZero Planning: Interpreting the Learned Model
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2024
Hung Guei
Yan-Ru Ju
Wei-Yu Chen
Tai-Lin Wu
329
2
0
07 Nov 2024
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning
Jiayu Chen
Wentse Chen
Shiyu Huang
Jeff Schneider
OffRL
501
8
0
15 Oct 2024
How to Choose a Reinforcement-Learning Algorithm
Fabian Bongratz
Vladimir Golkov
Lukas Mautner
Luca Della Libera
Frederik Heetmeyer
Felix Czaja
Julian Rodemann
Daniel Cremers
239
2
0
30 Jul 2024
Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs
Xuan Zhang
Chao Du
Tianyu Pang
Qian Liu
Wei Gao
Min Lin
LRM
AI4CE
341
136
0
13 Jun 2024
Learning to Play Atari in a World of Tokens
Pranav Agarwal
Sheldon Andrews
Samira Ebrahimi Kahou
OffRL
273
6
0
03 Jun 2024
Dynamic Model Predictive Shielding for Provably Safe Reinforcement Learning
Arko Banerjee
Kia Rahmani
Joydeep Biswas
Işıl Dillig
238
9
0
22 May 2024
How does the primate brain combine generative and discriminative computations in vision?
Benjamin Peters
J. DiCarlo
Todd Gureckis
Ralf Haefner
Leyla Isik
...
Kimberly Stachenfeld
Zenna Tavares
Doris Y. Tsao
Ilker Yildirim
N. Kriegeskorte
274
8
0
11 Jan 2024
Simple Hierarchical Planning with Diffusion
Chang Chen
Fei Deng
Kenji Kawaguchi
Çağlar Gülçehre
Sungjin Ahn
OffRL
DiffM
294
75
0
05 Jan 2024
Predictive auxiliary objectives in deep RL mimic learning in the brain
International Conference on Learning Representations (ICLR), 2023
Ching Fang
Kimberly L. Stachenfeld
317
16
0
09 Oct 2023
Efficient Planning with Latent Diffusion
International Conference on Learning Representations (ICLR), 2023
Wenhao Li
DiffM
456
10
0
30 Sep 2023
Thinker: Learning to Plan and Act
Neural Information Processing Systems (NeurIPS), 2023
Stephen Chung
Ivan Anokhin
David M. Krueger
LLMAG
OffRL
LRM
353
12
0
27 Jul 2023
What model does MuZero learn?
European Conference on Artificial Intelligence (ECAI), 2023
Jinke He
Thomas M. Moerland
F. Oliehoek
370
5
0
01 Jun 2023
Off-Policy RL Algorithms Can be Sample-Efficient for Continuous Control via Sample Multiple Reuse
Information Sciences (Inf. Sci.), 2023
Jiafei Lyu
Le Wan
Zongqing Lu
Xiu Li
OffRL
221
17
0
29 May 2023
The Update-Equivalence Framework for Decision-Time Planning
International Conference on Learning Representations (ICLR), 2023
Samuel Sokota
Gabriele Farina
David J. Wu
Hengyuan Hu
Kevin A. Wang
J. Zico Kolter
Noam Brown
366
5
0
25 Apr 2023
Equivariant MuZero
Andreea Deac
T. Weber
George Papamakarios
236
4
0
09 Feb 2023
Learning Interaction-aware Motion Prediction Model for Decision-making in Autonomous Driving
Zhiyu Huang
Haochen Liu
Jingda Wu
Wenhui Huang
Chen Lv
251
23
0
08 Feb 2023
PushWorld: A benchmark for manipulation planning with tools and movable obstacles
Ken Kansky
Skanda Vaidyanath
Scott Swingle
Xinghua Lou
Miguel Lazaro-Gredilla
Dileep George
365
4
0
24 Jan 2023
Safe Reinforcement Learning using Data-Driven Predictive Control
International Conference on Communications, Signal Processing, and their Applications (ICCSPA), 2022
Mahmoud Selim
Amr Alanwar
M. El-Kharashi
Hazem Abbas
Karl H. Johansson
OffRL
252
7
0
20 Nov 2022
Continuous Monte Carlo Graph Search
Adaptive Agents and Multi-Agent Systems (AAMAS), 2022
Kalle Kujanpää
Amin Babadi
Yi Zhao
Arno Solin
Alexander Ilin
Joni Pajarinen
LRM
984
3
0
04 Oct 2022
Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective
International Conference on Learning Representations (ICLR), 2022
Raj Ghugare
Homanga Bharadhwaj
Benjamin Eysenbach
Sergey Levine
Ruslan Salakhutdinov
OffRL
413
31
0
18 Sep 2022
A model-based approach to meta-Reinforcement Learning: Transformers and tree search
The European Symposium on Artificial Neural Networks (ESANN), 2022
Brieuc Pinon
Jean-Charles Delvenne
Raphaël Jungers
OffRL
234
4
0
24 Aug 2022
Efficient Planning in a Compact Latent Action Space
International Conference on Learning Representations (ICLR), 2022
Zhengyao Jiang
Tianjun Zhang
Michael Janner
Yueying Li
Tim Rocktäschel
Edward Grefenstette
Yuandong Tian
OffRL
351
57
0
22 Aug 2022
Intelligent problem-solving as integrated hierarchical reinforcement learning
Nature Machine Intelligence (Nat. Mach. Intell.), 2022
Manfred Eppe
Christian Gumbsch
Matthias Kerzel
Phuong D. H. Nguyen
Martin Volker Butz
S. Wermter
298
90
0
18 Aug 2022
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation
IEEE International Conference on Robotics and Automation (ICRA), 2022
Maximilian Igl
Daewoo Kim
Alex Kuefler
Paul Mougin
Punit Shah
K. Shiarlis
Drago Anguelov
Mark Palatucci
Brandyn White
Shimon Whiteson
266
82
0
06 May 2022
Physical Design using Differentiable Learned Simulators
Kelsey R. Allen
Tatiana López-Guevara
Kimberly L. Stachenfeld
Alvaro Sanchez-Gonzalez
Peter W. Battaglia
Jessica B. Hamrick
Tobias Pfaff
AI4CE
285
51
0
01 Feb 2022
Inferring perceptual decision making parameters from behavior in production and reproduction tasks
Nils Neupärtl
Constantin Rothkopf
192
1
0
31 Dec 2021
Learning Generalizable Behavior via Visual Rewrite Rules
Yiheng Xie
Mingxuan Li
Shangqun Yu
Michael Littman
DRL
273
1
0
09 Dec 2021
Procedural Generalization by Planning with Self-Supervised World Models
International Conference on Learning Representations (ICLR), 2021
Ankesh Anand
Jacob Walker
Yazhe Li
Eszter Vértes
Julian Schrittwieser
Sherjil Ozair
T. Weber
Jessica B. Hamrick
197
34
0
02 Nov 2021
Self-Consistent Models and Values
Neural Information Processing Systems (NeurIPS), 2021
Gregory Farquhar
Kate Baumli
Zita Marinho
Angelos Filos
Matteo Hessel
Hado van Hasselt
David Silver
259
9
0
25 Oct 2021
Model-based Reinforcement Learning for Service Mesh Fault Resiliency in a Web Application-level
Applied and Computational Engineering (ACE), 2021
Fanfei Meng
L. Jagadeesan
M. Thottan
AI4CE
128
14
0
21 Oct 2021
Neural Algorithmic Reasoners are Implicit Planners
Neural Information Processing Systems (NeurIPS), 2021
Andreea Deac
Petar Veličković
Ognjen Milinković
Pierre-Luc Bacon
Jian Tang
Mladen Nikolic
OffRL
178
26
0
11 Oct 2021
Evaluating model-based planning and planner amortization for continuous control
Arunkumar Byravan
Leonard Hasenclever
Piotr Trochim
M. Berk Mirza
Alessandro Davide Ialongo
...
Jost Tobias Springenberg
A. Abdolmaleki
N. Heess
J. Merel
Martin Riedmiller
200
18
0
07 Oct 2021
Potential-based Reward Shaping in Sokoban
Zhao Yang
Mike Preuss
Aske Plaat
OffRL
180
3
0
10 Sep 2021
Subgoal Search For Complex Reasoning Tasks
Neural Information Processing Systems (NeurIPS), 2021
K. Czechowski
Tomasz Odrzygóźdź
Marek Zbysiński
Michał Zawalski
Krzysztof Olejnik
Yuhuai Wu
Łukasz Kuciński
Piotr Miłoś
ReLM
LRM
276
40
0
25 Aug 2021
Deep Multiagent Reinforcement Learning: Challenges and Directions
Artificial Intelligence Review (AIR), 2021
Annie Wong
Thomas Bäck
Anna V. Kononova
Aske Plaat
AI4CE
313
161
0
29 Jun 2021
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Mingde Zhao
Zhen Liu
Sitao Luan
Shuyuan Zhang
Doina Precup
Yoshua Bengio
475
40
0
03 Jun 2021
Towards Deeper Deep Reinforcement Learning with Spectral Normalization
Neural Information Processing Systems (NeurIPS), 2021
Johan Bjorck
Daniel Schwalbe-Koda
Kilian Q. Weinberger
394
26
0
02 Jun 2021
Learning Neuro-Symbolic Relational Transition Models for Bilevel Planning
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
Rohan Chitnis
Tom Silver
J. Tenenbaum
Tomas Lozano-Perez
L. Kaelbling
387
71
0
28 May 2021
Transfer Learning and Curriculum Learning in Sokoban
Zhao Yang
Mike Preuss
Aske Plaat
OffRL
293
3
0
25 May 2021
MBRL-Lib: A Modular Library for Model-based Reinforcement Learning
Luis Pineda
Brandon Amos
Amy Zhang
Nathan Lambert
Roberto Calandra
OffRL
392
53
0
20 Apr 2021
Muesli: Combining Improvements in Policy Optimization
International Conference on Machine Learning (ICML), 2021
Matteo Hessel
Ivo Danihelka
Fabio Viola
A. Guez
Simon Schmitt
Laurent Sifre
T. Weber
David Silver
H. V. Hasselt
281
69
0
13 Apr 2021
Planning and Learning Using Adaptive Entropy Tree Search
IEEE International Joint Conference on Neural Network (IJCNN), 2021
Piotr Kozakowski
Mikolaj Pacek
Piotr Miłoś
218
3
0
12 Feb 2021
Autotelic Agents with Intrinsically Motivated Goal-Conditioned Reinforcement Learning: a Short Survey
Journal of Artificial Intelligence Research (JAIR), 2020
Cédric Colas
Tristan Karch
Olivier Sigaud
Pierre-Yves Oudeyer
908
125
0
17 Dec 2020
On the model-based stochastic value gradient for continuous reinforcement learning
Conference on Learning for Dynamics & Control (L4DC), 2020
Brandon Amos
Samuel Stanton
Denis Yarats
A. Wilson
432
78
0
28 Aug 2020
A Unifying Framework for Reinforcement Learning and Planning
Thomas M. Moerland
Joost Broekens
Aske Plaat
Catholijn M. Jonker
OffRL
544
15
0
26 Jun 2020