ResearchTrend.AI
On the role of planning in model-based deep reinforcement learning

8 November 2020
Jessica B. Hamrick
A. Friesen
Feryal M. P. Behbahani
A. Guez
Fabio Viola
Sims Witherspoon
Thomas W. Anthony
Lars Buesing
Petar Velickovic
T. Weber
    OffRL

Papers citing "On the role of planning in model-based deep reinforcement learning"

47 / 47 papers shown
Trust-Region Twisted Policy Improvement
Joery A. de Vries
Jinke He
Yaniv Oren
M. Spaan
OffRL
LRM
59
0
0
08 Apr 2025
Extendable Long-Horizon Planning via Hierarchical Multiscale Diffusion
Chang Chen
Hany Hamed
Doojin Baek
Taegu Kang
Yoshua Bengio
Sungjin Ahn
68
0
0
25 Mar 2025
On-line Policy Improvement using Monte-Carlo Search
Gerald Tesauro
Gregory R. Galperin
113
270
0
09 Jan 2025
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning
Jiayu Chen
Wentse Chen
Jeff Schneider
OffRL
66
3
0
15 Oct 2024
The Value Equivalence Principle for Model-Based Reinforcement Learning
Christopher Grimm
André Barreto
Satinder Singh
David Silver
OffRL
40
85
0
06 Nov 2020
Local Search for Policy Iteration in Continuous Control
Jost Tobias Springenberg
N. Heess
D. Mankowitz
J. Merel
Arunkumar Byravan
...
Julian Schrittwieser
Yuval Tassa
J. Buchli
Dan Belov
Martin Riedmiller
OffRL
41
15
0
12 Oct 2020
Monte-Carlo Tree Search as Regularized Policy Optimization
Jean-Bastien Grill
Florent Altché
Yunhao Tang
Thomas Hubert
Michal Valko
Ioannis Antonoglou
Rémi Munos
48
73
0
24 Jul 2020
Acme: A Research Framework for Distributed Reinforcement Learning
Matthew W. Hoffman
Bobak Shahriari
John Aslanides
Gabriel Barth-Maron
Nikola Momchev
...
Srivatsan Srinivasan
A. Cowie
Ziyun Wang
Bilal Piot
Nando de Freitas
97
225
0
01 Jun 2020
Mirror Descent Policy Optimization
Manan Tomar
Lior Shani
Yonathan Efroni
Mohammad Ghavamzadeh
74
85
0
20 May 2020
Planning to Explore via Self-Supervised World Models
Ramanan Sekar
Oleh Rybkin
Kostas Daniilidis
Pieter Abbeel
Danijar Hafner
Deepak Pathak
SSL
50
403
0
12 May 2020
A Game Theoretic Framework for Model Based Reinforcement Learning
Aravind Rajeswaran
Igor Mordatch
Vikash Kumar
OffRL
42
127
0
16 Apr 2020
Combining Q-Learning and Search with Amortized Value Estimates
Jessica B. Hamrick
V. Bapst
Alvaro Sanchez-Gonzalez
Tobias Pfaff
T. Weber
Lars Buesing
Peter W. Battaglia
OffRL
52
48
0
05 Dec 2019
Dream to Control: Learning Behaviors by Latent Imagination
Danijar Hafner
Timothy Lillicrap
Jimmy Ba
Mohammad Norouzi
VLM
95
1,333
0
03 Dec 2019
Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models
Arunkumar Byravan
Jost Tobias Springenberg
A. Abdolmaleki
Roland Hafner
Michael Neunert
Thomas Lampe
Noah Y. Siegel
N. Heess
Martin Riedmiller
OffRL
53
41
0
09 Oct 2019
Online Planning with Lookahead Policies
Yonathan Efroni
Mohammad Ghavamzadeh
Shie Mannor
18
5
0
10 Sep 2019
OpenSpiel: A Framework for Reinforcement Learning in Games
Marc Lanctot
Edward Lockhart
Jean-Baptiste Lespiau
V. Zambaldi
Satyaki Upadhyay
...
Julian Schrittwieser
Thomas W. Anthony
Edward Hughes
Ivo Danihelka
Jonah Ryan-Davis
OffRL
51
249
0
26 Aug 2019
Benchmarking Model-Based Reinforcement Learning
Tingwu Wang
Xuchan Bao
I. Clavera
Jerrick Hoang
Yeming Wen
Eric D. Langlois
Matthew Shunshi Zhang
Guodong Zhang
Pieter Abbeel
Jimmy Ba
OffRL
57
361
0
03 Jul 2019
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
55
939
0
19 Jun 2019
When to use parametric models in reinforcement learning?
H. V. Hasselt
Matteo Hessel
John Aslanides
64
192
0
12 Jun 2019
Policy Gradient Search: Online Planning and Expert Iteration without Search Trees
Thomas W. Anthony
Robert Nishihara
Philipp Moritz
Tim Salimans
John Schulman
50
30
0
07 Apr 2019
Structured agents for physical construction
V. Bapst
Alvaro Sanchez-Gonzalez
Carl Doersch
Kimberly L. Stachenfeld
Pushmeet Kohli
Peter W. Battaglia
Jessica B. Hamrick
AI4CE
98
99
0
05 Apr 2019
Model-Based Reinforcement Learning for Atari
Lukasz Kaiser
Mohammad Babaeizadeh
Piotr Milos
B. Osinski
R. Campbell
...
Sergey Levine
Afroz Mohiuddin
Ryan Sepassi
George Tucker
Henryk Michalewski
OffRL
98
851
0
01 Mar 2019
Discretizing Continuous Action Space for On-Policy Optimization
Yunhao Tang
Shipra Agrawal
OffRL
52
119
0
29 Jan 2019
An investigation of model-free planning
A. Guez
M. Berk Mirza
Karol Gregor
Rishabh Kabra
S. Racanière
...
Laurent Orseau
Tom Eccles
Greg Wayne
David Silver
Timothy Lillicrap
OffRL
65
111
0
11 Jan 2019
Credit Assignment Techniques in Stochastic Computation Graphs
T. Weber
N. Heess
Lars Buesing
David Silver
40
45
0
07 Jan 2019
Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control
F. Ebert
Chelsea Finn
Sudeep Dasari
Annie Xie
Alex X. Lee
Sergey Levine
SSL
85
383
0
03 Dec 2018
Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control
Kendall Lowrey
Aravind Rajeswaran
Sham Kakade
G. Haro
Igor Mordatch
OffRL
56
224
0
05 Nov 2018
Recurrent World Models Facilitate Policy Evolution
David R Ha
Jürgen Schmidhuber
SyDa
TPM
107
930
0
04 Sep 2018
Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees
Yuping Luo
Huazhe Xu
Yuanzhi Li
Yuandong Tian
Trevor Darrell
Tengyu Ma
OffRL
90
225
0
10 Jul 2018
Surprising Negative Results for Generative Adversarial Tree Search
Kamyar Azizzadenesheli
Brandon Yang
Weitang Liu
Zachary Chase Lipton
Anima Anandkumar
25
13
0
15 Jun 2018
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
64
471
0
14 Jun 2018
The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces
G. Z. Holland
Erik Talvitie
Michael Bowling
AI4CE
37
43
0
05 Jun 2018
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
Kurtland Chua
Roberto Calandra
R. McAllister
Sergey Levine
BDL
166
1,263
0
30 May 2018
Dual Policy Iteration
Wen Sun
Geoffrey J. Gordon
Byron Boots
J. Andrew Bagnell
OffRL
71
56
0
28 May 2018
Model-Ensemble Trust-Region Policy Optimization
Thanard Kurutach
I. Clavera
Yan Duan
Aviv Tamar
Pieter Abbeel
50
450
0
28 Feb 2018
DeepMind Control Suite
Yuval Tassa
Yotam Doron
Alistair Muldal
Tom Erez
Yazhe Li
...
A. Abdolmaleki
J. Merel
Andrew Lefrancq
Timothy Lillicrap
Martin Riedmiller
ELM
LM&Ro
BDL
105
1,116
0
02 Jan 2018
Building machines that adapt and compute like brains
Brenden M. Lake
J. Tenenbaum
AI4CE
FedML
NAI
AILaw
315
887
0
11 Nov 2017
Imagination-Augmented Agents for Deep Reinforcement Learning
T. Weber
S. Racanière
David P. Reichert
Lars Buesing
A. Guez
...
Razvan Pascanu
Peter W. Battaglia
Demis Hassabis
David Silver
Daan Wierstra
LM&Ro
65
552
0
19 Jul 2017
Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics
Ken Kansky
Tom Silver
David A. Mély
Mohamed Eldawy
Miguel Lazaro-Gredilla
Xinghua Lou
N. Dorfman
Szymon Sidor
Scott Phoenix
Dileep George
AI4CE
65
233
0
14 Jun 2017
Thinking Fast and Slow with Deep Learning and Tree Search
Thomas W. Anthony
Zheng Tian
David Barber
78
387
0
23 May 2017
On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models
Jürgen Schmidhuber
44
104
0
30 Nov 2015
Learning Continuous Control Policies by Stochastic Value Gradients
N. Heess
Greg Wayne
David Silver
Timothy Lillicrap
Yuval Tassa
Tom Erez
80
560
0
30 Oct 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
183
13,174
0
09 Sep 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
237
6,722
0
19 Feb 2015
Approximate Policy Iteration Schemes: A Comparison
B. Scherrer
44
92
0
12 May 2014
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare
Yavar Naddaf
J. Veness
Michael Bowling
73
2,992
0
19 Jul 2012
Reinforcement Learning by Value Gradients
Michael Fairbank
SSL
92
28
0
25 Mar 2008