Muesli: Combining Improvements in Policy Optimization

13 April 2021

Ivo Danihelka

David Silver

Papers cited by "Muesli: Combining Improvements in Policy Optimization"

Title | Authors | Tags | Metrics | Date
OpenAI Gym | Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, Wojciech Zaremba | OffRL, ODL | 223 5,086 0 | 05 Jun 2016
Learning values across many orders of magnitude | H. V. Hasselt, A. Guez, Matteo Hessel, Volodymyr Mnih, David Silver | | 65 170 0 | 24 Feb 2016
Value Iteration Networks | Aviv Tamar, Yi Wu, G. Thomas, Sergey Levine, Pieter Abbeel | | 79 654 0 | 09 Feb 2016
Asynchronous Methods for Deep Reinforcement Learning | Volodymyr Mnih, Adria Puigdomenech Badia, M. Berk Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu | | 207 8,879 0 | 04 Feb 2016
Learning to Predict Independent of Span | H. V. Hasselt, R. Sutton | OOD | 41 19 0 | 19 Aug 2015
Action-Conditional Video Prediction using Deep Networks in Atari Games | Junhyuk Oh, Xiaoxiao Guo, Honglak Lee, Richard L. Lewis, Satinder Singh | | 106 855 0 | 31 Jul 2015
Trust Region Policy Optimization | John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel | | 277 6,796 0 | 19 Feb 2015
Adam: A Method for Stochastic Optimization | Diederik P. Kingma, Jimmy Ba | ODL | 2.0K 150,312 0 | 22 Dec 2014
Auto-Encoding Variational Bayes | Diederik P. Kingma, Max Welling | BDL | 455 16,923 0 | 20 Dec 2013
The Arcade Learning Environment: An Evaluation Platform for General Agents | Marc G. Bellemare, Yavar Naddaf, J. Veness, Michael Bowling | | 120 3,021 0 | 19 Jul 2012