Muesli: Combining Improvements in Policy Optimization

Muesli: Combining Improvements in Policy Optimization

13 April 2021

Ivo Danihelka

David Silver

Papers citing "Muesli: Combining Improvements in Policy Optimization"

17 / 17 papers shown

Title
OptionZero: Planning with Learned Options Po-Wei Huang Pei-Chiun Peng Hung Guei Ti-Rong Wu 57 1 0 23 Feb 2025
The Role of Deep Learning Regularizations on Actors in Offline RL Denis Tarasov Anja Surina Çağlar Gülçehre OffRL AI4CE 53 1 0 11 Sep 2024
Reinforcement Learning for Sustainable Energy: A Survey Koen Ponse Felix Kleuker Márton Fejér Álvaro Serra-Gómez Aske Plaat Thomas M. Moerland OffRL AI4CE 40 1 0 26 Jul 2024
Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? Denis Tarasov Kirill Brilliantov Dmitrii Kharlapenko OffRL 32 2 0 10 Jun 2024
Value Improved Actor Critic Algorithms Yaniv Oren Moritz A. Zanger Pascal R. van der Vaart M. Spaan Wendelin Bohmer Wendelin Bohmer OffRL 33 0 0 03 Jun 2024
Vision-Language Models as a Source of Rewards Kate Baumli Satinder Baveja Feryal M. P. Behbahani Harris Chan Gheorghe Comanici ... Yannick Schroecker Stephen Spencer Richie Steigerwald Luyu Wang Lei Zhang VLM LRM 45 26 0 14 Dec 2023
Off-Policy RL Algorithms Can be Sample-Efficient for Continuous Control via Sample Multiple Reuse Jiafei Lyu Le Wan Zongqing Lu Xiu Li OffRL 31 9 0 29 May 2023
Accelerating exploration and representation learning with offline pre-training Bogdan Mazoure Jake Bruce Doina Precup Rob Fergus Ankit Anand OffRL 31 5 0 31 Mar 2023
Human-Timescale Adaptation in an Open-Ended Task Space Adaptive Agent Team Jakob Bauer Kate Baumli Satinder Baveja Feryal M. P. Behbahani ... Jakub Sygnowski K. Tuyls Sarah York Alexander Zacherl Lei Zhang LM&Ro OffRL AI4CE LRM 38 108 0 18 Jan 2023
Policy Optimization to Learn Adaptive Motion Primitives in Path Planning with Dynamic Obstacles Brian Angulo Aleksandr I. Panov Konstantin Yakovlev 18 12 0 29 Dec 2022
A Generalist Agent Scott E. Reed Konrad Zolna Emilio Parisotto Sergio Gomez Colmenarejo Alexander Novikov ... Yutian Chen R. Hadsell Oriol Vinyals Mahyar Bordbar Nando de Freitas LM&Ro LLMAG AI4CE 59 787 0 12 May 2022
Model-Value Inconsistency as a Signal for Epistemic Uncertainty Angelos Filos Eszter Vértes Zita Marinho Gregory Farquhar Diana Borsa A. Friesen Feryal M. P. Behbahani Tom Schaul André Barreto Simon Osindero 44 7 0 08 Dec 2021
Self-Consistent Models and Values Roy Miles Kate Baumli Zita Marinho Angelos Filos Matteo Hessel Hado van Hasselt David Silver 38 8 0 25 Oct 2021
GrowSpace: Learning How to Shape Plants Yasmeen Hitti Ionelia Buzatu Manuel Del Verme M. Lefsrud Florian Golemo A. Durand 19 2 0 15 Oct 2021
Bootstrapped Meta-Learning Sebastian Flennerhag Yannick Schroecker Tom Zahavy Hado van Hasselt David Silver Satinder Singh 38 58 0 09 Sep 2021
Deep Reinforcement Learning at the Edge of the Statistical Precipice Rishabh Agarwal Max Schwarzer Pablo Samuel Castro Aaron Courville Marc G. Bellemare OffRL 59 633 0 30 Aug 2021
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems Sergey Levine Aviral Kumar George Tucker Justin Fu OffRL GP 340 1,960 0 04 May 2020