ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2104.06159
  4. Cited By
Muesli: Combining Improvements in Policy Optimization

Muesli: Combining Improvements in Policy Optimization

13 April 2021
Matteo Hessel
Ivo Danihelka
Fabio Viola
A. Guez
Simon Schmitt
Laurent Sifre
T. Weber
David Silver
H. V. Hasselt
ArXivPDFHTML

Papers citing "Muesli: Combining Improvements in Policy Optimization"

17 / 17 papers shown
Title
OptionZero: Planning with Learned Options
OptionZero: Planning with Learned Options
Po-Wei Huang
Pei-Chiun Peng
Hung Guei
Ti-Rong Wu
57
1
0
23 Feb 2025
The Role of Deep Learning Regularizations on Actors in Offline RL
The Role of Deep Learning Regularizations on Actors in Offline RL
Denis Tarasov
Anja Surina
Çağlar Gülçehre
OffRL
AI4CE
53
1
0
11 Sep 2024
Reinforcement Learning for Sustainable Energy: A Survey
Reinforcement Learning for Sustainable Energy: A Survey
Koen Ponse
Felix Kleuker
Márton Fejér
Álvaro Serra-Gómez
Aske Plaat
Thomas M. Moerland
OffRL
AI4CE
40
1
0
26 Jul 2024
Is Value Functions Estimation with Classification Plug-and-play for
  Offline Reinforcement Learning?
Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?
Denis Tarasov
Kirill Brilliantov
Dmitrii Kharlapenko
OffRL
32
2
0
10 Jun 2024
Value Improved Actor Critic Algorithms
Value Improved Actor Critic Algorithms
Yaniv Oren
Moritz A. Zanger
Pascal R. van der Vaart
M. Spaan
Wendelin Bohmer
Wendelin Bohmer
OffRL
33
0
0
03 Jun 2024
Vision-Language Models as a Source of Rewards
Vision-Language Models as a Source of Rewards
Kate Baumli
Satinder Baveja
Feryal M. P. Behbahani
Harris Chan
Gheorghe Comanici
...
Yannick Schroecker
Stephen Spencer
Richie Steigerwald
Luyu Wang
Lei Zhang
VLM
LRM
45
26
0
14 Dec 2023
Off-Policy RL Algorithms Can be Sample-Efficient for Continuous Control
  via Sample Multiple Reuse
Off-Policy RL Algorithms Can be Sample-Efficient for Continuous Control via Sample Multiple Reuse
Jiafei Lyu
Le Wan
Zongqing Lu
Xiu Li
OffRL
31
9
0
29 May 2023
Accelerating exploration and representation learning with offline
  pre-training
Accelerating exploration and representation learning with offline pre-training
Bogdan Mazoure
Jake Bruce
Doina Precup
Rob Fergus
Ankit Anand
OffRL
31
5
0
31 Mar 2023
Human-Timescale Adaptation in an Open-Ended Task Space
Human-Timescale Adaptation in an Open-Ended Task Space
Adaptive Agent Team
Jakob Bauer
Kate Baumli
Satinder Baveja
Feryal M. P. Behbahani
...
Jakub Sygnowski
K. Tuyls
Sarah York
Alexander Zacherl
Lei Zhang
LM&Ro
OffRL
AI4CE
LRM
38
108
0
18 Jan 2023
Policy Optimization to Learn Adaptive Motion Primitives in Path Planning
  with Dynamic Obstacles
Policy Optimization to Learn Adaptive Motion Primitives in Path Planning with Dynamic Obstacles
Brian Angulo
Aleksandr I. Panov
Konstantin Yakovlev
18
12
0
29 Dec 2022
A Generalist Agent
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
59
787
0
12 May 2022
Model-Value Inconsistency as a Signal for Epistemic Uncertainty
Model-Value Inconsistency as a Signal for Epistemic Uncertainty
Angelos Filos
Eszter Vértes
Zita Marinho
Gregory Farquhar
Diana Borsa
A. Friesen
Feryal M. P. Behbahani
Tom Schaul
André Barreto
Simon Osindero
44
7
0
08 Dec 2021
Self-Consistent Models and Values
Self-Consistent Models and Values
Roy Miles
Kate Baumli
Zita Marinho
Angelos Filos
Matteo Hessel
Hado van Hasselt
David Silver
38
8
0
25 Oct 2021
GrowSpace: Learning How to Shape Plants
GrowSpace: Learning How to Shape Plants
Yasmeen Hitti
Ionelia Buzatu
Manuel Del Verme
M. Lefsrud
Florian Golemo
A. Durand
19
2
0
15 Oct 2021
Bootstrapped Meta-Learning
Bootstrapped Meta-Learning
Sebastian Flennerhag
Yannick Schroecker
Tom Zahavy
Hado van Hasselt
David Silver
Satinder Singh
38
58
0
09 Sep 2021
Deep Reinforcement Learning at the Edge of the Statistical Precipice
Deep Reinforcement Learning at the Edge of the Statistical Precipice
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Aaron Courville
Marc G. Bellemare
OffRL
59
633
0
30 Aug 2021
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on
  Open Problems
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
340
1,960
0
04 May 2020
1