Learning General World Models in a Handful of Reward-Free Deployments

23 October 2022
Yingchen Xu
Jack Parker-Holder
Aldo Pacchiano
Philip J. Ball
Oleh Rybkin
Stephen J. Roberts
Tim Rocktäschel
Edward Grefenstette
    OffRL

Papers citing "Learning General World Models in a Handful of Reward-Free Deployments"

49 / 49 papers shown
Task Aware Dreamer for Task Generalization in Reinforcement Learning
Chengyang Ying
Zhongkai Hao
Xinning Zhou
Hang Su
Songming Liu
Dong Yan
Jun Zhu
153
3
0
17 Feb 2025
K-level Reasoning for Zero-Shot Coordination in Hanabi
Brandon Cui
Hengyuan Hu
Luis Pineda
Jakob N. Foerster
OffRL
LRM
48
34
0
14 Jul 2022
The Phenomenon of Policy Churn
Tom Schaul
André Barreto
John Quan
Georg Ostrovski
67
28
0
01 Jun 2022
INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL
Homanga Bharadhwaj
Mohammad Babaeizadeh
D. Erhan
Sergey Levine
61
31
0
18 Apr 2022
BcMON: Blockchain Middleware for Offline Networks
Yi-Lan Lin
Zhipeng Gao
Qian Wang
Lanlan Rui
Yang Yang
OffRL
26
3
0
05 Apr 2022
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
Michael Ahn
Anthony Brohan
Noah Brown
Yevgen Chebotar
Omar Cortes
...
Ted Xiao
Peng Xu
Sichun Xu
Mengyuan Yan
Andy Zeng
LM&Ro
129
1,922
0
04 Apr 2022
Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization
Zihan Zhou
Wei Fu
Bingliang Zhang
Yi Wu
53
29
0
04 Apr 2022
Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning
Denis Yarats
David Brandfonbrener
Hao Liu
Michael Laskin
Pieter Abbeel
A. Lazaric
Lerrel Pinto
OffRL
OnRL
53
86
0
31 Jan 2022
Can Wikipedia Help Offline Reinforcement Learning?
Machel Reid
Yutaro Yamada
S. Gu
3DV
RALM
OffRL
174
95
0
28 Jan 2022
The Challenges of Exploration for Offline Reinforcement Learning
Nathan Lambert
Markus Wulfmeier
William F. Whitney
Arunkumar Byravan
Michael Bloesch
Vibhavari Dasagi
Tim Hertweck
Martin Riedmiller
OffRL
65
27
0
27 Jan 2022
Model-Value Inconsistency as a Signal for Epistemic Uncertainty
Angelos Filos
Eszter Vértes
Zita Marinho
Gregory Farquhar
Diana Borsa
A. Friesen
Feryal M. P. Behbahani
Tom Schaul
André Barreto
Simon Osindero
60
7
0
08 Dec 2021
Maximum Entropy Model-based Reinforcement Learning
Oleg Svidchenko
A. Shpilman
36
6
0
02 Dec 2021
Collective Intelligence for Deep Learning: A Survey of Recent Developments
David R Ha
Yu Tang
AI4CE
45
69
0
29 Nov 2021
Generalized Decision Transformer for Offline Hindsight Information Matching
Hiroki Furuta
Y. Matsuo
S. Gu
OffRL
42
102
0
19 Nov 2021
Procedural Generalization by Planning with Self-Supervised World Models
Ankesh Anand
Jacob Walker
Yazhe Li
Eszter Vértes
Julian Schrittwieser
Sherjil Ozair
T. Weber
Jessica B. Hamrick
55
31
0
02 Nov 2021
URLB: Unsupervised Reinforcement Learning Benchmark
Michael Laskin
Denis Yarats
Hao Liu
Kimin Lee
Albert Zhan
Kevin Lu
Catherine Cang
Lerrel Pinto
Pieter Abbeel
SSL
OffRL
49
134
0
28 Oct 2021
On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game
Shuang Qiu
Jieping Ye
Zhaoran Wang
Zhuoran Yang
OffRL
62
23
0
19 Oct 2021
Discovering and Achieving Goals via World Models
Russell Mendonca
Oleh Rybkin
Kostas Daniilidis
Danijar Hafner
Deepak Pathak
40
123
0
18 Oct 2021
Medical Dead-ends and Learning to Identify High-risk States and Treatments
Mehdi Fatemi
Taylor W. Killian
J. Subramanian
Marzyeh Ghassemi
OffRL
64
37
0
08 Oct 2021
Dynamics-Aware Quality-Diversity for Efficient Learning of Skill Repertoires
Bryan Lim
Luca Grillotti
Lorenzo Bernasconi
Antoine Cully
88
28
0
16 Sep 2021
Benchmarking the Spectrum of Agent Capabilities
Danijar Hafner
ELM
47
131
0
14 Sep 2021
APS: Active Pretraining with Successor Features
Hao Liu
Pieter Abbeel
76
119
0
31 Aug 2021
Deep Reinforcement Learning at the Edge of the Statistical Precipice
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Aaron Courville
Marc G. Bellemare
OffRL
74
652
0
30 Aug 2021
Learning more skills through optimistic exploration
D. Strouse
Kate Baumli
David Warde-Farley
Vlad Mnih
Steven Hansen
SSL
23
45
0
29 Jul 2021
Cooperative Exploration for Multi-Agent Deep Reinforcement Learning
Iou-Jen Liu
Unnat Jain
Raymond A. Yeh
Alex Schwing
61
104
0
23 Jul 2021
Offline Reinforcement Learning as One Big Sequence Modeling Problem
Michael Janner
Qiyang Li
Sergey Levine
OffRL
97
667
0
03 Jun 2021
Decision Transformer: Reinforcement Learning via Sequence Modeling
Lili Chen
Kevin Lu
Aravind Rajeswaran
Kimin Lee
Aditya Grover
Michael Laskin
Pieter Abbeel
A. Srinivas
Igor Mordatch
OffRL
82
1,608
0
02 Jun 2021
Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment
Philip J. Ball
Cong Lu
Jack Parker-Holder
Stephen J. Roberts
OffRL
44
43
0
12 Apr 2021
Learning One Representation to Optimize All Rewards
Ahmed Touati
Yann Ollivier
OffRL
40
62
0
14 Mar 2021
Behavior From the Void: Unsupervised Active Pre-Training
Hao Liu
Pieter Abbeel
VLM
SSL
62
196
0
08 Mar 2021
Reinforcement Learning with Prototypical Representations
Denis Yarats
Rob Fergus
A. Lazaric
Lerrel Pinto
SSL
46
221
0
22 Feb 2021
Model-free Representation Learning and Exploration in Low-rank MDPs
Aditya Modi
Jinglin Chen
A. Krishnamurthy
Nan Jiang
Alekh Agarwal
OffRL
116
79
0
14 Feb 2021
Adversarially Guided Actor-Critic
Yannis Flet-Berliac
Johan Ferret
Olivier Pietquin
Philippe Preux
Matthieu Geist
37
71
0
08 Feb 2021
Offline Reinforcement Learning from Images with Latent Space Models
Rafael Rafailov
Tianhe Yu
Aravind Rajeswaran
Chelsea Finn
OffRL
45
126
0
21 Dec 2020
Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian
Jack Parker-Holder
Luke Metz
Cinjon Resnick
Hengyuan Hu
Adam Lerer
Alistair Letcher
A. Peysakhovich
Aldo Pacchiano
Jakob N. Foerster
23
24
0
12 Nov 2020
Efficient Wasserstein Natural Gradients for Reinforcement Learning
Theodore H. Moskovitz
Michael Arbel
Ferenc Huszár
Arthur Gretton
17
20
0
12 Oct 2020
Mastering Atari with Discrete World Models
Danijar Hafner
Timothy Lillicrap
Mohammad Norouzi
Jimmy Ba
DRL
81
834
0
05 Oct 2020
Model-Based Offline Planning
Arthur Argenson
Gabriel Dulac-Arnold
OffRL
50
152
0
12 Aug 2020
dm_control: Software and Tasks for Continuous Control
Yuval Tassa
S. Tunyasuvunakool
Alistair Muldal
Yotam Doron
Piotr Trochim
...
Steven Bohez
J. Merel
Tom Erez
Timothy Lillicrap
N. Heess
LM&Ro
66
403
0
22 Jun 2020
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
Alekh Agarwal
Sham Kakade
A. Krishnamurthy
Wen Sun
OffRL
90
225
0
18 Jun 2020
MOReL: Model-Based Offline Reinforcement Learning
Rahul Kidambi
Aravind Rajeswaran
Praneeth Netrapalli
Thorsten Joachims
OffRL
75
662
0
12 May 2020
Dota 2 with Large Scale Deep Reinforcement Learning
OpenAI
Christopher Berner
Greg Brockman
Brooke Chan
...
Szymon Sidor
Ilya Sutskever
Jie Tang
Filip Wolski
Susan Zhang
GNN
VLM
CLL
AI4CE
LRM
94
1,811
0
13 Dec 2019
Model-Based Active Exploration
Pranav Shyam
Wojciech Jaśkowski
Faustino J. Gomez
60
179
0
29 Oct 2018
Learning Dexterous In-Hand Manipulation
OpenAI
Marcin Andrychowicz
Bowen Baker
Maciek Chociej
Rafal Jozefowicz
...
Szymon Sidor
Joshua Tobin
Peter Welinder
Lilian Weng
Wojciech Zaremba
86
1,865
0
01 Aug 2018
Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
Edoardo Conti
Vashisht Madhavan
F. Such
Joel Lehman
Kenneth O. Stanley
Jeff Clune
49
346
0
18 Dec 2017
Curiosity-driven Exploration by Self-supervised Prediction
Deepak Pathak
Pulkit Agrawal
Alexei A. Efros
Trevor Darrell
LRM
SSL
96
2,423
0
15 May 2017
Deep Exploration via Randomized Value Functions
Ian Osband
Benjamin Van Roy
Daniel Russo
Zheng Wen
71
302
0
22 Mar 2017
Auto-Encoding Variational Bayes
Diederik P. Kingma
Max Welling
BDL
372
16,962
0
20 Dec 2013
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare
Yavar Naddaf
J. Veness
Michael Bowling
80
2,992
0
19 Jul 2012