Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2505.22976
Cited By
Toward Memory-Aided World Models: Benchmarking via Spatial Consistency
29 May 2025
Kewei Lian
Shaofei Cai
Yilun Du
Yitao Liang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Toward Memory-Aided World Models: Benchmarking via Spatial Consistency"
40 / 40 papers shown
Title
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft
Junliang Guo
Yang Ye
Tianyu He
Haoyu Wu
Yushu Jiang
Tim Pearce
Li Zhao
VGen
SyDa
98
6
0
11 Apr 2025
ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment
Shaofei Cai
Zhancun Mu
Hoang Trung-Dung
Yitao Liang
60
5
0
04 Mar 2025
Continuous 3D Perception Model with Persistent State
Qianqian Wang
Yifei Zhang
Aleksander Holyñski
Alexei A. Efros
Angjoo Kanazawa
VGen
118
41
0
21 Jan 2025
Optimizing Latent Goal by Learning from Trajectory Preference
Guangyu Zhao
Kewei Lian
Haowei Lin
Haobo Fu
Qiang Fu
Shaofei Cai
Zihao Wang
Yitao Liang
117
4
0
03 Dec 2024
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
Yining Hong
Beide Liu
Maxine Wu
Yuanhao Zhai
Kai-Wei Chang
...
Chung-Ching Lin
Jianfeng Wang
Zhiyong Yang
Yingnian Wu
Lijuan Wang
VGen
78
7
0
30 Oct 2024
Diffusion Models Are Real-Time Game Engines
Dani Valevski
Yaniv Leviathan
Moab Arar
Shlomi Fruchter
DiffM
VGen
AI4CE
82
78
0
27 Aug 2024
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
Boyuan Chen
Diego Marti Monso
Yilun Du
Max Simchowitz
Russ Tedrake
Vincent Sitzmann
DiffM
79
93
0
01 Jul 2024
Depth Anything V2
Lihe Yang
Bingyi Kang
Zilong Huang
Zhen Zhao
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
DiffM
VLM
MDE
104
406
0
13 Jun 2024
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao
Jiazhi Yang
Li Chen
Kashyap Chitta
Yihang Qiu
Andreas Geiger
Jun Zhang
Hongyang Li
124
91
0
27 May 2024
Diffusion for World Modeling: Visual Details Matter in Atari
Eloi Alonso
Adam Jelley
Vincent Micheli
Anssi Kanervisto
Amos Storkey
Tim Pearce
Franccois Fleuret
84
28
0
20 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
131
44
0
06 May 2024
Video as the New Language for Real-World Decision Making
Sherry Yang
Jacob Walker
Jack Parker-Holder
Yilun Du
Jake Bruce
Andre Barreto
Pieter Abbeel
Dale Schuurmans
VGen
96
51
0
27 Feb 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
VLM
201
793
0
19 Jan 2024
Learning Interactive Real-World Simulators
Mengjiao Yang
Yilun Du
Kamyar Ghasemipour
Jonathan Tompson
Leslie Kaelbling
Dale Schuurmans
Pieter Abbeel
LM&Ro
PINN
58
205
0
09 Oct 2023
GAIA-1: A Generative World Model for Autonomous Driving
Masane Fuchi
Lloyd Russell
Hudson Yeo
Zak Murez
Hiroto Minami
Alex Kendall
Tomohiro Takagi
Gianluca Corrado
VGen
78
237
0
29 Sep 2023
DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving
Xiaofeng Wang
Zheng Hua Zhu
Guan Huang
Xinze Chen
Jiagang Zhu
Jiwen Lu
VGen
80
161
0
18 Sep 2023
GridMM: Grid Memory Map for Vision-and-Language Navigation
Zihan Wang
Xiangyang Li
Jiahao Yang
Yeqi Liu
Shuqiang Jiang
73
56
0
24 Jul 2023
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft
Shalev Lifshitz
Keiran Paster
Harris Chan
Jimmy Ba
Sheila A. McIlraith
LM&Ro
65
74
0
01 Jun 2023
Transformer-based World Models Are Happy With 100k Interactions
Jan Robine
Marc Höftmann
Tobias Uelwer
Stefan Harmeling
OffRL
70
82
0
13 Mar 2023
Mastering Diverse Domains through World Models
Danijar Hafner
J. Pašukonis
Jimmy Ba
Timothy Lillicrap
68
600
0
10 Jan 2023
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
90
2,299
0
19 Dec 2022
Topological Semantic Graph Memory for Image-Goal Navigation
Nuri Kim
Obin Kwon
Hwiyeon Yoo
Yunho Choi
Jeongho Park
Songhwai Oh
72
53
0
17 Sep 2022
Transformers are Sample-Efficient World Models
Vincent Micheli
Eloi Alonso
Franccois Fleuret
VLM
OffRL
126
180
0
01 Sep 2022
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
Bowen Baker
Ilge Akkaya
Peter Zhokhov
Joost Huizinga
Jie Tang
Adrien Ecoffet
Brandon Houghton
Raul Sampedro
Jeff Clune
OffRL
111
298
0
23 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
121
377
0
17 Jun 2022
Playable Video Generation
Willi Menapace
Stéphane Lathuilière
Sergey Tulyakov
Aliaksandr Siarohin
Elisa Ricci
SSL
VGen
57
66
0
28 Jan 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
632
41,003
0
22 Oct 2020
Dream to Control: Learning Behaviors by Latent Imagination
Danijar Hafner
Timothy Lillicrap
Jimmy Ba
Mohammad Norouzi
VLM
113
1,354
0
03 Dec 2019
MineRL: A Large-Scale Dataset of Minecraft Demonstrations
William H. Guss
Brandon Houghton
Nicholay Topin
Phillip Wang
Cayden R. Codel
Manuela Veloso
Ruslan Salakhutdinov
OffRL
57
225
0
29 Jul 2019
Towards Accurate Generative Models of Video: A New Metric & Challenges
Thomas Unterthiner
Sjoerd van Steenkiste
Karol Kurach
Raphaël Marinier
Marcin Michalski
Sylvain Gelly
EGVM
VGen
91
728
0
03 Dec 2018
Recurrent World Models Facilitate Policy Evolution
David R Ha
Jürgen Schmidhuber
SyDa
TPM
117
941
0
04 Sep 2018
Semi-parametric Topological Memory for Navigation
Nikolay Savinov
Alexey Dosovitskiy
V. Koltun
66
382
0
01 Mar 2018
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
Richard Y. Zhang
Phillip Isola
Alexei A. Efros
Eli Shechtman
Oliver Wang
EGVM
376
11,790
0
11 Jan 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
687
131,526
0
12 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
229
8,015
0
22 May 2017
Neural Map: Structured Memory for Deep Reinforcement Learning
Emilio Parisotto
Ruslan Salakhutdinov
71
260
0
27 Feb 2017
Cognitive Mapping and Planning for Visual Navigation
Saurabh Gupta
Varun Tolani
James Davidson
Sergey Levine
Rahul Sukthankar
Jitendra Malik
81
714
0
13 Feb 2017
Action-Conditional Video Prediction using Deep Networks in Atari Games
Junhyuk Oh
Xiaoxiao Guo
Honglak Lee
Richard L. Lewis
Satinder Singh
103
852
0
31 Jul 2015
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
1.8K
77,133
0
18 May 2015
Neural Turing Machines
Alex Graves
Greg Wayne
Ivo Danihelka
97
2,327
0
20 Oct 2014
1