ResearchTrend.AI
Toward Memory-Aided World Models: Benchmarking via Spatial Consistency


29 May 2025
Kewei Lian
Shaofei Cai
Yilun Du
Yitao Liang

Papers citing "Toward Memory-Aided World Models: Benchmarking via Spatial Consistency"

40 / 40 papers shown
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft
Junliang Guo
Yang Ye
Tianyu He
Haoyu Wu
Yushu Jiang
Tim Pearce
Li Zhao
VGen
SyDa
98
6
0
11 Apr 2025
ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment
Shaofei Cai
Zhancun Mu
Hoang Trung-Dung
Yitao Liang
60
5
0
04 Mar 2025
Continuous 3D Perception Model with Persistent State
Qianqian Wang
Yifei Zhang
Aleksander Hołyński
Alexei A. Efros
Angjoo Kanazawa
VGen
118
41
0
21 Jan 2025
Optimizing Latent Goal by Learning from Trajectory Preference
Guangyu Zhao
Kewei Lian
Haowei Lin
Haobo Fu
Qiang Fu
Shaofei Cai
Zihao Wang
Yitao Liang
117
4
0
03 Dec 2024
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
Yining Hong
Beide Liu
Maxine Wu
Yuanhao Zhai
Kai-Wei Chang
...
Chung-Ching Lin
Jianfeng Wang
Zhiyong Yang
Yingnian Wu
Lijuan Wang
VGen
78
7
0
30 Oct 2024
Diffusion Models Are Real-Time Game Engines
Dani Valevski
Yaniv Leviathan
Moab Arar
Shlomi Fruchter
DiffM
VGen
AI4CE
82
78
0
27 Aug 2024
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
Boyuan Chen
Diego Marti Monso
Yilun Du
Max Simchowitz
Russ Tedrake
Vincent Sitzmann
DiffM
79
93
0
01 Jul 2024
Depth Anything V2
Lihe Yang
Bingyi Kang
Zilong Huang
Zhen Zhao
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
DiffM
VLM
MDE
104
406
0
13 Jun 2024
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao
Jiazhi Yang
Li Chen
Kashyap Chitta
Yihang Qiu
Andreas Geiger
Jun Zhang
Hongyang Li
124
91
0
27 May 2024
Diffusion for World Modeling: Visual Details Matter in Atari
Eloi Alonso
Adam Jelley
Vincent Micheli
Anssi Kanervisto
Amos Storkey
Tim Pearce
François Fleuret
84
28
0
20 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
131
44
0
06 May 2024
Video as the New Language for Real-World Decision Making
Sherry Yang
Jacob Walker
Jack Parker-Holder
Yilun Du
Jake Bruce
Andre Barreto
Pieter Abbeel
Dale Schuurmans
VGen
96
51
0
27 Feb 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
VLM
201
793
0
19 Jan 2024
Learning Interactive Real-World Simulators
Mengjiao Yang
Yilun Du
Kamyar Ghasemipour
Jonathan Tompson
Leslie Kaelbling
Dale Schuurmans
Pieter Abbeel
LM&Ro
PINN
58
205
0
09 Oct 2023
GAIA-1: A Generative World Model for Autonomous Driving
Masane Fuchi
Lloyd Russell
Hudson Yeo
Zak Murez
Hiroto Minami
Alex Kendall
Tomohiro Takagi
Gianluca Corrado
VGen
78
237
0
29 Sep 2023
DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving
Xiaofeng Wang
Zheng Hua Zhu
Guan Huang
Xinze Chen
Jiagang Zhu
Jiwen Lu
VGen
80
161
0
18 Sep 2023
GridMM: Grid Memory Map for Vision-and-Language Navigation
Zihan Wang
Xiangyang Li
Jiahao Yang
Yeqi Liu
Shuqiang Jiang
73
56
0
24 Jul 2023
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft
Shalev Lifshitz
Keiran Paster
Harris Chan
Jimmy Ba
Sheila A. McIlraith
LM&Ro
65
74
0
01 Jun 2023
Transformer-based World Models Are Happy With 100k Interactions
Jan Robine
Marc Höftmann
Tobias Uelwer
Stefan Harmeling
OffRL
70
82
0
13 Mar 2023
Mastering Diverse Domains through World Models
Danijar Hafner
J. Pašukonis
Jimmy Ba
Timothy Lillicrap
68
600
0
10 Jan 2023
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
90
2,299
0
19 Dec 2022
Topological Semantic Graph Memory for Image-Goal Navigation
Nuri Kim
Obin Kwon
Hwiyeon Yoo
Yunho Choi
Jeongho Park
Songhwai Oh
72
53
0
17 Sep 2022
Transformers are Sample-Efficient World Models
Vincent Micheli
Eloi Alonso
François Fleuret
VLM
OffRL
126
180
0
01 Sep 2022
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
Bowen Baker
Ilge Akkaya
Peter Zhokhov
Joost Huizinga
Jie Tang
Adrien Ecoffet
Brandon Houghton
Raul Sampedro
Jeff Clune
OffRL
111
298
0
23 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
121
377
0
17 Jun 2022
Playable Video Generation
Willi Menapace
Stéphane Lathuilière
Sergey Tulyakov
Aliaksandr Siarohin
Elisa Ricci
SSL
VGen
57
66
0
28 Jan 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
632
41,003
0
22 Oct 2020
Dream to Control: Learning Behaviors by Latent Imagination
Danijar Hafner
Timothy Lillicrap
Jimmy Ba
Mohammad Norouzi
VLM
113
1,354
0
03 Dec 2019
MineRL: A Large-Scale Dataset of Minecraft Demonstrations
William H. Guss
Brandon Houghton
Nicholay Topin
Phillip Wang
Cayden R. Codel
Manuela Veloso
Ruslan Salakhutdinov
OffRL
57
225
0
29 Jul 2019
Towards Accurate Generative Models of Video: A New Metric & Challenges
Thomas Unterthiner
Sjoerd van Steenkiste
Karol Kurach
Raphaël Marinier
Marcin Michalski
Sylvain Gelly
EGVM
VGen
91
728
0
03 Dec 2018
Recurrent World Models Facilitate Policy Evolution
David R Ha
Jürgen Schmidhuber
SyDa
TPM
117
941
0
04 Sep 2018
Semi-parametric Topological Memory for Navigation
Nikolay Savinov
Alexey Dosovitskiy
V. Koltun
66
382
0
01 Mar 2018
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
Richard Y. Zhang
Phillip Isola
Alexei A. Efros
Eli Shechtman
Oliver Wang
EGVM
376
11,790
0
11 Jan 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
687
131,526
0
12 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
229
8,015
0
22 May 2017
Neural Map: Structured Memory for Deep Reinforcement Learning
Emilio Parisotto
Ruslan Salakhutdinov
71
260
0
27 Feb 2017
Cognitive Mapping and Planning for Visual Navigation
Saurabh Gupta
Varun Tolani
James Davidson
Sergey Levine
Rahul Sukthankar
Jitendra Malik
81
714
0
13 Feb 2017
Action-Conditional Video Prediction using Deep Networks in Atari Games
Junhyuk Oh
Xiaoxiao Guo
Honglak Lee
Richard L. Lewis
Satinder Singh
103
852
0
31 Jul 2015
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
1.8K
77,133
0
18 May 2015
Neural Turing Machines
Alex Graves
Greg Wayne
Ivo Danihelka
97
2,327
0
20 Oct 2014