TARDIS STRIDE: A Spatio-Temporal Road Image Dataset and World Model for Autonomy

12 June 2025
Héctor Carrión
Yutong Bai
Víctor A. Hernández Castro
Kishan Panaganti
Ayush Zenith
Matthew Trang
Tony Zhang
Pietro Perona
Jitendra Malik
    VGen
Main: 8 Pages
22 Figures
Bibliography: 5 Pages
2 Tables
Appendix: 15 Pages
Abstract

World models aim to simulate environments and enable effective agent behavior. However, modeling real-world environments presents unique challenges because they change dynamically across both space and, crucially, time. To capture these composed dynamics, we introduce the Spatio-Temporal Road Image Dataset for Exploration (STRIDE), which permutes 360-degree panoramic imagery into rich, interconnected observation, state, and action nodes. Leveraging this structure, we can simultaneously model the relationship between egocentric views, positional coordinates, and movement commands across both space and time. We benchmark this dataset via TARDIS, a transformer-based generative world model, trained on STRIDE, that integrates spatial and temporal dynamics through a unified autoregressive framework. We demonstrate robust performance across a range of agentic tasks such as controllable photorealistic image synthesis, instruction following, autonomous self-control, and state-of-the-art georeferencing. These results suggest a promising direction towards sophisticated generalist agents, capable of understanding and manipulating the spatial and temporal aspects of their material environments, with enhanced embodied reasoning capabilities. Training code, datasets, and model checkpoints are made available at this https URL.
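To make the observation, state, and action structure concrete, the sketch below shows one way a path through panorama captures could be flattened into an interleaved sequence of the kind an autoregressive model consumes. It is only an illustration under stated assumptions: the `Node` fields, the `action_between` encoding (heading, distance, and a year offset), and the `interleave` helper are hypothetical names chosen here, not the STRIDE schema or the TARDIS tokenization.

```python
from dataclasses import dataclass
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, used for a rough distance estimate


@dataclass(frozen=True)
class Node:
    """One panorama capture. Hypothetical fields, not the STRIDE schema."""
    image_id: str
    lat: float
    lon: float
    year: int


def action_between(a: Node, b: Node) -> dict:
    """Describe the move from a to b as a spatial offset plus a temporal offset."""
    # Equirectangular approximation, adequate for short road segments.
    mean_lat = math.radians((a.lat + b.lat) / 2)
    dx = math.radians(b.lon - a.lon) * math.cos(mean_lat) * EARTH_RADIUS_M  # east
    dy = math.radians(b.lat - a.lat) * EARTH_RADIUS_M  # north
    return {
        "heading_deg": math.degrees(math.atan2(dx, dy)) % 360,  # 0 = north
        "distance_m": math.hypot(dx, dy),
        "dt_years": b.year - a.year,  # a move can also be purely temporal
    }


def interleave(path: list[Node]) -> list[tuple[str, object]]:
    """Flatten a path into (kind, payload) items: obs, state, action, obs, ..."""
    seq: list[tuple[str, object]] = []
    for i, node in enumerate(path):
        seq.append(("obs", node.image_id))
        seq.append(("state", (node.lat, node.lon, node.year)))
        if i + 1 < len(path):
            seq.append(("action", action_between(node, path[i + 1])))
    return seq


# Example path: one step north, then a revisit of the same spot five years later.
path = [
    Node("a", 0.0, 0.0, 2015),
    Node("b", 0.001, 0.0, 2015),
    Node("c", 0.001, 0.0, 2020),
]
seq = interleave(path)
```

In this toy encoding a spatial step and a temporal step share one action format, which is one simple way to express movement "across both space and time" in a single sequence; the actual dataset may represent actions quite differently.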

@article{carrion2025_2506.11302,
  title={TARDIS STRIDE: A Spatio-Temporal Road Image Dataset and World Model for Autonomy},
  author={Héctor Carrión and Yutong Bai and Víctor A. Hernández Castro and Kishan Panaganti and Ayush Zenith and Matthew Trang and Tony Zhang and Pietro Perona and Jitendra Malik},
  journal={arXiv preprint arXiv:2506.11302},
  year={2025}
}