We introduce PDE-Transformer, an improved transformer-based architecture for surrogate modeling of physics simulations on regular grids. We combine recent architectural improvements of diffusion transformers with adjustments specific to large-scale simulations to yield a more scalable and versatile general-purpose transformer architecture, which can serve as the backbone for building large-scale foundation models in the physical sciences. We demonstrate that our proposed architecture outperforms state-of-the-art transformer architectures for computer vision on a large dataset comprising 16 different types of PDEs. We propose to embed different physical channels individually as spatio-temporal tokens, which interact via channel-wise self-attention. This helps to maintain a consistent information density of tokens when learning multiple types of PDEs simultaneously. We demonstrate that our pre-trained models achieve improved performance on several challenging downstream tasks compared to training from scratch, and also outperform other foundation model architectures for physics simulations.
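To illustrate the idea of embedding each physical channel as its own token sequence and mixing channels via channel-wise self-attention, the following is a minimal PyTorch sketch. It is not the authors' implementation: the class names (`PerChannelPatchEmbed`, `ChannelWiseSelfAttention`), the shared per-channel projection, and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PerChannelPatchEmbed(nn.Module):
    """Embed each physical channel separately into spatio-temporal tokens (assumed scheme)."""
    def __init__(self, patch_size=8, dim=256):
        super().__init__()
        # A single projection applied per channel, so every channel yields
        # its own token sequence with the same information density.
        self.proj = nn.Conv2d(1, dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        # x: (batch, channels, height, width)
        b, c, h, w = x.shape
        x = x.reshape(b * c, 1, h, w)               # treat channels independently
        tokens = self.proj(x)                       # (b*c, dim, h/p, w/p)
        tokens = tokens.flatten(2).transpose(1, 2)  # (b*c, n_tokens, dim)
        return tokens.reshape(b, c, -1, tokens.shape[-1])  # (b, channels, n_tokens, dim)

class ChannelWiseSelfAttention(nn.Module):
    """Self-attention over the channel axis at each token position (assumed scheme)."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, tokens):
        # tokens: (batch, channels, n_tokens, dim)
        b, c, n, d = tokens.shape
        # Fold the spatial tokens into the batch so attention runs across channels only.
        x = tokens.permute(0, 2, 1, 3).reshape(b * n, c, d)
        out, _ = self.attn(x, x, x)
        return out.reshape(b, n, c, d).permute(0, 2, 1, 3)

# Usage: three hypothetical physical channels, e.g. velocity-x, velocity-y, pressure.
fields = torch.randn(2, 3, 64, 64)
tokens = PerChannelPatchEmbed()(fields)
mixed = ChannelWiseSelfAttention()(tokens)
print(mixed.shape)  # torch.Size([2, 3, 64, 256])
```

Because each channel is tokenized on its own, adding or removing physical quantities changes only the number of channel tokens, not the per-token information content, which is what the abstract refers to as maintaining a consistent information density across different PDE types.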
@article{holzschuh2025_2505.24717,
  title   = {PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations},
  author  = {Benjamin Holzschuh and Qiang Liu and Georg Kohl and Nils Thuerey},
  journal = {arXiv preprint arXiv:2505.24717},
  year    = {2025}
}