SoccerDiffusion: Toward Learning End-to-End Humanoid Robot Soccer from Gameplay Recordings

29 April 2025

Florian Vahl

Jörn Griepenburg

Jan Gutsche

Jasper Güldenstein

Jianwei Zhang



Abstract

This paper introduces SoccerDiffusion, a transformer-based diffusion model designed to learn end-to-end control policies for humanoid robot soccer directly from real-world gameplay recordings. Using data collected from RoboCup competitions, the model predicts joint command trajectories from multi-modal sensor inputs, including vision, proprioception, and game state. To enable real-time inference on embedded platforms, we employ a distillation technique that reduces the multi-step diffusion process to a single step. Our results demonstrate the model's ability to replicate complex motion behaviors such as walking, kicking, and fall recovery, both in simulation and on physical robots. Although high-level tactical behavior remains limited, this work provides a robust foundation for subsequent reinforcement learning or preference optimization methods. We release the dataset, pretrained models, and code under: this https URL
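The distillation idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the teacher below is a hypothetical iterative refiner standing in for a multi-step diffusion sampler, the student is a plain least-squares model rather than a transformer, and all names (`teacher_sample`, `fit_student`, `student_sample`) and constants are assumptions made for illustration. It only shows the general pattern of training a one-pass student to reproduce the output of a many-step teacher given the same noise and conditioning.

```python
import numpy as np


def teacher_sample(noise, cond, steps=20):
    """Hypothetical multi-step teacher: repeatedly nudges a noisy sample
    toward a condition-dependent target (stand-in for a diffusion sampler)."""
    x = noise.copy()
    target = np.tanh(cond)  # toy "clean" action for this observation
    for _ in range(steps):
        x = x + 0.3 * (target - x)  # one refinement step
    return x


def _features(noise, cond):
    # Student inputs: the initial noise, the conditioning features, and a bias.
    return np.hstack([noise, np.tanh(cond), np.ones((len(noise), 1))])


def fit_student(noise, cond, teacher_out):
    """Fit a single-step student by least squares on the teacher's outputs."""
    weights, *_ = np.linalg.lstsq(_features(noise, cond), teacher_out, rcond=None)
    return weights


def student_sample(weights, noise, cond):
    """One forward pass replaces the teacher's iterative loop."""
    return _features(noise, cond) @ weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.normal(size=(512, 4))
    cond = rng.normal(size=(512, 4))
    weights = fit_student(noise, cond, teacher_sample(noise, cond))

    # Compare the one-step student with the 20-step teacher on fresh inputs.
    new_noise = rng.normal(size=(8, 4))
    new_cond = rng.normal(size=(8, 4))
    gap = np.abs(
        student_sample(weights, new_noise, new_cond)
        - teacher_sample(new_noise, new_cond)
    ).max()
    print(f"max student/teacher gap: {gap:.2e}")
```

In this toy setting the teacher's output is linear in the student's features, so the student can match it almost exactly. A real diffusion policy would need a nonlinear student and a learned teacher, and the match would be approximate.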


@article{vahl2025_2504.20808,
  title={SoccerDiffusion: Toward Learning End-to-End Humanoid Robot Soccer from Gameplay Recordings},
  author={Florian Vahl and Jörn Griepenburg and Jan Gutsche and Jasper Güldenstein and Jianwei Zhang},
  journal={arXiv preprint arXiv:2504.20808},
  year={2025}
}
