CTRL-GS: Cascaded Temporal Residue Learning for 4D Gaussian Splatting

Recently, Gaussian Splatting methods have emerged as a desirable substitute for prior Radiance Field methods for novel-view synthesis of scenes captured with multi-view images or videos. In this work, we propose a novel extension to 4D Gaussian Splatting for dynamic scenes. Drawing on ideas from residual learning, we hierarchically decompose the dynamic scene into a "video-segment-frame" structure, with segments dynamically adjusted by optical flow. Then, instead of directly predicting the time-dependent signals, we model each signal as the sum of a video-constant value, a segment-constant value, and a frame-specific residual. This approach allows for more flexible models that adapt to highly variable scenes. We demonstrate state-of-the-art visual quality and real-time rendering on several established datasets, with the greatest improvements on complex scenes with large movements, occlusions, and fine details, where current methods degrade most.
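To make the cascaded decomposition concrete, the sketch below shows one possible way a time-dependent signal could be composed from a video-constant term, a per-segment term, and a frame-specific residual. This is a minimal illustration of the idea as stated in the abstract, not the authors' implementation; the module and parameter names (CascadedTemporalResidue, segment_constants, frame_residual, seg_id, time_embed) are hypothetical, and details such as how segments are assigned via optical flow are omitted.

import torch
import torch.nn as nn

class CascadedTemporalResidue(nn.Module):
    """Hypothetical sketch: signal(t) = video constant + segment constant + frame residual."""

    def __init__(self, num_segments: int, signal_dim: int, time_embed_dim: int = 16):
        super().__init__()
        # Video-constant component: one learned value shared by every frame.
        self.video_constant = nn.Parameter(torch.zeros(signal_dim))
        # Segment-constant components: one learned value per temporal segment.
        self.segment_constants = nn.Parameter(torch.zeros(num_segments, signal_dim))
        # Frame-specific residual: a small MLP conditioned on a time embedding.
        self.frame_residual = nn.Sequential(
            nn.Linear(time_embed_dim, 64),
            nn.ReLU(),
            nn.Linear(64, signal_dim),
        )

    def forward(self, time_embed: torch.Tensor, seg_id: torch.Tensor) -> torch.Tensor:
        # time_embed: (B, time_embed_dim) embedding of the frame timestamp.
        # seg_id: (B,) long tensor giving each frame's segment index.
        return (
            self.video_constant
            + self.segment_constants[seg_id]
            + self.frame_residual(time_embed)
        )

Under this reading, the residual MLP only has to explain frame-level deviations from coarser, slowly varying terms, which is the flexibility the abstract attributes to the cascaded formulation.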
@article{hou2025_2505.18306,
  title   = {CTRL-GS: Cascaded Temporal Residue Learning for 4D Gaussian Splatting},
  author  = {Karly Hou and Wanhua Li and Hanspeter Pfister},
  journal = {arXiv preprint arXiv:2505.18306},
  year    = {2025}
}