
Towards Better Robustness: Pose-Free 3D Gaussian Splatting for Arbitrarily Long Videos

Abstract

3D Gaussian Splatting (3DGS) has emerged as a powerful representation due to its efficiency and high-fidelity rendering. 3DGS training requires a known camera pose for each input view, typically obtained by Structure-from-Motion (SfM) pipelines. Pioneering works have attempted to relax this restriction but still struggle with long sequences that have complex camera trajectories. In this paper, we propose Rob-GS, a robust framework that progressively estimates camera poses and optimizes 3DGS for arbitrarily long video inputs. In particular, by leveraging the inherent continuity of videos, we design an adjacent pose tracking method to ensure stable pose estimation between consecutive frames. To handle arbitrarily long inputs, we propose a Gaussian visibility retention check strategy that adaptively splits the video sequence into several segments, which are optimized separately. Extensive experiments on Tanks and Temples, ScanNet, and a self-captured dataset show that Rob-GS outperforms state-of-the-art methods.
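To make the segment-splitting idea concrete, below is a minimal Python sketch of a visibility-retention check that greedily grows a segment until too few of its anchor Gaussians remain visible, then starts a new one. The visible_ids callback, the retention_threshold value, and all names are illustrative assumptions, not the authors' actual implementation.

from typing import Callable, List, Set, Tuple

def split_into_segments(
    num_frames: int,
    visible_ids: Callable[[int], Set[int]],   # assumed helper: Gaussian indices visible from frame i's estimated pose
    retention_threshold: float = 0.3,         # assumed threshold, not from the paper
) -> List[Tuple[int, int]]:
    """Grow each segment until the fraction of the segment's anchor Gaussians
    still visible in the newest frame drops below the threshold."""
    segments: List[Tuple[int, int]] = []
    start = 0
    anchor = visible_ids(0)                   # Gaussians visible from the segment's first frame
    for i in range(1, num_frames):
        current = visible_ids(i)
        retention = len(anchor & current) / max(len(anchor), 1)
        if retention < retention_threshold:
            segments.append((start, i))       # close the current segment before frame i
            start = i
            anchor = current                  # frame i anchors the next segment
    segments.append((start, num_frames))
    return segments

Under this reading, each returned (start, end) range would be optimized as its own 3DGS segment, while adjacent-frame pose tracking supplies the per-frame poses used by the visibility check.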

@article{dong2025_2501.15096,
  title={Towards Better Robustness: Pose-Free 3D Gaussian Splatting for Arbitrarily Long Videos},
  author={Zhen-Hui Dong and Sheng Ye and Yu-Hui Wen and Nannan Li and Yong-Jin Liu},
  journal={arXiv preprint arXiv:2501.15096},
  year={2025}
}