SE(3)-Equivariant Diffusion Policy in Spherical Fourier Space

Diffusion Policies are effective at learning closed-loop manipulation policies from human demonstrations but generalize poorly to novel arrangements of objects in 3D space, hurting real-world performance. To address this issue, we propose Spherical Diffusion Policy (SDP), an SE(3)-equivariant diffusion policy that adapts trajectories according to 3D transformations of the scene. Such equivariance is achieved by embedding the states, actions, and the denoising process in spherical Fourier space. Additionally, we employ novel spherical FiLM layers to condition the action denoising process equivariantly on the scene embeddings. Lastly, we propose a spherical denoising temporal U-net that achieves spatiotemporal equivariance with computational efficiency. As a result, SDP is end-to-end SE(3)-equivariant, allowing robust generalization across transformed 3D scenes. SDP demonstrates a large performance improvement over strong baselines in 20 simulation tasks and 5 physical robot tasks, including single-arm and bi-manual embodiments. Code is available at this https URL.
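
To make the equivariance property concrete: a policy π is SE(3)-equivariant if transforming the scene by a rigid motion g transforms the predicted trajectory by the same g, i.e. π(g·s) = g·π(s). The sketch below checks this property numerically for a hand-written toy policy. It is only an illustration of the property and is not the SDP architecture: `toy_policy`, `rotation_matrix`, and `transform` are hypothetical names introduced here, the scene is assumed to be a plain list of 3D points, and no spherical Fourier features, FiLM layers, or diffusion steps are involved.

```python
import math
import random


def rotation_matrix(yaw, pitch):
    """Rotation about z by `yaw` followed by rotation about y by `pitch` (R = Rz @ Ry)."""
    cz, sz = math.cos(yaw), math.sin(yaw)
    cy, sy = math.cos(pitch), math.sin(pitch)
    rz = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    ry = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    return [[sum(rz[i][k] * ry[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transform(points, rot, trans):
    """Apply the rigid motion p -> rot @ p + trans to every 3D point."""
    return [tuple(sum(rot[i][k] * p[k] for k in range(3)) + trans[i]
                  for i in range(3))
            for p in points]


def toy_policy(points, horizon=4):
    """Hypothetical policy: waypoints from the scene centroid to its farthest point.

    Built only from quantities that move rigidly with the scene, so it is
    SE(3)-equivariant by construction (assuming the farthest point is unique).
    """
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    farthest = max(points, key=lambda p: math.dist(p, centroid))
    return [tuple(centroid[i] + (k / (horizon - 1)) * (farthest[i] - centroid[i])
                  for i in range(3))
            for k in range(horizon)]


random.seed(0)
scene = [tuple(random.gauss(0.0, 1.0) for _ in range(3)) for _ in range(32)]
rot = rotation_matrix(yaw=0.7, pitch=-1.1)
trans = (0.3, -0.2, 0.5)

# Left side: act on the transformed scene. Right side: transform the original action.
lhs = toy_policy(transform(scene, rot, trans))
rhs = transform(toy_policy(scene), rot, trans)

# Largest coordinate-wise gap between the two trajectories.
equivariance_error = max(abs(a - b) for p, q in zip(lhs, rhs) for a, b in zip(p, q))
print(f"max equivariance error: {equivariance_error:.2e}")
```

The same check can be run against a learned policy: an error near floating-point precision across random rigid motions indicates equivariance, while a non-equivariant policy would generally show a much larger gap.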
@article{zhu2025_2507.01723,
  title={SE(3)-Equivariant Diffusion Policy in Spherical Fourier Space},
  author={Xupeng Zhu and Fan Wang and Robin Walters and Jane Shi},
  journal={arXiv preprint arXiv:2507.01723},
  year={2025}
}