Restereo: Diffusion stereo video generation and restoration

Stereo video generation has been gaining increasing attention with recent advancements in video diffusion models. However, most existing methods focus on generating 3D stereoscopic videos from monocular 2D videos. These approaches typically assume that the input monocular video is of high quality, so the task reduces primarily to inpainting occluded regions in the warped video while preserving disoccluded areas. In this paper, we introduce a new pipeline that not only generates stereo videos but also enhances both left-view and right-view videos consistently with a single model. Our approach achieves this by fine-tuning the model on degraded data for restoration, and by conditioning the model on warped masks for consistent stereo generation. As a result, our method can be fine-tuned on a relatively small synthetic stereo video dataset and applied to low-quality real-world videos, performing both stereo video generation and restoration. Experiments demonstrate that our method outperforms existing approaches both qualitatively and quantitatively in stereo video generation from low-resolution inputs.
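To make the two conditioning signals mentioned above concrete, below is a minimal sketch (not the authors' code) of how one might build training inputs for such a pipeline: a degradation step that simulates low-quality videos for restoration fine-tuning, and a disparity-based warp that produces a warped view plus a validity mask for stereo conditioning. All function names, tensor shapes, and parameter values are hypothetical assumptions for illustration.

```python
# Hypothetical data-preparation sketch; shapes, names, and values are assumptions,
# not the paper's implementation.
import torch
import torch.nn.functional as F


def degrade(frames: torch.Tensor, scale: float = 0.25, noise_std: float = 0.02) -> torch.Tensor:
    """Simulate low-quality inputs: downsample, re-upsample, and add noise.

    frames: (T, C, H, W) video clip with values in [0, 1].
    """
    t, c, h, w = frames.shape
    low = F.interpolate(frames, scale_factor=scale, mode="bilinear", align_corners=False)
    up = F.interpolate(low, size=(h, w), mode="bilinear", align_corners=False)
    return (up + noise_std * torch.randn_like(up)).clamp(0, 1)


def warp_with_disparity(left: torch.Tensor, disparity: torch.Tensor):
    """Warp the left view toward the right view using horizontal disparity.

    left:      (T, C, H, W) left-view frames.
    disparity: (T, H, W) per-pixel horizontal shift in pixels.
    Returns the warped frames and a crude binary mask of valid pixels
    (out-of-range samples serve as a simple proxy for disoccluded regions).
    """
    t, c, h, w = left.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=torch.float32),
        torch.arange(w, dtype=torch.float32),
        indexing="ij",
    )
    xs = xs.unsqueeze(0) + disparity              # shift sampling locations horizontally
    ys = ys.unsqueeze(0).expand(t, -1, -1)
    # Normalize sampling coordinates to [-1, 1] as required by grid_sample.
    grid = torch.stack(
        (2.0 * xs / (w - 1) - 1.0, 2.0 * ys / (h - 1) - 1.0), dim=-1
    )
    warped = F.grid_sample(left, grid, mode="bilinear", align_corners=True)
    valid = ((grid[..., 0].abs() <= 1) & (grid[..., 1].abs() <= 1)).float().unsqueeze(1)
    return warped, valid


if __name__ == "__main__":
    clip = torch.rand(8, 3, 128, 256)                      # toy 8-frame left-view clip
    disp = torch.full((8, 128, 256), 12.0)                 # constant 12-pixel disparity
    degraded = degrade(clip)                               # restoration training input
    warped_right, mask = warp_with_disparity(clip, disp)   # stereo condition + mask
    print(degraded.shape, warped_right.shape, mask.shape)
```

In a training setup along these lines, the degraded clip, the warped view, and the mask would all be fed to the diffusion model as conditions, so a single model learns to restore quality and to fill regions the warp could not cover.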
@article{huang2025_2506.06023,
  title   = {Restereo: Diffusion stereo video generation and restoration},
  author  = {Xingchang Huang and Ashish Kumar Singh and Florian Dubost and Cristina Nader Vasconcelos and Sakar Khattar and Liang Shi and Christian Theobalt and Cengiz Oztireli and Gurprit Singh},
  journal = {arXiv preprint arXiv:2506.06023},
  year    = {2025}
}