SAVeD: Learning to Denoise Low-SNR Video for Improved Downstream Performance

Abstract
Foundation models excel at vision tasks in natural images but fail in low signal-to-noise ratio (SNR) videos, such as underwater sonar, ultrasound, and microscopy. We introduce Spatiotemporal Augmentations and denoising in Video for Downstream Tasks (SAVeD), a self-supervised method that denoises low-SNR sensor videos and is trained using only the raw noisy data. By leveraging differences in foreground and background motion, SAVeD enhances object visibility using an encoder-decoder with a temporal bottleneck. Our approach improves classification, detection, tracking, and counting, outperforming state-of-the-art video denoising methods with lower resource requirements. Project page: this https URL. Code: this https URL
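The core intuition, that foreground objects move differently from the background, can be illustrated with a minimal NumPy sketch. This is not the SAVeD architecture (which is a learned encoder-decoder with a temporal bottleneck); it is a hypothetical toy showing how a temporal statistic over noisy frames suppresses background and noise while preserving a transient foreground object.

```python
import numpy as np

def temporal_background(frames: np.ndarray) -> np.ndarray:
    """frames: (T, H, W) noisy video; returns an (H, W) static-background estimate.

    The temporal median is robust to a foreground object that appears in only
    a few frames, so it captures mostly background plus averaged-out noise.
    """
    return np.median(frames, axis=0)

def foreground_residual(frames: np.ndarray) -> np.ndarray:
    """Per-frame residual after removing the temporal-median background."""
    return frames - temporal_background(frames)[None]

# Synthetic stand-in for a low-SNR clip: Gaussian noise with one brief bright object.
rng = np.random.default_rng(0)
video = rng.normal(0.0, 1.0, size=(16, 32, 32))
video[8, 10:14, 10:14] += 5.0  # hypothetical object visible in frame 8 only

residual = foreground_residual(video)
print(residual.shape)  # (16, 32, 32)
```

In the residual, the transient object in frame 8 stands out far above the noise floor, which is the kind of foreground/background motion separation that SAVeD's temporal bottleneck learns end-to-end rather than computing with a fixed statistic.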
@article{stathatos2025_2504.00161,
  title   = {SAVeD: Learning to Denoise Low-SNR Video for Improved Downstream Performance},
  author  = {Suzanne Stathatos and Michael Hobley and Markus Marks and Pietro Perona},
  journal = {arXiv preprint arXiv:2504.00161},
  year    = {2025}
}