ResearchTrend.AI

CoD: A Diffusion Foundation Model for Image Compression

24 November 2025
Zhaoyang Jia
Zihan Zheng
Naifu Xue
Jiahao Li
Bin Li
Zongyu Guo
Xiaoyi Zhang
Houqiang Li
Yan Lu
Main: 8 pages · 18 figures · Bibliography: 3 pages · 5 tables · Appendix: 7 pages
Abstract

Existing diffusion codecs typically build on text-to-image diffusion foundation models such as Stable Diffusion. However, text conditioning is suboptimal from a compression perspective, limiting the potential of downstream diffusion codecs, particularly at ultra-low bitrates. To address this, we introduce CoD, the first Compression-oriented Diffusion foundation model, trained from scratch to enable end-to-end optimization of both compression and generation. CoD is not a fixed codec but a general foundation model designed to support a variety of diffusion-based codecs. It offers several advantages. High compression efficiency: replacing Stable Diffusion with CoD in downstream codecs such as DiffC achieves state-of-the-art results, especially at ultra-low bitrates (e.g., 0.0039 bpp). Low-cost, reproducible training: 300× faster training than Stable Diffusion (~20 vs. ~6,250 A100 GPU days) on entirely open, image-only datasets. New insights: for example, we find that pixel-space diffusion can achieve VTM-level PSNR with high perceptual quality and can outperform GAN-based codecs using fewer parameters. We hope CoD lays the foundation for future diffusion codec research. Code will be released.
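The "ultra-low bitrate" figure in the abstract uses the standard bits-per-pixel (bpp) metric: total compressed bits divided by the number of pixels. A minimal sketch of the conversion (the 1024×1024 image size is an illustrative assumption, not from the paper):

```python
def bits_per_pixel(compressed_bytes: int, height: int, width: int) -> float:
    """Bits per pixel (bpp): compressed size in bits over the pixel count."""
    return compressed_bytes * 8 / (height * width)

# At 0.0039 bpp, a hypothetical 1024x1024 image must fit in roughly 511 bytes:
budget_bytes = 0.0039 * 1024 * 1024 / 8  # ≈ 511.2
```

This gives a sense of how aggressive the quoted operating point is: the entire image is reconstructed from about half a kilobyte of data.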
