Quantizing Diffusion Models from a Sampling-Aware Perspective

Diffusion models have recently emerged as the dominant approach in visual generation tasks. However, the lengthy denoising chains and the computationally intensive noise estimation networks hinder their applicability in low-latency and resource-limited environments. Previous research has endeavored to address these limitations in a decoupled manner, utilizing either advanced samplers or efficient model quantization techniques. In this study, we uncover that quantization-induced noise disrupts directional estimation at each sampling step, further distorting the precise directional estimations of higher-order samplers when solving the sampling equations through discretized numerical methods, thereby altering the optimal sampling trajectory. To attain dual acceleration with high fidelity, we propose a sampling-aware quantization strategy, wherein a Mixed-Order Trajectory Alignment technique is devised to impose a more stringent constraint on the error bounds at each sampling step, facilitating a more linear probability flow. Extensive experiments on sparse-step fast sampling across multiple datasets demonstrate that our approach preserves the rapid convergence characteristics of high-speed samplers while maintaining superior generation quality. Code will be made publicly available soon.
View on arXiv@article{zeng2025_2505.02242, title={ Quantizing Diffusion Models from a Sampling-Aware Perspective }, author={ Qian Zeng and Jie Song and Yuanyu Wan and Huiqiong Wang and Mingli Song }, journal={arXiv preprint arXiv:2505.02242}, year={ 2025 } }