FreSca: Scaling in Frequency Space Enhances Diffusion Models

2 April 2025

Abstract

Latent diffusion models (LDMs) have achieved remarkable success in a variety of image tasks, yet fine-grained, disentangled control over global structure versus fine detail remains challenging. This paper explores frequency-based control within latent diffusion models. We first systematically analyze frequency characteristics across pixel space, VAE latent space, and internal LDM representations. This analysis reveals that the "noise difference" term, derived from classifier-free guidance at each step t, is a uniquely effective and semantically rich target for manipulation. Building on this insight, we introduce FreSca, a novel plug-and-play framework that decomposes the noise difference into low- and high-frequency components and applies independent scaling factors to them via spatial or energy-based cutoffs. Notably, FreSca requires no model retraining or architectural change, offering model- and task-agnostic control. We demonstrate its versatility and effectiveness in improving generation quality and structural emphasis on multiple architectures (e.g., SD3, SDXL) and across applications including image generation, editing, depth estimation, and video synthesis, thereby unlocking a new dimension of expressive control within LDMs.
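As a rough illustration of the idea described in the abstract, the sketch below splits a noise-difference array into low- and high-frequency bands with a 2D FFT and rescales each band independently. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `scale_frequency_bands`, the parameter names `low_scale`, `high_scale`, and `cutoff`, the default values, and the square window around the zero frequency are all hypothetical. Only a spatial cutoff is shown; the energy-based cutoff mentioned in the abstract is not covered.

```python
import numpy as np


def scale_frequency_bands(delta, low_scale=1.0, high_scale=1.25, cutoff=4):
    """Rescale low and high spatial frequencies of a (C, H, W) array.

    All names and defaults here are illustrative assumptions.
    """
    h, w = delta.shape[-2:]
    # 2D spectrum over the spatial axes, shifted so the zero frequency
    # sits at index (h // 2, w // 2).
    spec = np.fft.fftshift(np.fft.fft2(delta, axes=(-2, -1)), axes=(-2, -1))

    # Hypothetical spatial cutoff: a square window centered on the zero
    # frequency marks the low-frequency band; everything else is "high".
    cy, cx = h // 2, w // 2
    low_mask = np.zeros((h, w), dtype=bool)
    low_mask[max(cy - cutoff, 0):cy + cutoff + 1,
             max(cx - cutoff, 0):cx + cutoff + 1] = True

    # Apply an independent real scaling factor to each band.
    scaled = np.where(low_mask, low_scale * spec, high_scale * spec)

    # Back to the spatial domain; the window is symmetric about the zero
    # frequency, so the imaginary part is only floating-point residue.
    out = np.fft.ifft2(np.fft.ifftshift(scaled, axes=(-2, -1)), axes=(-2, -1))
    return out.real
```

Under the standard classifier-free guidance formulation, where the guided prediction is the unconditional prediction plus a guidance weight times the difference between conditional and unconditional predictions, one plausible use would be to pass that difference through `scale_frequency_bands` before the weighted sum. Setting both scales to 1.0 leaves the input unchanged, which makes the function easy to sanity-check.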


@article{huang2025_2504.02154,
  title={FreSca: Scaling in Frequency Space Enhances Diffusion Models},
  author={Chao Huang and Susan Liang and Yunlong Tang and Jing Bi and Li Ma and Yapeng Tian and Chenliang Xu},
  journal={arXiv preprint arXiv:2504.02154},
  year={2025}
}
