Recent studies show that simple training-free techniques, e.g., Classifier-Free Guidance (CFG) and FreeU, can dramatically improve the quality of text-to-2D generation outputs. However, these training-free techniques remain underexplored through the lens of Score Distillation Sampling (SDS), a popular and effective technique for leveraging pretrained text-to-2D diffusion models across various tasks. In this paper, we aim to shed light on the effect such training-free techniques have on SDS, focusing on the application of text-to-3D generation via 2D lifting. Our findings show that varying the scale of CFG presents a trade-off between object size and surface smoothness, while varying the scale of FreeU presents a trade-off between texture detail and geometric errors. Based on these findings, we provide insights into how to effectively harness training-free techniques for SDS by scaling them dynamically with respect to the timestep or optimization iteration. We show that our proposed scheme strikes a favorable balance between texture detail and surface smoothness in text-to-3D generation, while preserving the size of the output and mitigating geometric defects.
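To make the interplay between CFG and SDS concrete, the sketch below shows one SDS gradient step with classifier-free guidance and a guidance scale that varies over the optimization. The linear schedule, the function names, and the diffusers-style UNet call are illustrative assumptions for exposition, not the exact scheme or implementation proposed in the paper.

import torch

def cfg_scale_schedule(step, max_steps, s_start=100.0, s_end=7.5):
    # Hypothetical linear decay of the CFG scale over optimization iterations;
    # the paper advocates dynamic scaling, but this particular schedule is an assumption.
    frac = step / max_steps
    return s_start + frac * (s_end - s_start)

def sds_grad(unet, x_t, t, noise, text_emb, null_emb, guidance_scale, w_t=1.0):
    # One SDS gradient step with classifier-free guidance (CFG).
    with torch.no_grad():
        eps_cond = unet(x_t, t, encoder_hidden_states=text_emb).sample
        eps_uncond = unet(x_t, t, encoder_hidden_states=null_emb).sample
    # CFG: extrapolate the conditional prediction away from the unconditional one.
    eps_hat = eps_uncond + guidance_scale * (eps_cond - eps_uncond)
    # SDS gradient: w(t) * (predicted noise - injected noise), to be
    # back-propagated through the differentiable renderer to the 3D parameters.
    return w_t * (eps_hat - noise)

In use, guidance_scale would be set each iteration via cfg_scale_schedule (or an analogous timestep-dependent rule), so that the trade-off between object size and surface smoothness can be balanced over the course of optimization.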
@article{lee2025_2505.19868,
  title={Harnessing the Power of Training-Free Techniques in Text-to-2D Generation for Text-to-3D Generation via Score Distillation Sampling},
  author={Junhong Lee and Seungwook Kim and Minsu Cho},
  journal={arXiv preprint arXiv:2505.19868},
  year={2025}
}