Coarse-to-Real: Generative Rendering for Populated Dynamic Scenes

Gonzalo Gomez-Nogales
Yicong Hong
Chongjian Ge
Marc Comino-Trinidad
Dan Casas
Yi Zhou
Main: 7 pages · 15 figures · 1 table · Bibliography: 2 pages · Appendix: 2 pages
Abstract

Traditional rendering pipelines rely on complex assets, accurate materials and lighting, and substantial computational resources to produce realistic imagery, yet they still face challenges in scalability and realism for populated dynamic scenes. We present C2R (Coarse-to-Real), a generative rendering framework that synthesizes real-style urban crowd videos from coarse 3D simulations. Our approach uses coarse 3D renderings to explicitly control scene layout, camera motion, and human trajectories, while a learned neural renderer generates realistic appearance, lighting, and fine-scale dynamics guided by text prompts. To overcome the lack of paired training data between coarse simulations and real videos, we adopt a two-phase mixed CG-real training strategy that learns a strong generative prior from large-scale real footage and introduces controllability through shared implicit spatio-temporal features across domains. The resulting system supports coarse-to-fine control, generalizes across diverse CG and game inputs, and produces temporally consistent, controllable, and realistic urban scene videos from minimal 3D input. We will release the model and project webpage at this https URL.
