
EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion

Abstract

We present EchoScene, an interactive and controllable generative model that generates 3D indoor scenes conditioned on scene graphs. EchoScene leverages a dual-branch diffusion model that dynamically adapts to scene graphs. Existing methods struggle to handle scene graphs due to varying numbers of nodes, multiple edge combinations, and manipulator-induced node-edge operations. EchoScene overcomes this by associating each node with a denoising process and enabling collaborative information exchange among these processes, enhancing controllable and consistent generation that is aware of global constraints. This is achieved through an information echo scheme in both the shape and layout branches. At every denoising step, all processes share their denoising data with an information exchange unit that combines these updates using graph convolution. The scheme ensures that the denoising processes are influenced by a holistic understanding of the scene graph, facilitating the generation of globally coherent scenes. The resulting scenes can be manipulated during inference by editing the input scene graph and resampling the noise in the diffusion model. Extensive experiments validate our approach, which maintains scene controllability and surpasses previous methods in generation fidelity. Moreover, the generated scenes are of high quality and thus directly compatible with off-the-shelf texture generation. Code and trained models are open-sourced.
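The echo scheme described above can be illustrated with a minimal sketch: one denoising process per node, where at every step each process "echoes" its current latent through a shared graph-convolution unit before continuing. All names, the mean-aggregation graph convolution, and the toy shrink-toward-echo update are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def graph_conv(latents, edges):
    # Simplified graph convolution: mean-aggregate each node's latent
    # with its neighbors' latents (illustrative stand-in for the
    # learned information exchange unit).
    agg = latents.copy()
    counts = np.ones(len(latents))
    for i, j in edges:
        agg[i] += latents[j]
        agg[j] += latents[i]
        counts[i] += 1
        counts[j] += 1
    return agg / counts[:, None]

def echo_denoise(latents, edges, steps=10):
    # One denoising process per scene-graph node. At every step, all
    # processes share their latents through the graph conv "echo",
    # then apply a toy update pulling each latent toward the echo
    # (a placeholder for the real per-node denoiser).
    x = latents
    for _ in range(steps):
        echo = graph_conv(x, edges)   # information exchange unit
        x = x - 0.5 * (x - echo)      # placeholder denoising step
    return x

# Hypothetical three-node scene graph: sofa--table, table--lamp.
x0 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
edges = [(0, 1), (1, 2)]
out = echo_denoise(x0, edges)
```

Because every per-node update sees an aggregate of the whole graph, the node latents drift toward a globally consistent configuration rather than being denoised in isolation.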

@article{zhai2025_2405.00915,
  title={EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion},
  author={Guangyao Zhai and Evin Pınar Örnek and Dave Zhenyu Chen and Ruotong Liao and Yan Di and Nassir Navab and Federico Tombari and Benjamin Busam},
  journal={arXiv preprint arXiv:2405.00915},
  year={2025}
}