A collaborative constrained graph diffusion model for the generation of realistic synthetic molecules

Main: 17 pages, Bibliography: 3 pages, Appendix: 8 pages; 11 figures, 4 tables
Abstract

Developing new molecular compounds is crucial to addressing pressing challenges, from health to environmental sustainability. However, exploring molecular space to discover new molecules is difficult because the space is so vast. Here we introduce CoCoGraph, a collaborative and constrained graph diffusion model capable of generating molecules that are guaranteed to be chemically valid. Thanks to the constraints built into the model and to the collaborative mechanism, CoCoGraph outperforms state-of-the-art approaches on standard benchmarks while requiring up to an order of magnitude fewer parameters. Analysis of 36 chemical properties also demonstrates that CoCoGraph generates molecules whose property distributions match those of real molecules more closely than current models do. Leveraging the model's efficiency, we created a database of 8.2 million synthetically generated molecules and conducted a Turing-like test with organic chemistry experts to further assess the plausibility of the generated molecules, as well as potential biases and limitations of CoCoGraph.

@article{ruiz-botella2025_2505.16365,
  title={A collaborative constrained graph diffusion model for the generation of realistic synthetic molecules},
  author={Manuel Ruiz-Botella and Marta Sales-Pardo and Roger Guimerà},
  journal={arXiv preprint arXiv:2505.16365},
  year={2025}
}