Reimagining Target-Aware Molecular Generation through Retrieval-Enhanced Aligned Diffusion

17 June 2025

Main: 9 Pages

5 Figures

Bibliography: 4 Pages

5 Tables

Abstract

Breakthroughs in high-accuracy protein structure prediction, such as AlphaFold, have established receptor-based molecule design as a critical driver for rapid early-phase drug discovery. However, most approaches still struggle to balance pocket-specific geometric fit with strict valence and synthetic constraints. To resolve this trade-off, Retrieval-Enhanced Aligned Diffusion (READ) is introduced, the first method to merge molecular Retrieval-Augmented Generation with an SE(3)-equivariant diffusion model. Specifically, a contrastively pre-trained encoder aligns atom-level representations during training and, at inference, retrieves graph embeddings of pocket-matched scaffolds to guide each reverse-diffusion step. This single mechanism injects real-world chemical priors exactly where they are needed, producing valid, diverse, and shape-complementary ligands. Experimental results show that READ achieves highly competitive performance on CBGBench, surpassing state-of-the-art generative models and even native ligands. These results suggest that retrieval and diffusion can be co-optimized for faster, more reliable structure-based drug design.
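
The retrieval step described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the function names, the toy three-dimensional embeddings, the cosine-similarity ranking, and the mean-pooling of retrieved embeddings into a single conditioning vector are all assumptions made for illustration. In READ the embeddings would come from the contrastively pre-trained encoder, and the conditioning signal would feed an SE(3)-equivariant denoiser, neither of which is modeled here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_top_k(pocket_emb, scaffold_bank, k=2):
    """Return the k (name, embedding) pairs most similar to the pocket embedding.

    `scaffold_bank` is a hypothetical dict mapping scaffold names to embeddings.
    """
    ranked = sorted(
        scaffold_bank.items(),
        key=lambda item: cosine(pocket_emb, item[1]),
        reverse=True,
    )
    return ranked[:k]


def guidance_vector(hits):
    """Mean-pool retrieved embeddings into one conditioning vector (an assumed choice)."""
    dim = len(hits[0][1])
    return [sum(emb[i] for _, emb in hits) / len(hits) for i in range(dim)]


# Toy data: hypothetical scaffold embeddings and a hypothetical pocket embedding.
bank = {
    "scaffold_a": [1.0, 0.0, 0.0],
    "scaffold_b": [0.8, 0.6, 0.0],
    "scaffold_c": [0.0, 0.0, 1.0],
}
pocket = [1.0, 0.1, 0.0]

hits = retrieve_top_k(pocket, bank, k=2)
cond = guidance_vector(hits)  # would be passed to the denoiser at each reverse step
```

In this toy setup the two scaffolds closest to the pocket embedding are selected, and their pooled embedding stands in for the signal that would condition each reverse-diffusion step.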


@article{xu2025_2506.14488,
  title={Reimagining Target-Aware Molecular Generation through Retrieval-Enhanced Aligned Diffusion},
  author={Dong Xu and Zhangfan Yang and Ka-chun Wong and Zexuan Zhu and Jiangqiang Li and Junkai Ji},
  journal={arXiv preprint arXiv:2506.14488},
  year={2025}
}
