
UniGenX: Unified Generation of Sequence and Structure with Autoregressive Diffusion

9 March 2025
Gongbo Zhang
Yanting Li
Renqian Luo
Pipi Hu
Zeru Zhao
Lingbo Li
Guoqing Liu
Zun Wang
Ran Bi
Kaiyuan Gao
Liya Guo
Yu Xie
Chang Liu
Jia Zhang
Tian Xie
Robert Pinsler
Claudio Zeni
Ziheng Lu
Yingce Xia
Marwin H. S. Segler
Maik Riechert
Li Yuan
Lei Chen
Haiguang Liu
Tao Qin
Abstract

Unified generation of sequence and structure for scientific data (e.g., materials, molecules, proteins) is a critical task. Existing approaches primarily rely on either autoregressive sequence models or diffusion models, each offering distinct advantages and facing notable limitations. Autoregressive models, such as GPT, Llama, and Phi-4, have demonstrated remarkable success in natural language generation and have been extended to multimodal tasks (e.g., image, video, and audio) using advanced encoders like VQ-VAE to represent complex modalities as discrete sequences. However, their direct application to scientific domains is challenging due to the high precision requirements and the diverse nature of scientific data. On the other hand, diffusion models excel at generating high-dimensional scientific data, such as protein, molecule, and material structures, with remarkable accuracy. Yet, their inability to effectively model sequences limits their potential as general-purpose multimodal foundation models. To address these challenges, we propose UniGenX, a unified framework that combines autoregressive next-token prediction with conditional diffusion models. This integration leverages the strengths of autoregressive models to ease the training of conditional diffusion models, while diffusion-based generative heads enhance the precision of autoregressive predictions. We validate the effectiveness of UniGenX on material and small molecule generation tasks, achieving a significant leap in state-of-the-art performance for material crystal structure prediction and establishing new state-of-the-art results for small molecule structure prediction, de novo design, and conditional generation. Notably, UniGenX demonstrates significant improvements, especially in handling long sequences for complex structures, showcasing its efficacy as a versatile tool for scientific data generation.
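The pairing described in the abstract, an autoregressive backbone whose per-position context conditions a diffusion-based head for continuous values, can be illustrated with a toy training loss. This is a minimal sketch under stated assumptions and not the authors' implementation: the running-mean "backbone", the linear denoiser, and the names `causal_context`, `diffusion_head_loss`, and `weights` are all hypothetical stand-ins, and the noising step follows the standard DDPM formulation rather than anything specified in the abstract.

```python
import math
import random


def causal_context(embeddings):
    """Toy stand-in for an autoregressive backbone (assumption, not UniGenX).

    The context at position i is the running mean of embeddings 0..i,
    so it never depends on later positions.
    """
    contexts = []
    running = [0.0] * len(embeddings[0])
    for i, emb in enumerate(embeddings):
        running = [r + e for r, e in zip(running, emb)]
        contexts.append([r / (i + 1) for r in running])
    return contexts


def diffusion_head_loss(hidden, target, weights, alpha_bar, rng):
    """DDPM-style noise-prediction loss for one continuous target
    (for example a 3D coordinate), conditioned on the context `hidden`.

    The denoiser here is a single linear map, chosen only for brevity.
    """
    eps = [rng.gauss(0.0, 1.0) for _ in target]
    # Forward noising: x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps
    noisy = [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
        for x, e in zip(target, eps)
    ]
    # The head sees the context, the noisy target, and the noise level.
    features = hidden + noisy + [alpha_bar]
    eps_hat = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return sum((p - e) ** 2 for p, e in zip(eps_hat, eps)) / len(eps)


rng = random.Random(0)
seq_len, d_model, d_coord = 5, 8, 3

# Hypothetical inputs: token embeddings and one continuous target per position.
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(d_model)] for _ in range(seq_len)]
coords = [[rng.gauss(0.0, 1.0) for _ in range(d_coord)] for _ in range(seq_len)]
weights = [
    [0.1 * rng.gauss(0.0, 1.0) for _ in range(d_model + d_coord + 1)]
    for _ in range(d_coord)
]

hidden = causal_context(embeddings)

# Next-element objective: the context at position i conditions the
# denoising of the continuous value at position i + 1.
losses = [
    diffusion_head_loss(hidden[i], coords[i + 1], weights, rng.uniform(0.05, 0.95), rng)
    for i in range(seq_len - 1)
]
total_loss = sum(losses) / len(losses)
```

In a real system the running mean would be replaced by a causal transformer and the linear map by a learned denoising network. The sketch only shows how the two objectives can share one left-to-right pass.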

@article{zhang2025_2503.06687,
  title={UniGenX: Unified Generation of Sequence and Structure with Autoregressive Diffusion},
  author={Gongbo Zhang and Yanting Li and Renqian Luo and Pipi Hu and Zeru Zhao and Lingbo Li and Guoqing Liu and Zun Wang and Ran Bi and Kaiyuan Gao and Liya Guo and Yu Xie and Chang Liu and Jia Zhang and Tian Xie and Robert Pinsler and Claudio Zeni and Ziheng Lu and Yingce Xia and Marwin Segler and Maik Riechert and Li Yuan and Lei Chen and Haiguang Liu and Tao Qin},
  journal={arXiv preprint arXiv:2503.06687},
  year={2025}
}