CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation

29 April 2025

Jianyu Wu

Yizhou Wang

Xiangyu Yue

Xinzhu Ma

Jingyang Guo

Dongzhan Zhou

Wanli Ouyang

Shixiang Tang

Abstract

While accurate and user-friendly Computer-Aided Design (CAD) is crucial for industrial design and manufacturing, existing methods still fall short, either because their representations are over-simplified or because their architectures cannot support multimodal design requirements. In this paper, we tackle this problem from both the method and the dataset perspectives. First, we propose a cascade MAR with topology predictor (CMT), the first multimodal framework for CAD generation based on Boundary Representation (B-Rep). Specifically, the cascade MAR effectively captures the "edge-counters-surface" priors that are essential in B-Reps, while the topology predictor directly estimates the topology of B-Reps from the compact tokens in MAR. Second, to facilitate large-scale training, we develop a large-scale multimodal CAD dataset, mmABC, which contains over 1.3 million B-Rep models with multimodal annotations: point clouds, text descriptions, and multi-view images. Extensive experiments show the superiority of CMT in both conditional and unconditional CAD generation tasks. For example, in unconditional generation on ABC, we improve Coverage and Valid ratio by 10.68% and 10.3%, respectively, over state-of-the-art methods. CMT also improves Chamfer by 4.01 on image-conditioned CAD generation on mmABC. The dataset, code, and pretrained network will be released.
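
The abstract reports results in terms of Chamfer and Coverage without defining them. As a rough illustration only, the sketch below uses the definitions commonly found in the shape-generation literature: a symmetric Chamfer distance between sampled point clouds, and Coverage as the fraction of reference shapes that are the nearest neighbour of at least one generated shape. The function names, the use of squared distances, and the absence of any scaling factor are assumptions made for this example; the paper's exact evaluation protocol may differ.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumption: squared Euclidean distances, averaged per direction and summed.
    """
    # Pairwise squared distances via broadcasting, shape (N, M).
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.sum(diff * diff, axis=-1)
    # Mean nearest-neighbour distance from a to b, plus from b to a.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


def coverage(generated: list, reference: list) -> float:
    """Fraction of reference shapes matched by at least one generated shape.

    Each generated shape is assigned to its nearest reference shape
    under chamfer_distance (hypothetical helper defined above).
    """
    matched = set()
    for g in generated:
        dists = [chamfer_distance(g, r) for r in reference]
        matched.add(int(np.argmin(dists)))
    return len(matched) / len(reference)
```

In this formulation a lower Chamfer value means the generated geometry lies closer to the target, while a higher Coverage means the generated set spans more of the reference set. The brute-force pairwise computation is only practical for small point clouds; a spatial index would be needed at dataset scale.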

@article{wu2025_2504.20830,
  title={CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation},
  author={Jianyu Wu and Yizhou Wang and Xiangyu Yue and Xinzhu Ma and Jingyang Guo and Dongzhan Zhou and Wanli Ouyang and Shixiang Tang},
  journal={arXiv preprint arXiv:2504.20830},
  year={2025}
}
