A probabilistic foundation model for crystal structure denoising, phase classification, and order parameters

Hyuna Kwon
Babak Sadigh
Sebastien Hamel
Vincenzo Lordi
John Klepeis
Fei Zhou
Main: 27 Pages
12 Figures
Bibliography: 5 Pages
1 Table
Appendix: 4 Pages
Abstract

Atomistic simulations generate large volumes of noisy structural data, but extracting phase labels, order parameters (OPs), and defect information in a way that is universal, robust, and interpretable remains challenging. Existing tools such as PTM and CNA are restricted to a small set of hand-crafted lattices (e.g., FCC/BCC/HCP), degrade under strong thermal disorder or defects, and produce hard, template-based labels without per-atom probability or confidence scores. Here we introduce a log-probability foundation model that unifies denoising, phase classification, and OP extraction within a single probabilistic framework. We reuse the MACE-MP foundation interatomic potential on crystal structures mapped to AFLOW prototypes, training it to predict per-atom, per-phase logits $l$ and to aggregate them into a global log-density $\log \hat{P}_\theta(\boldsymbol{r})$ whose gradient defines a conservative score field. Denoising corresponds to gradient ascent on this learned log-density, phase labels follow from $\arg\max_c l_{ac}$, and the $l$ values act as continuous, defect-sensitive, and interpretable OPs quantifying the Euclidean distance to ideal phases. We demonstrate universality across hundreds of prototypes, robustness under strong thermal and defect-induced disorder, and accurate treatment of complex systems such as ice polymorphs, ice-water interfaces, and shock-compressed Ti.
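The pipeline described in the abstract (per-atom, per-phase logits, aggregation into a global log-density, gradient-ascent denoising, and argmax labeling) can be illustrated with a toy sketch. This is not the authors' model: the Gaussian template logits below stand in for the MACE-MP-based network, the log-sum-exp aggregation over phases is an assumption (the abstract does not state the aggregation rule), and all names (`logits`, `log_density`, `score`, `denoise`, `templates`, `sigma`) are hypothetical.

```python
import numpy as np

def logits(r, templates, sigma=0.1):
    """Toy per-atom, per-phase logits l_ac, shape (N, C).

    ASSUMPTION: l_ac = -|r_a - t_ca|^2 / (2 sigma^2), where t_ca is the ideal
    site of atom a in phase c. The paper uses a learned network instead.
    """
    d2 = np.sum((r[None, :, :] - templates) ** 2, axis=-1)  # (C, N)
    return (-d2 / (2.0 * sigma**2)).T

def log_density(r, templates, sigma=0.1):
    """ASSUMED aggregation: log P(r) = sum_a logsumexp_c l_ac."""
    l = logits(r, templates, sigma)
    m = l.max(axis=1, keepdims=True)  # subtract the max for numerical stability
    return float(np.sum(m[:, 0] + np.log(np.sum(np.exp(l - m), axis=1))))

def score(r, templates, sigma=0.1):
    """Analytic gradient of log_density with respect to r, shape (N, 3).

    Being the gradient of a scalar function, this field is conservative.
    """
    l = logits(r, templates, sigma)
    w = np.exp(l - l.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)             # softmax over phases, (N, C)
    grad_l = -(r[None, :, :] - templates) / sigma**2  # d l_ac / d r_a, (C, N, 3)
    return np.einsum("nc,cnd->nd", w, grad_l)

def denoise(r, templates, sigma=0.1, n_steps=50):
    """Plain gradient ascent on the toy log-density."""
    step = 0.5 * sigma**2  # each step moves halfway to the weighted ideal site
    r = r.copy()
    for _ in range(n_steps):
        r = r + step * score(r, templates, sigma)
    return r

# Two hypothetical "phases": a small cubic grid and a stretched, shifted copy.
grid = np.array([[i, j, k] for i in range(2) for j in range(2) for k in range(2)],
                dtype=float)
templates = np.stack([grid, 1.3 * grid + 0.4])  # (C=2, N=8, 3)

rng = np.random.default_rng(0)
r_noisy = templates[0] + 0.03 * rng.standard_normal(grid.shape)

r_clean = denoise(r_noisy, templates)
labels = np.argmax(logits(r_noisy, templates), axis=1)  # per-atom phase labels
```

In this toy setting the logits are negative scaled squared distances to the ideal sites, so they play the role of continuous order parameters: values near zero indicate an atom close to the ideal site of that phase, and strongly negative values indicate a poor match.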
