
ChemAU: Harness the Reasoning of LLMs in Chemical Research with Adaptive Uncertainty Estimation

Main: 14 pages · Bibliography: 3 pages · 15 figures · 2 tables
Abstract

Large Language Models (LLMs) are widely used across various scenarios due to their strong reasoning capabilities and natural language understanding. While LLMs perform well on mathematics and coding tasks, their effectiveness diminishes significantly on chemistry-related problems. Chemistry problems typically involve long and complex reasoning chains that rely on domain-specific terminology, including specialized symbol systems and intricate nomenclature conventions. These characteristics often cause general-purpose LLMs to hallucinate during reasoning because they lack the required domain knowledge. However, existing methods struggle to effectively leverage chemical expertise and formulas. Moreover, current uncertainty estimation methods, designed to mitigate potential reasoning errors, cannot precisely identify the specific steps or key knowledge at fault. In this work, we propose a novel framework called ChemAU, which incorporates an adaptive uncertainty estimation method that assigns different uncertainty values to reasoning steps depending on their position within the overall reasoning chain. Leveraging this method, ChemAU identifies gaps in chemistry knowledge and precisely supplements chemical expertise with a specialized domain model, thereby correcting and updating the previously flawed reasoning chain. Our experiments with three popular LLMs across three chemistry datasets demonstrate that ChemAU significantly improves both reasoning accuracy and uncertainty estimation.
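To make the core idea concrete, the following is a minimal sketch of position-dependent uncertainty scoring over a reasoning chain. It is not the authors' implementation: the per-step score (mean negative log-probability), the linear position weight controlled by `alpha`, and the flagging `threshold` are illustrative assumptions; ChemAU's actual weighting scheme and routing to the domain model are defined in the paper.

```python
def step_uncertainty(token_logprobs):
    """Per-step uncertainty as the mean negative log-probability of the step's
    tokens (a common proxy; the paper may use a different per-step score)."""
    return -sum(token_logprobs) / max(len(token_logprobs), 1)

def adaptive_uncertainty(steps_logprobs, alpha=0.5):
    """Position-weighted uncertainty over a reasoning chain.

    steps_logprobs: list of token log-prob lists, one per reasoning step.
    alpha: hypothetical parameter controlling how strongly the weight grows
           with step depth (illustrative only).
    Returns per-step adjusted scores and a chain-level aggregate.
    """
    n = len(steps_logprobs)
    adjusted = []
    for i, logprobs in enumerate(steps_logprobs):
        base = step_uncertainty(logprobs)
        weight = 1.0 + alpha * (i / max(n - 1, 1))  # later steps weighted more
        adjusted.append(weight * base)
    return adjusted, sum(adjusted) / max(n, 1)

def flag_uncertain_steps(adjusted_scores, threshold=1.0):
    """Indices of steps whose adjusted uncertainty exceeds a threshold; such
    steps would be routed to a chemistry domain model for correction."""
    return [i for i, s in enumerate(adjusted_scores) if s > threshold]
```

In this sketch, flagged steps are the ones a framework like ChemAU would revisit with domain expertise before re-running the remainder of the chain.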

@article{liu2025_2506.01116,
  title={ChemAU: Harness the Reasoning of LLMs in Chemical Research with Adaptive Uncertainty Estimation},
  author={Xinyi Liu and Lipeng Ma and Yixuan Li and Weidong Yang and Qingyuan Zhou and Jiayi Song and Shuhao Li and Ben Fei},
  journal={arXiv preprint arXiv:2506.01116},
  year={2025}
}