A key factor for lunar mission planning is the ability to assess the local availability of raw materials. However, many potentially relevant measurements are scattered across a variety of scientific publications. In this paper we consider the viability of obtaining lunar composition data by using large language models (LLMs) to rapidly process a corpus of scientific publications. While using LLMs to extract knowledge from scientific documents is not new, this particular application presents interesting challenges due to the heterogeneity of lunar samples and the nuances involved in their characterization. Accuracy and uncertainty quantification are particularly crucial, since many materials properties can be sensitive to small variations in composition. Our findings indicate that off-the-shelf LLMs are generally effective at extracting data from the tables commonly found in these documents. However, there remains room to refine the data extracted by this initial approach, in particular to capture fine-grained mineralogical information and to improve performance on subtler or more complex pieces of information.
@article{pekala2025_2504.20125,
  title={Towards Large Language Models for Lunar Mission Planning and In Situ Resource Utilization},
  author={Michael Pekala and Gregory Canal and Samuel Barham and Milena B. Graziano and Morgan Trexler and Leslie Hamilton and Elizabeth Reilly and Christopher D. Stiles},
  journal={arXiv preprint arXiv:2504.20125},
  year={2025}
}