A Survey on Hypothesis Generation for Scientific Discovery in the Era of Large Language Models

7 April 2025

Abstract

Hypothesis generation is a fundamental step in scientific discovery, yet it is increasingly challenged by information overload and disciplinary fragmentation. Recent advances in Large Language Models (LLMs) have sparked growing interest in their potential to enhance and automate this process. This paper presents a comprehensive survey of hypothesis generation with LLMs by (i) reviewing existing methods, from simple prompting techniques to more complex frameworks, and proposing a taxonomy that categorizes these approaches; (ii) analyzing techniques for improving hypothesis quality, such as novelty boosting and structured reasoning; (iii) providing an overview of evaluation strategies; and (iv) discussing key challenges and future directions, including multimodal integration and human-AI collaboration. Our survey aims to serve as a reference for researchers exploring LLMs for hypothesis generation.

@article{alkan2025_2504.05496,
  title={A Survey on Hypothesis Generation for Scientific Discovery in the Era of Large Language Models},
  author={Atilla Kaan Alkan and Shashwat Sourav and Maja Jablonska and Simone Astarita and Rishabh Chakrabarty and Nikhil Garuda and Pranav Khetarpal and Maciej Pióro and Dimitrios Tanoglidis and Kartheik G. Iyer and Mugdha S. Polimera and Michael J. Smith and Tirthankar Ghosal and Marc Huertas-Company and Sandor Kruk and Kevin Schawinski and Ioana Ciucă},
  journal={arXiv preprint arXiv:2504.05496},
  year={2025}
}
