
Does Less Hallucination Mean Less Creativity? An Empirical Investigation in LLMs

Mohor Banerjee
Nadya Yuki Wangsajaya
Syed Ali Redha Alsagoff
Min Sen Tan
Zachary Choy Kit Chun
Alvin Chan Guo Wei
Main: 6 pages · Appendix: 4 pages · Bibliography: 2 pages · 7 figures · 1 table
Abstract

Large Language Models (LLMs) exhibit remarkable capabilities in natural language understanding and reasoning, but suffer from hallucination: the generation of factually incorrect content. While numerous methods have been developed to reduce hallucinations, their impact on creative generation remains unexplored. This gap is particularly critical for AI-assisted scientific discovery, which requires both factual accuracy and creative hypothesis generation. We investigate how three hallucination-reduction techniques, Chain of Verification (CoVe), Decoding by Contrasting Layers (DoLa), and Retrieval-Augmented Generation (RAG), affect creativity in LLMs. Evaluating multiple model families (LLaMA, Qwen, Mistral) at varying scales (1B to 70B parameters) on two creativity benchmarks (NeoCoder and CS4), we find that these methods have opposing effects on divergent creativity: CoVe enhances divergent thinking, DoLa suppresses it, and RAG shows minimal impact. Our findings provide guidance for selecting appropriate hallucination-reduction methods in scientific applications, where the balance between factual accuracy and creative exploration is crucial.
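For readers unfamiliar with the first of the techniques named above, the sketch below illustrates a generic Chain-of-Verification-style loop (draft, plan verification questions, answer them independently, revise). It is a minimal sketch under assumptions: the `generate` stub, the prompts, and the four-stage structure are placeholders for the general CoVe recipe, not this paper's actual experimental implementation.

```python
# Hedged sketch of a Chain-of-Verification (CoVe)-style pipeline.
# `generate` is a placeholder for any LLM call (API or local model);
# it is a stub here only so the example runs end to end.

def generate(prompt: str) -> str:
    """Stub LLM call; replace with a real model or API invocation."""
    return f"<model output for: {prompt[:60]}...>"

def chain_of_verification(query: str) -> str:
    # 1. Draft an initial (possibly hallucinated) response.
    draft = generate(f"Answer the question.\nQuestion: {query}")

    # 2. Plan verification questions probing the draft's factual claims.
    plan = generate(
        "List short verification questions for the factual claims below.\n"
        f"Claims: {draft}"
    )
    questions = [q.strip() for q in plan.splitlines() if q.strip()]

    # 3. Answer each verification question independently of the draft,
    #    so errors in the draft do not bias the checks.
    answers = [generate(f"Answer concisely: {q}") for q in questions]

    # 4. Revise the draft in light of the verification answers.
    evidence = "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))
    return generate(
        "Rewrite the draft answer so it is consistent with the verified facts.\n"
        f"Question: {query}\nDraft: {draft}\nVerified facts:\n{evidence}"
    )

if __name__ == "__main__":
    print(chain_of_verification("Name a novel that won the 1954 Hugo Award."))
```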
