
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations

Abstract

LLMs often adopt an assertive language style even when making false claims. Such ``overconfident hallucinations'' mislead users and erode trust. The ability to express in language the actual degree of uncertainty around a claim is therefore of great importance. We find that ``verbal uncertainty'' is governed by a single linear feature in the representation space of LLMs, and show that it correlates only moderately with the model's actual ``semantic uncertainty''. We apply this insight and show that (1) the mismatch between semantic and verbal uncertainty is a better predictor of hallucinations than semantic uncertainty alone, and (2) we can intervene on verbal uncertainty at inference time to reduce confident hallucinations on short-form answers, achieving an average relative reduction of ~30%.
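The abstract describes intervening on a single linear feature at inference time. A minimal sketch of such a linear-feature intervention, shifting a hidden state along a learned direction, might look like the following; the function and parameter names (`steer`, `alpha`) are illustrative assumptions, not from the paper:

```python
# Hedged sketch: shift a hidden state along a (hypothetical) learned
# "verbal uncertainty" direction at inference time. The direction would
# come from the paper's probing procedure; here it is random for demo.
import numpy as np

def steer(hidden_state: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Add alpha times the unit-normalized direction to the hidden state."""
    unit = direction / np.linalg.norm(direction)
    return hidden_state + alpha * unit

rng = np.random.default_rng(0)
h = rng.normal(size=8)   # stand-in for a residual-stream activation
v = rng.normal(size=8)   # stand-in for the verbal-uncertainty direction

h_steered = steer(h, v, alpha=2.0)

# The projection onto the direction moves by exactly alpha.
unit = v / np.linalg.norm(v)
print(round(float((h_steered - h) @ unit), 6))  # 2.0
```

In practice such a vector would be added to the model's activations at a chosen layer during generation; the sketch only shows the arithmetic of the intervention.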

@article{ji2025_2503.14477,
  title={Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations},
  author={Ziwei Ji and Lei Yu and Yeskendir Koishekenov and Yejin Bang and Anthony Hartshorn and Alan Schelten and Cheng Zhang and Pascale Fung and Nicola Cancedda},
  journal={arXiv preprint arXiv:2503.14477},
  year={2025}
}
