Calm-Whisper: Reduce Whisper Hallucination On Non-Speech By Calming Crazy Heads Down

19 May 2025
Yingzhi Wang
Anas Alhmoud
Saad Alsahly
Muhammad Alqurishi
Mirco Ravanelli
Abstract

OpenAI's Whisper has achieved significant success in Automatic Speech Recognition. However, it has consistently been found to exhibit hallucination issues, particularly in non-speech segments, which limits its broader application in complex industrial settings. In this paper, we introduce a novel method to reduce Whisper's hallucination on non-speech segments without using any pre- or post-processing techniques. Specifically, we benchmark the contribution of each self-attention head in the Whisper-large-v3 decoder to the hallucination problem by applying a head-wise mask. Our findings reveal that only 3 of the 20 heads account for over 75% of the hallucinations on the UrbanSound dataset. We then fine-tune these three "crazy" heads using a collection of non-speech data. The results show that our best fine-tuned model, Calm-Whisper, achieves over 80% reduction in non-speech hallucination with less than 0.1% WER degradation on LibriSpeech test-clean and test-other.
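The head-wise mask mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' code and does not use Whisper itself: it is a minimal NumPy causal self-attention layer with 20 heads, in which a 0/1 vector (here called `head_mask`, a hypothetical name) zeroes the output of selected heads before the output projection. The function name, the weight matrices, the toy dimensions, and the choice to mask at this exact point are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; -inf entries become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_self_attention(x, w_q, w_k, w_v, w_o, n_heads, head_mask):
    """Toy causal multi-head self-attention with a per-head 0/1 mask.

    x:         (seq, d_model) input activations
    w_*:       (d_model, d_model) projection matrices (no biases)
    head_mask: (n_heads,) array; 0 silences a head, 1 keeps it
    """
    seq, d_model = x.shape
    d_head = d_model // n_heads

    def split(t):
        # (seq, d_model) -> (n_heads, seq, d_head)
        return t.reshape(seq, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)

    # Causal mask: a position may not attend to later positions.
    future = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    ctx = softmax(scores) @ v                  # (n_heads, seq, d_head)
    ctx = ctx * head_mask[:, None, None]       # head-wise mask (assumed placement)
    merged = ctx.transpose(1, 0, 2).reshape(seq, d_model)
    return merged @ w_o

# Toy setup: 20 heads as in the abstract, but tiny made-up dimensions.
rng = np.random.default_rng(0)
n_heads, d_model, seq = 20, 40, 5
x = rng.normal(size=(seq, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))

full = masked_self_attention(x, w_q, w_k, w_v, w_o, n_heads, np.ones(n_heads))

# Mask one head at a time and record how far the layer output moves.
for h in range(n_heads):
    mask = np.ones(n_heads)
    mask[h] = 0.0
    out = masked_self_attention(x, w_q, w_k, w_v, w_o, n_heads, mask)
    print(f"head {h:2d}: output change = {np.linalg.norm(full - out):.3f}")
```

In a benchmark of the kind the abstract describes, the quantity measured per masked head would be the hallucination rate on non-speech audio rather than the output norm used here; the loop only shows the one-head-at-a-time masking pattern.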

@article{wang2025_2505.12969,
  title={Calm-Whisper: Reduce Whisper Hallucination On Non-Speech By Calming Crazy Heads Down},
  author={Yingzhi Wang and Anas Alhmoud and Saad Alsahly and Muhammad Alqurishi and Mirco Ravanelli},
  journal={arXiv preprint arXiv:2505.12969},
  year={2025}
}