LLM-based Generative Error Correction for Rare Words with Synthetic Data and Phonetic Context

23 May 2025

Main: 4 Pages

3 Figures

Bibliography: 1 Page

5 Tables

Abstract

Generative error correction (GER) with large language models (LLMs) has emerged as an effective post-processing approach to improve automatic speech recognition (ASR) performance. However, it often struggles with rare or domain-specific words because training data for them is limited. Furthermore, existing LLM-based GER approaches rely primarily on textual information and neglect phonetic cues, which leads to over-correction. To address these issues, we propose a novel LLM-based GER approach that targets rare words and incorporates phonetic information. First, we generate synthetic data containing rare words for fine-tuning the GER model. Second, we integrate the ASR system's N-best hypotheses along with phonetic context to mitigate over-correction. Experimental results show that our method not only improves the correction of rare words but also reduces the word error rate (WER) and character error rate (CER) on both English and Japanese datasets.
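
As an illustration of the second step, the sketch below assembles a correction prompt from N-best hypotheses paired with phonetic transcriptions. It is a minimal example under stated assumptions, not the authors' implementation: the `Hypothesis` class, the `build_ger_prompt` function, the prompt wording, and the phoneme notation are all hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One ASR hypothesis with a phonetic rendering (hypothetical structure)."""
    text: str
    phonemes: str  # assumed notation; the abstract does not name a phonetic format


def build_ger_prompt(hypotheses: List[Hypothesis]) -> str:
    """Build a text prompt listing each hypothesis next to its phonemes.

    The wording of the instruction lines is an assumption made for illustration.
    """
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")

    lines = [
        "Correct the ASR transcript using the N-best hypotheses below.",
        "Keep the correction phonetically close to the hypotheses.",
        "",
    ]
    # Rank 1 is the top-scoring hypothesis; phonetic context follows each one.
    for rank, hyp in enumerate(hypotheses, start=1):
        lines.append(f"Hypothesis {rank}: {hyp.text}")
        lines.append(f"Phonemes {rank}: {hyp.phonemes}")
    lines.append("")
    lines.append("Corrected transcript:")
    return "\n".join(lines)


if __name__ == "__main__":
    nbest = [
        Hypothesis("please call doctor kokubo", "P L IY Z K AO L D AA K T ER K OW K UW B OW"),
        Hypothesis("please call doctor cocoa bow", "P L IY Z K AO L D AA K T ER K OW K OW B OW"),
    ]
    print(build_ger_prompt(nbest))
```

Placing the phonemes directly under each hypothesis gives the model a cue about which rewrites would change the pronunciation, which is the kind of signal the abstract credits with reducing over-correction.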

@article{yamashita2025_2505.17410,
  title={LLM-based Generative Error Correction for Rare Words with Synthetic Data and Phonetic Context},
  author={Natsuo Yamashita and Masaaki Yamamoto and Hiroaki Kokubo and Yohei Kawaguchi},
  journal={arXiv preprint arXiv:2505.17410},
  year={2025}
}