Intelligibility of Text-to-Speech Systems for Mathematical Expressions

5 June 2025

Main: 4 pages

2 figures

Bibliography: 1 page

4 tables

Abstract

Advanced Text-to-Speech (TTS) models have received limited evaluation with Mathematical eXpressions (MX) as inputs. In this work, we design experiments to evaluate the quality and intelligibility of five TTS models through listening and transcription tests across various categories of MX. Because TTS models cannot process LaTeX directly, we use two Large Language Models (LLMs) to generate English pronunciations from LaTeX MX. We use the Mean Opinion Score from user ratings and quantify intelligibility through transcription correctness using three metrics. We also compare listener preference for TTS outputs against a human expert's rendition of the same MX. The results establish that TTS output for MX is not necessarily intelligible, and that the gap in intelligibility varies across TTS models and MX categories. For most categories, TTS models perform significantly worse than the expert rendition. The choice of LLM has limited effect. These findings establish the need to improve TTS models for MX.
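As a rough illustration of the two kinds of measurement described above, the sketch below computes a Mean Opinion Score from listener ratings and scores a listener's transcription against a reference pronunciation. The abstract does not name the three transcription metrics, so word error rate is used here purely as an assumed stand-in; the 1-to-5 rating scale, the function names, and the sample strings are likewise hypothetical.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average listener rating (assumed here to be on a 1-5 scale)."""
    return mean(ratings)


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    Assumed stand-in metric: the abstract does not say which three
    transcription-correctness metrics the paper uses.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Standard Levenshtein distance over words, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one listener mishears "squared" as "square".
reference = "x squared plus one"
transcript = "x square plus one"
print(mean_opinion_score([4, 5, 3, 4]))
print(word_error_rate(reference, transcript))
```

A lower word error rate means the transcription is closer to the intended pronunciation, while a higher Mean Opinion Score means listeners rated the audio more favourably. The two need not move together, which is consistent with reporting quality and intelligibility separately.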


@article{roychowdhury2025_2506.11086,
  title={Intelligibility of Text-to-Speech Systems for Mathematical Expressions},
  author={Sujoy Roychowdhury and H. G. Ranjani and Sumit Soman and Nishtha Paul and Subhadip Bandyopadhyay and Siddhanth Iyengar},
  journal={arXiv preprint arXiv:2506.11086},
  year={2025}
}
