Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits

20 May 2025
Tiantian Feng
Jihwan Lee
Anfeng Xu
Yoonjeong Lee
Thanathai Lertpetchpun
Xuan Shi
Helin Wang
Thomas Thebaud
Laureano Moro-Velázquez
Dani Byrd
Najim Dehak
Shrikanth Narayanan
Abstract

We introduce Vox-Profile, a comprehensive benchmark to characterize rich speaker and speech traits using speech foundation models. Unlike existing works that focus on a single dimension of speaker traits, Vox-Profile provides holistic and multi-dimensional profiles that reflect both static speaker traits (e.g., age, sex, accent) and dynamic speech properties (e.g., emotion, speech flow). This benchmark is grounded in speech science and linguistics, developed with domain experts to accurately index speaker and speech characteristics. We report benchmark experiments using over 15 publicly available speech datasets and several widely used speech foundation models that target various static and dynamic speaker and speech properties. In addition to benchmark experiments, we showcase several downstream applications supported by Vox-Profile. First, we show that Vox-Profile can augment existing speech recognition datasets to analyze ASR performance variability. Vox-Profile is also used as a tool to evaluate the performance of speech generation systems. Finally, we assess the quality of our automated profiles through comparison with human evaluation and show convergent validity. Vox-Profile is publicly available at: this https URL.

@article{feng2025_2505.14648,
  title={Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits},
  author={Tiantian Feng and Jihwan Lee and Anfeng Xu and Yoonjeong Lee and Thanathai Lertpetchpun and Xuan Shi and Helin Wang and Thomas Thebaud and Laureano Moro-Velazquez and Dani Byrd and Najim Dehak and Shrikanth Narayanan},
  journal={arXiv preprint arXiv:2505.14648},
  year={2025}
}