
Breaking Resource Barriers in Speech Emotion Recognition via Data Distillation

Abstract

Speech emotion recognition (SER) plays a crucial role in human-computer interaction. The emergence of edge devices in the Internet of Things (IoT) makes it challenging to build intricate deep learning models, because memory and computational resources are constrained. Moreover, emotional speech data often contains private information, raising concerns about privacy leakage during the deployment of SER models. To address these challenges, we propose a data distillation framework that facilitates efficient development of SER models in IoT applications using a smaller, synthesised, distilled dataset. Our experiments demonstrate that the distilled dataset can be used effectively to train SER models with fixed initialisation, achieving performance comparable to that of models trained on the original full emotional speech dataset.

@article{chang2025_2406.15119,
  title={Breaking Resource Barriers in Speech Emotion Recognition via Data Distillation},
  author={Yi Chang and Zhao Ren and Zhonghao Zhao and Thanh Tam Nguyen and Kun Qian and Tanja Schultz and Björn W. Schuller},
  journal={arXiv preprint arXiv:2406.15119},
  year={2025}
}