
Towards Robust Deep Neural Networks for Affect and Depression Recognition

Abstract

Intelligent monitoring systems and affective computing applications have emerged in recent years to enhance healthcare. Examples of these applications include the assessment of affective states such as Major Depressive Disorder (MDD). MDD is characterized by the persistent expression of certain emotions: negative emotions (low valence) and lack of interest (low arousal). High-performing intelligent systems could improve the diagnosis of MDD in its early stages. In this paper, we present a new deep neural network architecture, called EmoAudioNet, for emotion and depression recognition from speech. Deep EmoAudioNet learns from the time-frequency representation of the audio signal and the visual representation of its frequency spectrum. Our model outperforms state-of-the-art methods on the RECOLA and DAIC-WOZ datasets, reaching accuracies of 89.30%, 91.44% and 73.25% in predicting arousal, valence, and depression, respectively.
