Learning Annotation Consensus for Continuous Emotion Recognition

In affective computing, datasets often contain multiple annotations from different annotators, which may lack full agreement. Typically, these annotations are merged into a single gold standard label, potentially losing valuable inter-rater variability. We propose a multi-annotator training approach for continuous emotion recognition (CER) that seeks a consensus across all annotators rather than relying on a single reference label. Our method employs a consensus network to aggregate annotations into a unified representation, guiding the main arousal-valence predictor to better reflect collective inputs. Tested on the RECOLA and COGNIMUSE datasets, our approach outperforms traditional methods that unify annotations into a single label. This underscores the benefits of fully leveraging multi-annotator data in emotion recognition and highlights its applicability across various fields where annotations are abundant yet inconsistent.
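To make the baseline concrete, the sketch below shows one way annotations might be merged into a single gold-standard label, and what is discarded in the process. It assumes the merge is a per-frame average across annotators, which the abstract does not specify. The function name `merge_annotations` and the toy ratings are hypothetical illustrations, not taken from the paper or from RECOLA or COGNIMUSE. It does not attempt to reproduce the proposed consensus network.

```python
import numpy as np


def merge_annotations(annotations):
    """Collapse per-annotator traces into one label per frame.

    annotations: array-like of shape (n_annotators, n_frames).
    Returns (gold, spread), where gold is the per-frame mean (an assumed
    merging rule) and spread is the per-frame standard deviation across
    annotators, i.e. the inter-rater variability that a single gold
    label no longer carries.
    """
    annotations = np.asarray(annotations, dtype=float)
    gold = annotations.mean(axis=0)
    spread = annotations.std(axis=0)
    return gold, spread


# Hypothetical arousal ratings: 3 annotators, 3 frames.
ratings = np.array([
    [0.10, 0.20, 0.40],
    [0.30, 0.20, 0.00],
    [0.20, 0.20, 0.20],
])

gold, spread = merge_annotations(ratings)
print("gold label:", gold)
print("inter-rater spread:", spread)
```

In this toy case the merged label is identical for every frame, although the annotators agree fully only on the middle frame. That disagreement is the information a multi-annotator training approach aims to retain.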
@article{shoer2025_2505.21196,
  title={Learning Annotation Consensus for Continuous Emotion Recognition},
  author={Ibrahim Shoer and Engin Erzin},
  journal={arXiv preprint arXiv:2505.21196},
  year={2025}
}