Couple Learning for semi-supervised sound event detection

The recently proposed Mean Teacher method, which exploits large-scale unlabeled data in a self-ensembling manner, has achieved state-of-the-art results in several semi-supervised learning benchmarks. Spurred by current achievements, this paper proposes an effective Couple Learning method that combines a well-trained model and a Mean Teacher model. The suggested pseudo-labels generated model (PLG) increases strongly- and weakly-labeled data to improve the Mean Teacher method-s performance. Moreover, the Mean Teacher-s consistency cost reduces the noise impact in the pseudo-labels introduced by detection errors. The experimental results on Task 4 of the DCASE2020 challenge demonstrate the superiority of the proposed method, achieving about 44.25% F1-score on the public evaluation set, significantly outperforming the baseline system-s 32.39%. At the same time, we also propose a simple and effective experiment called the Variable Order Input (VOI) experiment, which proves the significance of the Couple Learning method. Our developed Couple Learning code is available on GitHub.
View on arXiv