A Transformer Based Pitch Sequence Autoencoder with MIDI Augmentation

Mingshuo Ding
Yi Ma
Abstract

Despite recent achievements of deep-learning algorithms for automatic music generation, few approaches have been proposed to evaluate whether a single-track music excerpt was composed by automatons or by Homo sapiens. To tackle this problem, we apply a masked language model based on ALBERT to composer classification. The aim is to obtain a model, trained only on AI-composed single-track MIDI, that estimates the probability of a MIDI clip conditioned on the hypothesis that it was automatically generated. To prevent overfitting, we reduce the number of parameters and propose two data augmentation methods together with a refined loss function. Experimental results show that our model ranked 3rd among the 7 teams in the CSMT (2020) data challenge. Furthermore, this method could be extended to other music information retrieval tasks that rely on small datasets.
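The abstract does not say which two augmentation methods are used. As a purely illustrative sketch, and not the authors' procedure, the snippet below shows one common way to augment a small set of MIDI pitch sequences: transposing every note by a few semitones while staying inside the valid MIDI range. The function names `transpose` and `augment` and the `max_shift` parameter are hypothetical.

```python
from typing import List

MIDI_MIN, MIDI_MAX = 0, 127  # valid MIDI note-number range


def transpose(pitches: List[int], semitones: int) -> List[int]:
    """Shift every pitch by `semitones`; raise if any note leaves the MIDI range."""
    shifted = [p + semitones for p in pitches]
    if any(p < MIDI_MIN or p > MIDI_MAX for p in shifted):
        raise ValueError("transposition leaves the MIDI pitch range")
    return shifted


def augment(pitches: List[int], max_shift: int = 2) -> List[List[int]]:
    """Return every in-range transposition within +/- max_shift semitones.

    Assumption: transposition is used here only as an example of MIDI
    augmentation; the source does not confirm it is one of its two methods.
    """
    variants = []
    for shift in range(-max_shift, max_shift + 1):
        try:
            variants.append(transpose(pitches, shift))
        except ValueError:
            continue  # skip shifts that would produce invalid note numbers
    return variants
```

Calling `augment([60, 62, 64], max_shift=1)` yields three sequences (one semitone down, the original, one semitone up), which is the kind of inexpensive enlargement that can help when the training set is small.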
