Generative Transfer Learning between Recurrent Neural Networks
Training a neural network demands a large amount of labeled data, and retaining that data after training may not be possible because of hardware or power constraints in on-device learning. In this study, we train a new RNN, called the student network, using a previously trained RNN, the teacher network, without access to the original data. The teacher network generates the data used to train the student network, and its softmax output serves as the soft target during student training. Performance is evaluated on a character-level language model. The experimental results show that the proposed method approaches the performance of training on the original data. This work not only offers insight into the connection between learning and generation but is also useful when the original training data is unavailable.
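The procedure the abstract describes — a teacher generating its own training sequences and supplying its softmax outputs as soft targets for the student — can be sketched in miniature. The sketch below is an illustrative simplification, not the paper's implementation: the teacher RNN is replaced by a fixed next-character probability table, and the student by a single softmax layer over one-hot states. The data-free generation step and the soft-target loss are the same in spirit.

```python
import numpy as np

rng = np.random.default_rng(0)
V = 3  # toy character vocabulary size, e.g. {"a", "b", " "}

# Hypothetical "teacher": a fixed next-character distribution table standing
# in for the trained character-level teacher RNN (a simplification).
teacher_probs = rng.dirichlet(np.ones(V), size=V)  # row i = P(next | current=i)

def generate(teacher, length):
    """Sample a character sequence from the teacher -- the data-free
    generation step: no original training data is consulted."""
    seq = [int(rng.integers(V))]
    for _ in range(length - 1):
        seq.append(int(rng.choice(V, p=teacher[seq[-1]])))
    return seq

# "Student": one softmax layer over one-hot states (again a simplification).
W = np.zeros((V, V))
lr = 0.5
for _ in range(300):
    for cur in generate(teacher_probs, 30):
        t = teacher_probs[cur]            # teacher softmax output = soft target
        p = np.exp(W[cur] - W[cur].max())
        p /= p.sum()                      # student softmax output
        W[cur] -= lr * (p - t)            # gradient of soft-target cross-entropy

student_probs = np.exp(W) / np.exp(W).sum(axis=1, keepdims=True)
print(np.abs(student_probs - teacher_probs).max())  # small: student mimics teacher
```

Training against the teacher's full distribution rather than hard sampled labels is what lets the student recover the teacher's behavior from generated data alone.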