Effects of Layer Freezing when Transferring DeepSpeech to New Languages

Conference on Natural Language Processing (NLP), 2021

8 February 2021

Abstract

In this paper, we train Mozilla's DeepSpeech architecture on German and Swiss German speech datasets and compare the results of different training methods. We first train the models from scratch on both languages. We then improve on these results by initializing the weights from an English pretrained version of DeepSpeech, and we experiment with the effects of freezing different layers during training. We find that freezing even a single layer already improves the results dramatically.
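
To make the idea of layer freezing concrete, the following is a minimal sketch in plain Python, not the DeepSpeech training code. It assumes a toy model stored as a dictionary of weight lists; the layer names, the `init_from_pretrained` and `sgd_step` helpers, and all numeric values are hypothetical and chosen only for illustration. Weights are first copied from a pretrained model, and a gradient step then updates every layer except the frozen ones.

```python
# Hypothetical layer names; they do not correspond to DeepSpeech's variables.
LAYER_NAMES = ["layer_1", "layer_2", "layer_3", "recurrent", "output"]


def init_from_pretrained(pretrained):
    """Return a copy of the pretrained weights to use as the starting point."""
    return {name: list(values) for name, values in pretrained.items()}


def sgd_step(weights, grads, frozen, lr):
    """Apply one plain SGD update, leaving every frozen layer unchanged."""
    updated = {}
    for name, values in weights.items():
        if name in frozen:
            # Frozen layer: keep the pretrained values as they are.
            updated[name] = list(values)
        else:
            updated[name] = [w - lr * g for w, g in zip(values, grads[name])]
    return updated


# Toy stand-in for the weights of an English pretrained model.
pretrained = {name: [1.0, 2.0, 3.0] for name in LAYER_NAMES}

# Transfer: start from the pretrained weights, then freeze only the first layer.
weights = init_from_pretrained(pretrained)
frozen = {LAYER_NAMES[0]}

# Toy gradients, as if computed on target-language data.
grads = {name: [1.0, 1.0, 1.0] for name in LAYER_NAMES}
weights = sgd_step(weights, grads, frozen, lr=0.5)

print(weights["layer_1"])  # frozen layer, unchanged
print(weights["layer_2"])  # trainable layer, updated
```

Changing `frozen` to cover more layer names corresponds to the different freezing configurations compared in the experiments. In a real framework the same effect is usually obtained by excluding the frozen variables from the optimizer rather than by skipping them by hand.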
