Effects of Layer Freezing when Transferring DeepSpeech to New Languages

Conference on Natural Language Processing (NLP), 2021
Abstract

In this paper, we train Mozilla's DeepSpeech architecture on German and Swiss German speech datasets and compare the results of different training methods. We first train the models from scratch on both languages. We then improve on those results by initializing the weights from an English pretrained version of DeepSpeech, and we experiment with the effects of freezing different layers during training. We find that freezing even a single layer already improves the results dramatically.
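
The idea of initializing from pretrained weights and freezing layers can be illustrated with a minimal sketch in plain Python. This is not the authors' training code: the layer names, the toy two-parameter layers, the dummy gradients, and the helper functions `init_from_pretrained` and `sgd_step` are all hypothetical and chosen only for illustration. It shows the mechanism itself: frozen layers keep their pretrained values while the remaining layers are updated.

```python
import copy

# Hypothetical layer names, assumed for illustration only;
# the real DeepSpeech graph uses different identifiers.
LAYER_NAMES = ["dense1", "dense2", "dense3", "lstm", "dense5", "output"]


def init_from_pretrained(pretrained):
    """Return an independent copy of the pretrained weights as the starting point."""
    return copy.deepcopy(pretrained)


def sgd_step(weights, grads, frozen, lr=0.1):
    """Apply one plain SGD update, skipping every layer named in `frozen`."""
    for name, params in weights.items():
        if name in frozen:
            continue  # frozen layer: pretrained values stay unchanged
        weights[name] = [w - lr * g for w, g in zip(params, grads[name])]
    return weights


# Toy stand-in for an English checkpoint: two parameters per layer.
pretrained = {name: [1.0, 1.0] for name in LAYER_NAMES}
weights = init_from_pretrained(pretrained)

# Freeze only the first layer; every other layer is fine-tuned.
frozen = {"dense1"}
grads = {name: [0.5, 0.5] for name in LAYER_NAMES}  # dummy gradients
weights = sgd_step(weights, grads, frozen)
```

After the step, `weights["dense1"]` still equals the pretrained values, while all other layers have moved along the gradient. In a real framework the same effect is usually achieved by excluding the frozen layers' variables from the optimizer rather than by skipping them by hand.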
