Overfitting is a serious problem in deep neural networks, even in the latest network architectures. In this paper, to relieve overfitting in ResNet and its improvements (i.e., Wide ResNet, PyramidNet, and ResNeXt), we propose a new regularization method, named ShakeDrop regularization. ShakeDrop is inspired by Shake-Shake, which is an effective regularization method but can be applied only to ResNeXt. ShakeDrop is even more effective than Shake-Shake and can be successfully applied not only to ResNeXt but also to ResNet, Wide ResNet, and PyramidNet. The key to realizing ShakeDrop is the stability of training. Since effective regularization often destabilizes training, we introduce a training stabilizer, which is an unusual use of an existing regularizer. Experiments reveal that ShakeDrop achieves generalization performance comparable or superior to that of conventional methods.
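The abstract only names the method, so the following is a minimal PyTorch sketch of the ShakeDrop rule from the paper: during training, the residual branch F(x) is scaled by (b + α − bα) on the forward pass, and its gradient is rescaled by an independently drawn (b + β − bβ) on the backward pass, where b is a Bernoulli variable as in stochastic depth; at test time the branch is scaled by the expectation of the forward coefficient. The names (ShakeDropFunction, p_survive), the default ranges, and the per-sample (image-level) draw of α and β are illustrative assumptions, not the paper's exact code.

```python
import torch
from torch.autograd import Function


class ShakeDropFunction(Function):
    """Sketch of ShakeDrop: forward scales the residual branch by
    (b + alpha - b*alpha); backward rescales the gradient by an
    independently drawn (b + beta - b*beta). Assumes 4-D NCHW input."""

    @staticmethod
    def forward(ctx, x, training=True, p_survive=0.5,
                alpha_range=(-1.0, 1.0), beta_range=(0.0, 1.0)):
        if training:
            # b ~ Bernoulli(p_survive): b=1 leaves the branch intact,
            # b=0 perturbs it with a random scale alpha.
            b = torch.bernoulli(torch.tensor(p_survive, device=x.device))
            # One alpha per sample (image-level variant; an assumption).
            alpha = torch.empty(x.size(0), 1, 1, 1,
                                device=x.device).uniform_(*alpha_range)
            ctx.save_for_backward(b)
            ctx.beta_range = beta_range
            return x * (b + alpha - b * alpha)
        # Inference: scale by the expectation E[b + alpha - b*alpha].
        mean_alpha = 0.5 * (alpha_range[0] + alpha_range[1])
        return x * (p_survive + mean_alpha * (1.0 - p_survive))

    @staticmethod
    def backward(ctx, grad_output):
        (b,) = ctx.saved_tensors
        # Fresh beta on the backward pass, independent of alpha.
        beta = torch.empty(grad_output.size(0), 1, 1, 1,
                           device=grad_output.device).uniform_(*ctx.beta_range)
        # Gradients only for x; the remaining arguments are constants.
        return grad_output * (b + beta - b * beta), None, None, None, None


# Hypothetical usage inside a residual block, with p_l the per-layer
# survival probability (the paper decays it linearly with depth):
# out = skip + ShakeDropFunction.apply(branch(x), self.training, p_l)
```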