
Do Deep Convolutional Nets Really Need to be Deep (Or Even Convolutional)?

Abstract

Yes, apparently they do. Previous research showed that shallow feed-forward nets can sometimes learn the complex functions learned by deep nets while using the same number of parameters as the deep models they mimic. In this paper we train models with a varying number of convolutional layers to mimic a state-of-the-art CIFAR-10 model. We are unable to train models without multiple layers of convolution to mimic deep convolutional models: the student models need not be as deep as the teacher they mimic, but they do need multiple convolutional layers to learn functions of accuracy comparable to the teacher's. Even when trained via distillation, deep convolutional nets need to be deep and convolutional.
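Mimic (distillation-style) training of the kind the abstract describes typically trains the student to match the teacher's outputs rather than the hard labels. Below is a minimal, generic NumPy sketch of a temperature-softened distillation loss in the style of Hinton et al.; it is an illustration of the general technique, not the exact objective used in this paper, and all function names and the temperature value are assumptions for the example.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-softened softmax; higher T yields softer target distributions.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # KL divergence between the softened teacher and student distributions.
    p = softmax(teacher_logits, T)  # soft targets produced by the teacher
    q = softmax(student_logits, T)  # student's softened predictions
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

# A student whose logits match the teacher's incurs near-zero loss;
# a student with very different logits incurs a larger loss.
teacher = np.array([[3.0, 1.0, -2.0]])
student_close = np.array([[3.0, 1.0, -2.0]])
student_far = np.array([[-2.0, 1.0, 3.0]])
print(distillation_loss(student_close, teacher))  # ~0.0
print(distillation_loss(student_far, teacher))    # > 0
```

Variants of this setup replace the KL term with an L2 regression on the teacher's logits; in either case the student learns from the teacher's full output distribution rather than one-hot labels.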
