
Do Deep Convolutional Nets Really Need to be Deep (Or Even Convolutional)?

Abstract

Yes, apparently they do. Previous research showed that shallow feed-forward nets can sometimes learn the complex functions learned by deep nets while using the same number of parameters as the deep models they mimic. In this paper we train models with a varying number of convolutional layers to mimic a state-of-the-art CIFAR-10 model. We are unable to train models without multiple layers of convolution to mimic deep convolutional models: the student models need not be as deep as the teacher they mimic, but they do need multiple convolutional layers to learn functions of accuracy comparable to the teacher's. Even when trained via distillation, deep convolutional nets need to be deep and convolutional.
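Mimic (distillation-style) training of the kind the abstract describes typically trains the student to match the teacher's outputs rather than the hard labels. Below is a minimal, generic NumPy sketch of a temperature-softened distillation loss in the style of Hinton et al.; it is an illustration of the general technique, not the exact objective used in this paper, and all function names and the temperature value are assumptions for the example.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-softened softmax; higher T yields softer target distributions.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # KL divergence between the softened teacher and student distributions.
    p = softmax(teacher_logits, T)  # soft targets produced by the teacher
    q = softmax(student_logits, T)  # student's softened predictions
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

# A student whose logits match the teacher's incurs near-zero loss;
# a student with very different logits incurs a larger loss.
teacher = np.array([[3.0, 1.0, -2.0]])
student_close = np.array([[3.0, 1.0, -2.0]])
student_far = np.array([[-2.0, 1.0, 3.0]])
print(distillation_loss(student_close, teacher))  # ~0.0
print(distillation_loss(student_far, teacher))    # > 0
```

Variants of this setup replace the KL term with an L2 regression on the teacher's logits; in either case the student learns from the teacher's full output distribution rather than one-hot labels.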
