Pruning neural networks: is it time to nip it in the bud?

Abstract

Pruning is a popular method for compressing a neural network: given a large trained network, one alternates between removing connections and fine-tuning, thereby reducing the overall width of the network. However, the efficacy of network pruning has largely evaded scrutiny. In this paper, we examine ResNets and DenseNets obtained through pruning-and-tuning and make two interesting observations: (i) reduced networks (smaller versions of the original network trained from scratch) consistently outperform pruned networks; (ii) if the architecture of a pruned network is itself trained from scratch, it is significantly more competitive. Furthermore, these architectures are easy to approximate: we can prune once and obtain a whole family of new, scalable network architectures that can simply be trained from scratch. Finally, we compare the inference speed of reduced and pruned networks on hardware, and show that reduced networks are significantly faster.
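The abstract describes pruning as alternating between removing connections and fine-tuning. The sketch below illustrates that loop using PyTorch's built-in pruning utilities; the small model, random data, pruning fraction, and number of rounds are illustrative placeholders rather than the paper's actual setup (which prunes ResNets and DenseNets on real datasets).

```python
# Minimal sketch of an iterative prune-and-fine-tune loop (assumed setup, not the paper's).
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Placeholder network and synthetic data; stand-ins for the paper's ResNets/DenseNets.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
inputs, targets = torch.randn(256, 32), torch.randint(0, 10, (256,))

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

prunable = [m for m in model.modules() if isinstance(m, nn.Linear)]

for round_idx in range(5):                 # alternate pruning and fine-tuning
    for module in prunable:
        # Remove the 20% of remaining weights with the smallest magnitude.
        prune.l1_unstructured(module, name="weight", amount=0.2)

    for step in range(100):                # fine-tune the surviving connections
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()

# Make the pruning masks permanent before measuring size or inference speed.
for module in prunable:
    prune.remove(module, "weight")
```

Note this sketch uses unstructured magnitude pruning for brevity; pruning whole channels (which actually reduces the width of the network, as the abstract describes) would follow the same alternating schedule.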
