Layer-Wise Data-Free CNN Compression

International Conference on Pattern Recognition (ICPR), 2020
Abstract

We present a computationally efficient method for compressing a trained neural network without using any data. We break the problem of data-free network compression into independent layer-wise compressions. We show how to efficiently generate layer-wise training data, and how to precondition the network to maintain accuracy during layer-wise compression. Our generic technique can be used with any compression method. We outperform related works for data-free low-bit-width quantization on MobileNetV1, MobileNetV2, and ResNet18. We also demonstrate the efficacy of our layer-wise method when applied to pruning. We outperform baselines in the low-computation regime suitable for on-device edge compression while using orders of magnitude less memory and compute time than comparable generative methods. In the high-computation regime, we show how to combine our method with generative methods to improve upon state-of-the-art performance for several networks.
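To make the layer-wise idea concrete, the following is a minimal sketch of compressing one layer independently: uniformly quantize its weights per output channel to a low bit-width and measure the output reconstruction error on synthetic inputs standing in for the paper's generated layer-wise training data. This is an illustrative assumption, not the paper's exact procedure; the function name and the use of random Gaussian inputs are hypothetical.

```python
import numpy as np

def quantize_layer(W, X, bits=4):
    """Per-channel uniform quantization of a linear layer's weights W
    (shape: out_features x in_features), evaluated by the relative
    output error on inputs X (shape: batch x in_features).

    X plays the role of generated layer-wise training data; here it is
    just synthetic noise for illustration."""
    levels = 2 ** bits - 1
    # Per-output-channel min/max range for the uniform grid.
    w_min = W.min(axis=1, keepdims=True)
    w_max = W.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / levels
    scale[scale == 0] = 1.0  # guard against constant rows
    # Snap each weight to its nearest grid point.
    Wq = np.round((W - w_min) / scale) * scale + w_min
    # Layer-wise reconstruction error on the synthetic inputs.
    err = np.linalg.norm(X @ Wq.T - X @ W.T) / np.linalg.norm(X @ W.T)
    return Wq, err

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 32))       # one layer's weights
X = rng.normal(size=(256, 32))      # stand-in layer-wise training data
Wq, err = quantize_layer(W, X, bits=4)
```

Because each layer is compressed against its own inputs and outputs, the layers can be processed independently, which is what keeps the memory and compute footprint small compared to end-to-end generative approaches.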
