ICE-Pruning: An Iterative Cost-Efficient Pruning Pipeline for Deep Neural Networks

12 May 2025

Wenhao Hu

Paul Henderson

José Cano

Main: 7 pages, 3 figures, 3 tables. Bibliography: 1 page.

Abstract

Pruning is a widely used method for compressing Deep Neural Networks (DNNs), in which less relevant parameters are removed from a DNN model to reduce its size. However, removing parameters reduces model accuracy, so pruning is typically combined with fine-tuning, and sometimes with other operations such as rewinding weights, to recover accuracy. A common approach is to repeatedly prune and then fine-tune, removing a larger share of the model's parameters at each step. While straightforward to implement, pruning pipelines that follow this approach are computationally expensive because fine-tuning must be repeated at every step.
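For orientation, the conventional prune-then-fine-tune loop that the abstract describes might be sketched as below. This is not ICE-Pruning itself, only the baseline whose cost the abstract points to. The magnitude-based pruning criterion, the toy linear model, the sparsity schedule, and every function name here are assumptions chosen for illustration; none of them come from the paper.

```python
# Minimal sketch of a conventional iterative prune-then-fine-tune loop.
# Assumptions (not from the paper): magnitude pruning, a linear model
# trained with MSE, and a fixed sparsity schedule.

def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that drops the `sparsity` fraction of smallest-|w| weights."""
    n_prune = int(round(sparsity * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1.0] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0.0
    return mask


def fine_tune(weights, mask, xs, ys, lr=0.05, epochs=50):
    """Gradient descent on mean squared error; pruned weights stay at zero."""
    w = list(weights)
    n = len(xs)
    for _ in range(epochs):
        errors = [sum(wj * xj for wj, xj in zip(w, x)) - y for x, y in zip(xs, ys)]
        for j in range(len(w)):
            grad = 2.0 / n * sum(e * x[j] for e, x in zip(errors, xs))
            w[j] -= lr * grad * mask[j]  # masked gradient: no update where mask is 0
    return w


def iterative_prune(weights, xs, ys, sparsities=(0.25, 0.5, 0.75)):
    """Prune to each target sparsity in turn, fine-tuning after every pruning step."""
    w = list(weights)
    mask = [1.0] * len(w)
    fine_tune_calls = 0
    for sparsity in sparsities:
        mask = magnitude_mask(w, sparsity)       # already-zero weights sort first
        w = [wj * mj for wj, mj in zip(w, mask)]
        w = fine_tune(w, mask, xs, ys)           # the repeated, expensive step
        fine_tune_calls += 1
    return w, mask, fine_tune_calls
```

In this sketch `fine_tune` runs once per entry in `sparsities`, so the total training cost grows with the number of pruning steps. That repeated call is the expense the abstract attributes to pipelines of this kind.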

@article{hu2025_2505.07411,
  title={ICE-Pruning: An Iterative Cost-Efficient Pruning Pipeline for Deep Neural Networks},
  author={Wenhao Hu and Paul Henderson and José Cano},
  journal={arXiv preprint arXiv:2505.07411},
  year={2025}
}
