
Fast Neural Networks with Circulant Projections

Abstract

The basic computation of a fully-connected neural network layer is a linear projection of the input signal followed by a non-linear transformation. The linear projection step consumes the bulk of the processing time and memory footprint. In this work, we propose to replace the conventional linear projection with the circulant projection. The circulant structure enables the use of the Fast Fourier Transform to speed up the computation. Considering a neural network layer with $d$ input nodes and $d$ output nodes, this method improves the time complexity from $\mathcal{O}(d^2)$ to $\mathcal{O}(d\log d)$ and the space complexity from $\mathcal{O}(d^2)$ to $\mathcal{O}(d)$. We further show that the gradient computation and optimization of the circulant projections can be performed very efficiently. Our experiments on three standard datasets show that the proposed approach achieves this significant gain in efficiency and storage with minimal loss of accuracy compared to neural networks with unstructured projections.
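To illustrate the FFT identity behind the complexity claim, the following is a minimal sketch (not the authors' implementation): multiplying a vector by a circulant matrix is a circular convolution with its first column, which can be computed in $\mathcal{O}(d\log d)$ via the FFT. The function name and the explicit dense-matrix check are illustrative assumptions.

```python
import numpy as np

def circulant_projection(r, x):
    """Compute C(r) @ x in O(d log d) using the FFT, where C(r) is the
    circulant matrix whose first column is r (illustrative sketch)."""
    return np.real(np.fft.ifft(np.fft.fft(r) * np.fft.fft(x)))

# Sanity check against the explicit O(d^2) dense projection.
d = 8
rng = np.random.default_rng(0)
r = rng.standard_normal(d)
x = rng.standard_normal(d)

# Dense circulant matrix built column by column (column k is r shifted by k).
C = np.column_stack([np.roll(r, k) for k in range(d)])

assert np.allclose(C @ x, circulant_projection(r, x))
```

The dense product costs $\mathcal{O}(d^2)$ time and stores $d^2$ weights, while the FFT route needs only the length-$d$ vector $r$, matching the storage reduction described in the abstract.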
