254

Correlated Initialization for Correlated Data

Neural Processing Letters (NPL), 2020
Abstract

Spatial data exhibits the property that nearby points are correlated. This holds also for learnt representations across layers, but not for commonly used weight initialization methods. Our theoretical analysis reveals for uncorrelated initialization that (i) flow through layers suffers from much more rapid decrease and (ii) training of individual parameters is subject to more ``zig-zagging''. We propose multiple methods for correlated initialization. For CNNs, they yield accuracy gains of several per cent in the absence of regularization. Even for properly tuned L2-regularization gains are often possible.

View on arXiv
Comments on this paper