Correlated Initialization for Correlated Data
Neural Processing Letters (NPL), 2020
Abstract
Spatial data exhibits the property that nearby points are correlated. This holds also for learnt representations across layers, but not for commonly used weight initialization methods. Our theoretical analysis reveals for uncorrelated initialization that (i) flow through layers suffers from much more rapid decrease and (ii) training of individual parameters is subject to more ``zig-zagging''. We propose multiple methods for correlated initialization. For CNNs, they yield accuracy gains of several per cent in the absence of regularization. Even for properly tuned L2-regularization gains are often possible.
View on arXivComments on this paper
