Measuring Information Transfer in Neural Networks

Quantifying the information content of a neural network model is essentially estimating the model's Kolmogorov complexity. The recent success of prequential coding with neural networks points to a promising path toward deriving an efficient description length for a model. We propose a practical measure of the generalizable information in a neural network model based on prequential coding, which we term Information Transfer. Theoretically, Information Transfer estimates the generalizable part of a model's information content. In experiments, we show that it is consistently correlated with generalizable information and can be used as a measure of the patterns, or "knowledge", in a model or a dataset. Consequently, Information Transfer can serve as a useful analysis tool in deep learning. In this paper, we apply it to compare and dissect information in datasets, evaluate representation models in transfer learning, and analyze catastrophic forgetting and continual learning algorithms. Information Transfer provides an information-theoretic perspective that helps us discover new insights into neural network learning.
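To make the prequential-coding idea behind the measure concrete, the sketch below shows how a prequential (online) description length is typically computed: data are revealed in growing blocks, a model is refit on everything seen so far, and each new block is encoded with that model's predictive probabilities. This is a minimal illustration, not the paper's implementation; the `fit` and `predict_proba` callables, the block schedule, and the uniform code for the first block are all assumptions introduced here for clarity.

```python
import numpy as np

def prequential_description_length(x, y, n_classes, fit, predict_proba, block_sizes):
    """Estimate the prequential code length (in bits) of labels y given inputs x.

    Hypothetical helpers (assumed, not from the paper):
      fit(x_seen, y_seen)          -> a trained model on the data seen so far
      predict_proba(model, x_new)  -> array of shape (len(x_new), n_classes)
    """
    total_bits = 0.0
    seen = 0
    for size in block_sizes:
        block = slice(seen, seen + size)
        if seen == 0:
            # No model trained yet: encode the first block with a uniform code.
            total_bits += size * np.log2(n_classes)
        else:
            # Refit on all previously seen data, then encode the next block
            # with the model's predictive distribution: sum of -log2 p(y | x).
            model = fit(x[:seen], y[:seen])
            probs = predict_proba(model, x[block])
            total_bits += -np.log2(probs[np.arange(size), y[block]] + 1e-12).sum()
        seen += size
    return total_bits
```

A short description length relative to a baseline (e.g., the uniform code) indicates that the model family extracts generalizable patterns from the data quickly, which is the quantity the proposed measure builds on.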