
Information Theoretic Co-Training

Abstract

This paper introduces an information theoretic co-training objective for unsupervised learning. We consider the problem of predicting the future. Rather than predict future sensations (image pixels or sound waves) we predict "hypotheses" to be confirmed by future sensations. More formally, we assume a population distribution on pairs $(x,y)$ where we can think of $x$ as a past sensation and $y$ as a future sensation. We train both a predictor model $P_\Phi(z|x)$ and a confirmation model $P_\Psi(z|y)$ where we view $z$ as hypotheses (when predicted) or facts (when confirmed). For a population distribution on pairs $(x,y)$ we focus on the problem of measuring the mutual information between $x$ and $y$. By the data processing inequality this mutual information is at least as large as the mutual information between $x$ and $z$ under the distribution on triples $(x,z,y)$ defined by the confirmation model $P_\Psi(z|y)$. The information theoretic training objective for $P_\Phi(z|x)$ and $P_\Psi(z|y)$ can be viewed as a form of co-training where we want the prediction from $x$ to match the confirmation from $y$.
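The data processing inequality invoked in the abstract states that if $z$ is produced from $y$ alone (here, by the confirmation model $P_\Psi(z|y)$), then $I(x;z) \le I(x;y)$. The following is a minimal numerical sketch of that bound on a toy discrete example; the joint distribution `p_xy` and the confirmation channel `p_z_given_y` are invented for illustration, not taken from the paper.

```python
import numpy as np

def mutual_information(joint):
    """I(A;B) in nats for a joint probability table p(a, b)."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    mask = joint > 0  # skip zero cells: 0 * log(0/q) = 0
    return float((joint[mask] * np.log(joint[mask] / (pa @ pb)[mask])).sum())

# Toy population distribution over pairs (x, y): rows index x, columns index y.
p_xy = np.array([[0.20, 0.05, 0.00],
                 [0.05, 0.20, 0.05],
                 [0.00, 0.05, 0.40]])

# Hypothetical confirmation model P_Psi(z|y): rows index y, columns index z.
p_z_given_y = np.array([[0.9, 0.1],
                        [0.8, 0.2],
                        [0.1, 0.9]])

# Distribution on triples (x, z, y) with z drawn from the confirmation model;
# marginalizing over y gives p(x, z) = sum_y p(x, y) * P_Psi(z|y).
p_xz = p_xy @ p_z_given_y

i_xy = mutual_information(p_xy)
i_xz = mutual_information(p_xz)

# Data processing inequality: I(x;z) <= I(x;y) for the chain x - y - z.
assert i_xz <= i_xy + 1e-12
```

Maximizing $I(x;z)$ over the confirmation model therefore tightens a lower bound on $I(x;y)$, which is what makes $I(x;z)$ usable as a training objective.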
