Deep Representation Learning with an Information-theoretic Loss
- DRL

This paper proposes an information-theoretic loss for learning deep neural networks. The loss combines a term derived from the Information Bottleneck (IB) principle with a max-margin loss, aiming to increase class separability in the embedding space. While deep neural network models excel at supervised learning tasks when large-scale labeled data is available, they struggle when test samples come from classes not seen during training, as in anomaly detection and out-of-distribution detection. In such tasks, it is not sufficient to merely discriminate between known classes. Our intuition is to represent the known classes in compact, well-separated regions of the embedding space, reducing the chance that known and unseen classes overlap substantially. We show that the IB-based loss reflects both the inter-class distances and the compactness within classes, thereby extending existing deep data description models. Our empirical study shows that the proposed model improves the segmentation of normal classes in the deep feature space, which contributes to identifying out-of-distribution samples.
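To make the idea of combining an IB-style objective with a max-margin separation term concrete, the following is a minimal PyTorch sketch. It assumes a variational IB formulation (a Gaussian encoder regularized toward a standard normal prior) and a hinge-style margin between class centers; the class `IBMarginLoss`, its arguments, and the specific weighting are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class IBMarginLoss(nn.Module):
    """Illustrative IB-plus-max-margin loss (a sketch, not the paper's exact formulation).

    Combines a prediction term, a variational IB compression term, and a
    hinge-based separation term between class centers in the embedding space.
    """

    def __init__(self, beta: float = 1e-3, margin: float = 1.0):
        super().__init__()
        self.beta = beta      # weight on the compression (KL) term
        self.margin = margin  # desired minimum distance between class centers

    def forward(self, mu, logvar, logits, targets, centers):
        # Prediction term: cross-entropy on logits computed from the stochastic embedding.
        ce = F.cross_entropy(logits, targets)

        # Compression term: KL( N(mu, sigma^2) || N(0, I) ), the standard variational
        # upper bound on I(X; Z) used in variational IB models (assumed here).
        kl = 0.5 * torch.mean(
            torch.sum(mu.pow(2) + logvar.exp() - logvar - 1.0, dim=1)
        )

        # Separation term: penalize pairs of class centers that are closer than `margin`,
        # encouraging compact classes to sit in well-separated embedding regions.
        dists = torch.cdist(centers, centers)
        off_diag_mask = ~torch.eye(len(centers), dtype=torch.bool, device=centers.device)
        margin_loss = F.relu(self.margin - dists[off_diag_mask]).mean()

        return ce + self.beta * kl + margin_loss
```

In this sketch, `mu` and `logvar` parameterize the encoder's Gaussian embedding, `centers` is a `(num_classes, dim)` tensor of per-class means (e.g., running averages of embeddings), and `beta` trades off compression against prediction as in IB-style objectives.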