Data Augmentation in Classification using GAN

2 November 2017

Abstract

Classifying images with multiple labels is difficult when only a small number of labeled samples is available, and harder still when the class distribution is unbalanced. In this paper we propose a new data augmentation method based on generative adversarial networks (GANs). The method complements and completes the data manifold in a genuine sense, helps the classifier find the margins or hyperplanes between neighboring classes, and ultimately leads to better performance on the image classification task. Specifically, we design a pipeline that contains a CNN model as the classifier and a cycle-consistent adversarial network (CycleGAN) that generates supplementary data from the given classes. To avoid vanishing gradients, we use a least-squares loss as the adversarial loss. We also propose several evaluation methods to validate the GAN's contribution to data augmentation. Qualitative observations indicate that the data manifolds show a significant improvement in distribution integrity and in the clarity of the margins between classes. Quantitative comparisons with the baseline show a 5% to 10% increase in classification performance after applying this data augmentation technique.
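The abstract does not spell out the loss terms, so the following is only a minimal sketch of what a least-squares adversarial loss and a cycle-consistency loss commonly look like. It assumes the usual target values of 1 for real samples and 0 for generated ones, a 0.5 scaling factor, and an L1 reconstruction penalty; the function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def lsgan_discriminator_loss(d_real, d_fake):
    """Least-squares discriminator loss (assumed targets: real -> 1, fake -> 0)."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)


def lsgan_generator_loss(d_fake):
    """Least-squares generator loss: push discriminator scores on fakes toward 1."""
    d_fake = np.asarray(d_fake, dtype=float)
    return 0.5 * np.mean((d_fake - 1.0) ** 2)


def cycle_consistency_loss(x, x_reconstructed):
    """L1 distance between an input and its round-trip reconstruction (assumed form)."""
    x = np.asarray(x, dtype=float)
    x_reconstructed = np.asarray(x_reconstructed, dtype=float)
    return np.mean(np.abs(x - x_reconstructed))


if __name__ == "__main__":
    # Toy discriminator scores, made up purely for illustration.
    print(lsgan_discriminator_loss([0.9, 0.8], [0.2, 0.1]))
    print(lsgan_generator_loss([0.2, 0.1]))
    print(cycle_consistency_loss([1.0, 2.0], [1.0, 4.0]))
```

Because the penalty is quadratic in the distance from the target, its gradient grows with that distance instead of saturating the way a sigmoid cross-entropy loss can. This is the usual argument for why a least-squares loss helps against vanishing gradients.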
