Multimodal Controller for Generative Models
Class-conditional generative models are crucial tools for generating data from user-specified class labels. Existing approaches require nontrivial modifications of the backbone generative architecture to incorporate the conditional information fed into the model. This paper introduces a plug-and-play module named 'multimodal controller' that generates multimodal data without introducing additional learnable parameters. In the absence of the controllers, our model reduces to a non-conditional generative model. We test the efficacy of the multimodal controller on the CIFAR10, CIFAR100, COIL100, and Omniglot datasets, and experimentally demonstrate that multimodal controlled generative models (including VAE, PixelCNN, Glow, and GAN) generate class-conditional images of quality better than or comparable to state-of-the-art conditional generative models. Moreover, we show that multimodal controlled models can also transition images between classes and create novel modalities of images.
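One way a module could condition a generative model without adding learnable parameters is to route each class through a fixed, class-specific binary mask over the channels of a hidden layer. The sketch below is an illustrative assumption, not the paper's exact design: `controlled_layer`, `masks`, and the mask construction are all hypothetical names and choices introduced here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_CLASSES = 4    # number of modalities (class labels)
NUM_CHANNELS = 8   # channels of the hidden activation being controlled

# Hypothetical controller: one fixed (non-learned) binary mask per class.
# Each class keeps a distinct random subset of channels, so the model
# gains no extra learnable parameters -- only per-class routing.
masks = (rng.random((NUM_CLASSES, NUM_CHANNELS)) < 0.5).astype(np.float32)

def controlled_layer(h, class_idx):
    """Apply the class-specific mask to a hidden activation h
    of shape (batch, NUM_CHANNELS)."""
    return h * masks[class_idx]

# With the controller removed (mask of all ones), the layer is the
# unconditional identity on h, mirroring the reduction described above.
h = rng.standard_normal((2, NUM_CHANNELS)).astype(np.float32)
out0 = controlled_layer(h, 0)
```

Under this assumed design, masked-out channels are zeroed for a given class while the surviving channels pass through unchanged, so distinct classes activate distinct sub-networks of the same backbone.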