Image-to-image translation is a subset of computer vision and pattern recognition problems in which the goal is to learn a mapping from input images of a source domain to output images of a target domain. Current methods use neural networks with an encoder-decoder structure to learn a mapping such that the distribution of translated source images is identical to the distribution of target-domain images, where one component of the mapping is referred to as the encoder and the other as the decoder. Existing methods that also compute the inverse mapping use a separate encoder-decoder pair, or at least a separate decoder, to do so. Here we introduce a method that performs cross-domain image-to-image translation across multiple domains using a single encoder-decoder architecture. We use an auto-encoder network that, given an input image, first computes a latent domain encoding and a latent content encoding, where the domain encoding and the content encoding are independent. A decoder network then creates a reconstruction of the original image. Ideally, the domain encoding contains no information about the content of the image, and the content encoding contains no information about the domain of the image. We use this property of the encodings to find the mapping across domains by simply changing the domain encoding of the decoder's input.
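The idea of translating by changing only the domain encoding can be illustrated with a toy sketch. This is not the learned auto-encoder described above: the functions `encode`, `decode`, and `translate` are hypothetical stand-ins written by hand, an "image" is assumed to be a flat list of pixel values, and the domain is assumed to be nothing more than the mean brightness.

```python
from typing import List, Tuple

Vector = List[float]


def encode(image: Vector) -> Tuple[float, Vector]:
    """Toy stand-in for the encoder (assumption, not the paper's network).

    Returns a domain encoding (the mean pixel value) and a content
    encoding (the mean-removed image).
    """
    domain_code = sum(image) / len(image)
    content_code = [p - domain_code for p in image]
    return domain_code, content_code


def decode(domain_code: float, content_code: Vector) -> Vector:
    """Toy stand-in for the decoder: recombine the two encodings."""
    return [c + domain_code for c in content_code]


def translate(image: Vector, target_domain_code: float) -> Vector:
    """Keep the content encoding and replace only the domain encoding."""
    _, content_code = encode(image)
    return decode(target_domain_code, content_code)


image = [0.0, 2.0, 4.0]
reconstruction = decode(*encode(image))   # same domain code: reconstruction
translated = translate(image, 10.0)       # new domain code: translation
```

In this toy setting a single `encode`/`decode` pair serves every domain, and the content encoding of `translated` equals that of `image`, which mirrors the property the method relies on. A real model would have to learn encodings with this separation rather than obtain it by construction.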