Despite recent advances in semantic manipulation using StyleGAN, semantic editing of real faces remains challenging. The gap between the W space and the W+ space demands an undesirable trade-off between reconstruction quality and editing quality. To solve this problem, we propose to expand the latent space by replacing the fully-connected layers in StyleGAN's mapping network with attention-based transformers. This simple yet effective technique integrates the two aforementioned spaces and transforms them into one new latent space called W++. Our modified StyleGAN maintains the state-of-the-art generation quality of the original StyleGAN with moderately better diversity. More importantly, the proposed W++ space achieves superior performance in both reconstruction quality and editing quality. Despite these significant advantages, our W++ space supports existing inversion algorithms and editing methods with only negligible modifications, thanks to its structural similarity with the W+ space. Extensive experiments on the FFHQ dataset prove that our proposed W++ space is clearly preferable to the previous W+ space for real face editing. The code is publicly available for research purposes at https://github.com/AnonSubm2021/TransStyleGAN.
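To illustrate the core architectural change described above, here is a minimal PyTorch sketch of a mapping network whose fully-connected layers are replaced by an attention-based transformer encoder that jointly produces one style vector per synthesis layer. All class and parameter names (e.g. TransformerMappingNetwork, layer_tokens, the depth and head counts) are hypothetical assumptions for illustration, not the paper's actual implementation:

```python
import torch
import torch.nn as nn

class TransformerMappingNetwork(nn.Module):
    """Hedged sketch: a StyleGAN-style mapping network where the usual
    fully-connected stack is swapped for a transformer encoder, emitting
    one style vector per synthesis layer (a W++-like output). Names and
    hyperparameters are illustrative, not taken from the paper."""

    def __init__(self, z_dim=512, w_dim=512, num_ws=18, depth=4, heads=8):
        super().__init__()
        self.num_ws = num_ws
        # One learned token per synthesis layer; attention lets the
        # per-layer styles be produced jointly rather than independently.
        self.layer_tokens = nn.Parameter(torch.randn(num_ws, w_dim))
        self.z_proj = nn.Linear(z_dim, w_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=w_dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)

    def forward(self, z):                      # z: (batch, z_dim)
        b = z.shape[0]
        tokens = self.layer_tokens.unsqueeze(0).expand(b, -1, -1)
        # Condition every layer token on the input latent code z.
        tokens = tokens + self.z_proj(z).unsqueeze(1)
        return self.encoder(tokens)            # (batch, num_ws, w_dim)

# Usage: map a batch of z codes to per-layer styles for an 18-layer generator.
ws = TransformerMappingNetwork()(torch.randn(2, 512))
print(ws.shape)  # torch.Size([2, 18, 512])
```

The output shape (batch, num_ws, w_dim) matches the layout of a W+ code, which is consistent with the abstract's claim that existing inversion and editing methods carry over with only negligible modifications.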