Face swapping has gained significant traction, driven by rapid advances in deep-learning-based human face synthesis. Recent face-swapping methods based on Generative Adversarial Nets (GANs) often suffer from blending inconsistency, distortions and artefacts, as well as training instability. In this work, we propose a novel end-to-end framework for high-fidelity face swapping that leverages the photorealistic face generation capability of StyleGAN. First, we invert facial images into the style latent space by proposing a novel facial-attributes encoder that extracts face essentials from an image and projects them to a style code in the latent space. We show that this inverted style code encapsulates facial attributes indispensable to the face-swapping task. Second, a carefully designed style blending module (SBM) transfers the identity from a source image to the target via a multi-head attention (MHA) mechanism. We propose constraints to guide the learning of the SBM, leading to effective blending of the identity of the source face into the target image. Finally, the blended style code is translated back to the image space by the style decoder, benefiting from the training stability and high generative quality of the style-based decoder. Extensive experiments demonstrate the superior quality of the face synthesis results (illustrated in Figure 1) of our face-swapping system compared with other state-of-the-art methods.
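The abstract describes the SBM as transferring source identity into the target style code via multi-head attention. The paper's actual architecture and learned weights are not given here, so the following is only a minimal sketch of that idea under stated assumptions: an 18 x 512 W+-style code per face, random matrices standing in for learned query/key/value projections, and a residual connection as one plausible way to retain the target's non-identity attributes. The function name `multi_head_blend` and all parameters are hypothetical.

```python
import numpy as np

def multi_head_blend(target_style, source_style, num_heads=4, seed=0):
    """Hypothetical sketch of a style blending module (SBM): the target
    style code queries identity information from the source style code
    via multi-head attention. Projections are random stand-ins for
    learned weights, not the paper's actual parameters."""
    n_layers, dim = target_style.shape  # assume an 18 x 512 W+ style code
    assert dim % num_heads == 0
    d_h = dim // num_heads
    rng = np.random.default_rng(seed)
    # Random projections stand in for learned W_q, W_k, W_v.
    Wq, Wk, Wv = (rng.standard_normal((dim, dim)) / np.sqrt(dim)
                  for _ in range(3))
    q = target_style @ Wq   # queries from the target (attributes to keep)
    k = source_style @ Wk   # keys/values from the source (identity to inject)
    v = source_style @ Wv
    blended = np.empty_like(target_style)
    for h in range(num_heads):
        s = slice(h * d_h, (h + 1) * d_h)
        scores = q[:, s] @ k[:, s].T / np.sqrt(d_h)
        scores -= scores.max(axis=1, keepdims=True)  # numerical stability
        attn = np.exp(scores)
        attn /= attn.sum(axis=1, keepdims=True)      # row-wise softmax
        blended[:, s] = attn @ v[:, s]
    # Residual connection: one plausible way to preserve target attributes
    # (pose, expression) while mixing in source identity.
    return target_style + blended

target = np.random.default_rng(1).standard_normal((18, 512))
source = np.random.default_rng(2).standard_normal((18, 512))
out = multi_head_blend(target, source)
print(out.shape)  # (18, 512)
```

The blended code would then be fed to a StyleGAN-style decoder to produce the swapped face; the attention pattern here is per-style-layer, which is an assumption rather than the paper's stated design.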
Title: High-Fidelity Face Swapping with Style Blending
Authors: Xinyu Yang, Zhijin Guo, Mowen Xue, Zijian Shi
Venue: 24th Irish Machine Vision and Image Processing Conference
Published: 2022-08-31
DOI: https://doi.org/10.56541/kupa8487