FW-GAN: Flow-Navigated Warping GAN for Video Virtual Try-On

2019 IEEE/CVF International Conference on Computer Vision (ICCV). Pub Date: 2019-10-01. DOI: 10.1109/ICCV.2019.00125

Haoye Dong, Xiaodan Liang, Xiaohui Shen, Bowen Wu, Bing-cheng Chen, Jian Yin

Pages: 1161-1170

Citations: 72

Abstract

Image-based virtual try-on systems have attracted increasing attention. We go a step further and develop a video virtual try-on system that precisely transfers clothes onto a person and generates visually realistic videos conditioned on arbitrary poses. Besides the challenges of image-based virtual try-on (e.g., clothes fidelity, image synthesis), video virtual try-on also requires spatiotemporal consistency. Directly adopting existing image-based approaches often fails to generate coherent video with natural and realistic textures. In this work, we propose the Flow-navigated Warping Generative Adversarial Network (FW-GAN), a novel framework that learns to synthesize a virtual try-on video from a person image, the desired clothes image, and a series of target poses. FW-GAN aims to synthesize coherent and natural video while manipulating pose and clothes. It consists of: (i) a flow-guided fusion module that warps past frames to assist synthesis, and which is also adopted in the discriminator to help enhance the coherence and quality of the synthesized video; (ii) a warping net designed to warp the clothes image to refine clothes textures; and (iii) a parsing constraint loss that alleviates the problem caused by misaligned segmentation maps from images with different poses and various clothes. Experiments on our newly collected dataset show that FW-GAN can synthesize high-quality virtual try-on video and significantly outperforms other methods both qualitatively and quantitatively.
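
The abstract does not give implementation details for the flow-guided fusion module, so the following is only a generic sketch of the underlying idea: warp a past frame with a dense flow field, then blend it with newly synthesized content. The function names `warp_frame` and `fuse`, the backward-warping convention, the border clamping, and the per-pixel blending mask are all assumptions made for illustration, not the authors' method.

```python
import math

def warp_frame(prev, flow):
    """Backward-warp a grayscale frame with a dense flow field.

    prev: H x W nested list of floats (a past frame).
    flow: H x W nested list of (dx, dy) offsets. Output pixel (x, y)
          samples prev at (x + dx, y + dy) using bilinear interpolation.
    Sample coordinates are clamped to the image border (an assumption).
    """
    h, w = len(prev), len(prev[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Clamp the sampling position so it stays inside the frame.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            ax, ay = sx - x0, sy - y0
            # Bilinear interpolation over the four neighbouring pixels.
            top = (1 - ax) * prev[y0][x0] + ax * prev[y0][x1]
            bot = (1 - ax) * prev[y1][x0] + ax * prev[y1][x1]
            out[y][x] = (1 - ay) * top + ay * bot
    return out

def fuse(warped, generated, mask):
    """Blend per pixel: mask=1 keeps the warped past frame,
    mask=0 keeps the newly generated content (hypothetical fusion rule)."""
    return [[m * a + (1 - m) * b for a, b, m in zip(ra, rb, rm)]
            for ra, rb, rm in zip(warped, generated, mask)]

# Usage: shift a tiny 2x3 frame by one pixel, then blend with new content.
prev = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
flow = [[(1.0, 0.0)] * 3 for _ in range(2)]
warped = warp_frame(prev, flow)
generated = [[9.0] * 3 for _ in range(2)]
mask = [[1.0, 1.0, 0.0] for _ in range(2)]  # trust the warp except in the last column
fused = fuse(warped, generated, mask)
```

In a learned system the flow and the mask would presumably be predicted by a network rather than supplied by hand; here they are fixed inputs so that the warping and blending arithmetic is easy to follow.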


