Face animation based on multiple sources and perspective alignment

Virtual Reality Intelligent Hardware (JCR Q1, Computer Science) | Pub Date: 2024-06-01 | Epub Date: 2024-06-27 | DOI: 10.1016/j.vrih.2024.04.002

Yuanzong Mei, Wenyi Wang, Xi Liu, Wei Yong, Weijie Wu, Yifan Zhu, Shuai Wang, Jianwen Chen

Volume 6, Issue 3, Pages 252-266. Full text: https://www.sciencedirect.com/science/article/pii/S2096579624000202

Abstract

Background

Face image animation generates a synthetic human face video that combines the identity taken from a source image with the facial motion taken from a driving video. This technology could be beneficial in multiple medical fields, such as diagnosis and privacy protection. Previous studies on face animation often relied on a single source image to generate the output video. When the pose of the source image differs significantly from that of the driving frame, the quality of the generated video is likely to be suboptimal because the source image may not provide sufficient features for the warped feature map.

Methods

In this study, we propose a novel face-animation scheme based on multiple sources and perspective alignment to address these issues. We first introduce a multiple-source sampling and selection module to screen the optimal source image set from the provided driving video. We then propose an inter-frame interpolation and alignment module to further eliminate the misalignment between the selected source image and the driving frame.
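
The abstract does not state how source images are sampled or selected, so the following is only an illustrative sketch of one plausible approach, not the authors' module. It assumes each candidate frame already has an estimated head pose, here a hypothetical `(yaw, pitch, roll)` tuple in degrees. It picks a pose-diverse source set by greedy farthest-point sampling, then, for a given driving frame, looks up the selected source with the closest pose. The names `pose_distance`, `select_sources`, and `nearest_source` are invented for this example.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical pose representation: (yaw, pitch, roll) in degrees.
Pose = Tuple[float, float, float]


def pose_distance(a: Pose, b: Pose) -> float:
    """Euclidean distance between two poses (an assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_sources(poses: Sequence[Pose], k: int) -> List[int]:
    """Greedy farthest-point sampling: return up to k frame indices
    whose poses are spread as far apart as possible."""
    if not poses or k <= 0:
        return []
    chosen = [0]  # start from the first frame (arbitrary choice)
    while len(chosen) < min(k, len(poses)):
        # Pick the frame whose nearest already-chosen pose is farthest away.
        best = max(
            (i for i in range(len(poses)) if i not in chosen),
            key=lambda i: min(pose_distance(poses[i], poses[j]) for j in chosen),
        )
        chosen.append(best)
    return chosen


def nearest_source(driving_pose: Pose, poses: Sequence[Pose], chosen: Sequence[int]) -> int:
    """Return the index of the selected source whose pose is closest to the driving pose."""
    return min(chosen, key=lambda i: pose_distance(driving_pose, poses[i]))


if __name__ == "__main__":
    frame_poses = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (40.0, 0.0, 0.0), (-40.0, 0.0, 0.0)]
    sources = select_sources(frame_poses, k=3)
    print("selected source frames:", sources)
    print("source for a 30-degree yaw driving frame:",
          nearest_source((30.0, 0.0, 0.0), frame_poses, sources))
```

Choosing the nearest-pose source keeps the pose gap small between source and driving frame, which is the situation the abstract identifies as problematic for single-source methods. Any residual gap would still need the interpolation and alignment step described above, which this sketch does not attempt.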

Conclusions

Compared with other state-of-the-art face animation methods, the proposed method achieves better objective metrics and visual quality in large-angle animation scenes. This indicates that the method is effective in addressing the distortion issues in large-angle animation.
