Disentangled Lifespan Synthesis via Transformer-Based Nonlinear Regression

Computer Graphics Forum, vol. 43, no. 7. Pub Date: 2024-10-24. DOI: 10.1111/cgf.15229. Impact factor: 2.7. JCR: Q2 (Computer Science, Software Engineering). Zone 4 (Computer Science).

Mingyuan Li, Yingchun Guo



Abstract

Lifespan face age transformation aims to generate facial images that accurately depict an individual's appearance at different ages. The task is challenging because facial features must change plausibly while identity characteristics are preserved. Existing methods often produce unsatisfactory results, such as entangled facial attributes and poor identity preservation, especially across large age gaps. Furthermore, over-manipulating the style vector may push it out of the latent space and degrade image quality. To address these issues, this paper introduces a novel nonlinear regression model, Disentangled Lifespan face Aging (DL-Aging), for high-quality age transformation. Specifically, we propose an age modulation encoder that extracts age-related multi-scale facial features as the key and value, and we use the reconstructed style vector of the image as the query. Multi-head cross-attention in the W⁺ space iteratively updates the query for aging image reconstruction. This nonlinear transformation enables the model to learn a more disentangled mode of transformation, which is crucial for alleviating facial attribute entanglement. Additionally, we introduce a W⁺ space age regularization term that prevents excessive manipulation of the style vector and keeps it within the W⁺ space during transformation, thereby improving generation quality and aging accuracy. Extensive qualitative and quantitative experiments demonstrate that the proposed DL-Aging outperforms state-of-the-art methods in aging accuracy, image quality, attribute disentanglement, and identity preservation, especially for large age gaps.
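The query/key/value arrangement described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`cross_attention_update`, `wplus_penalty`), the random projection weights, the residual update, the toy sizes (18 style layers of width 64, three feature scales), and the mean-squared form of the penalty are all assumptions made for illustration. The abstract does not give the actual architecture or the exact form of the regularization term.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(dim, rng):
    # Hypothetical projection weights; a trained model would learn these.
    scale = 1.0 / np.sqrt(dim)
    return {name: rng.normal(0.0, scale, size=(dim, dim))
            for name in ("Wq", "Wk", "Wv", "Wo")}

def cross_attention_update(w, feats, params, num_heads):
    """Update style vectors w (layers, dim) by attending to feats (tokens, dim).

    The style vectors act as the query; the age-related features act as
    key and value. The residual connection is an assumption of this sketch.
    """
    layers, dim = w.shape
    tokens = feats.shape[0]
    head_dim = dim // num_heads

    def split_heads(x, n):
        # (n, dim) -> (heads, n, head_dim)
        return x.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q = split_heads(w @ params["Wq"], layers)
    k = split_heads(feats @ params["Wk"], tokens)
    v = split_heads(feats @ params["Wv"], tokens)

    # Scaled dot-product attention per head: (heads, layers, tokens).
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(head_dim), axis=-1)
    merged = (attn @ v).transpose(1, 0, 2).reshape(layers, dim)
    return w + merged @ params["Wo"], attn

def wplus_penalty(w_edit, w_rec):
    # Assumed stand-in for the regularizer: mean squared deviation of the
    # edited style vector from the reconstructed one.
    return float(np.mean((w_edit - w_rec) ** 2))

rng = np.random.default_rng(0)
dim, num_layers, num_heads = 64, 18, 4               # toy sizes (assumed)
w_rec = rng.normal(size=(num_layers, dim))           # reconstructed style vector
age_feats = [rng.normal(size=(n, dim)) for n in (16, 64, 256)]  # multi-scale
blocks = [init_params(dim, rng) for _ in age_feats]

# Iteratively refine the query, one feature scale at a time.
w = w_rec
for feats, params in zip(age_feats, blocks):
    w, attn = cross_attention_update(w, feats, params, num_heads)

penalty = wplus_penalty(w, w_rec)
print(w.shape, attn.shape)
```

In a training setup, a penalty of this kind would be added to the loss with a weight, so that larger edits to the style vector cost more. How DL-Aging balances this against the aging objective is not stated in the abstract.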



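Source journal

Computer Graphics Forum (Engineering and Technology; Computer Science: Software Engineering)

CiteScore: 5.80

Self-citation rate: 12.00%

Articles published: 175

Review time: 3-6 weeks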

About the journal: Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.
