{"title":"通过基于变压器的非线性回归进行分解寿命合成","authors":"Mingyuan Li, Yingchun Guo","doi":"10.1111/cgf.15229","DOIUrl":null,"url":null,"abstract":"<p>Lifespan face age transformation aims to generate facial images that accurately depict an individual's appearance at different age stages. This task is highly challenging due to the need for reasonable changes in facial features while preserving identity characteristics. Existing methods tend to synthesize unsatisfactory results, such as entangled facial attributes and low identity preservation, especially when dealing with large age gaps. Furthermore, over-manipulating the style vector may deviate it from the latent space and damage image quality. To address these issues, this paper introduces a novel nonlinear regression model-<b>D</b>isentangled <b>L</b>ifespan face <b>Aging</b> (DL-Aging) to achieve high-quality age transformation images. Specifically, we propose an age modulation encoder to extract age-related multi-scale facial features as key and value, and use the reconstructed style vector of the image as the query. The multi-head cross-attention in the W<sup>+</sup> space is utilized to update the query for aging image reconstruction iteratively. This nonlinear transformation enables the model to learn a more disentangled mode of transformation, which is crucial for alleviating facial attribute entanglement. Additionally, we introduce a W<sup>+</sup> space age regularization term to prevent excessive manipulation of the style vector and ensure it remains within the W<sup>+</sup> space during transformation, thereby improving generation quality and aging accuracy. Extensive qualitative and quantitative experiments demonstrate that the proposed DL-Aging outperforms state-of-the-art methods regarding aging accuracy, image quality, attribute disentanglement, and identity preservation, especially for large age gaps.</p>","PeriodicalId":10687,"journal":{"name":"Computer Graphics Forum","volume":"43 7","pages":""},"PeriodicalIF":2.7000,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Disentangled Lifespan Synthesis via Transformer-Based Nonlinear Regression\",\"authors\":\"Mingyuan Li, Yingchun Guo\",\"doi\":\"10.1111/cgf.15229\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Lifespan face age transformation aims to generate facial images that accurately depict an individual's appearance at different age stages. This task is highly challenging due to the need for reasonable changes in facial features while preserving identity characteristics. Existing methods tend to synthesize unsatisfactory results, such as entangled facial attributes and low identity preservation, especially when dealing with large age gaps. Furthermore, over-manipulating the style vector may deviate it from the latent space and damage image quality. To address these issues, this paper introduces a novel nonlinear regression model-<b>D</b>isentangled <b>L</b>ifespan face <b>Aging</b> (DL-Aging) to achieve high-quality age transformation images. Specifically, we propose an age modulation encoder to extract age-related multi-scale facial features as key and value, and use the reconstructed style vector of the image as the query. The multi-head cross-attention in the W<sup>+</sup> space is utilized to update the query for aging image reconstruction iteratively. This nonlinear transformation enables the model to learn a more disentangled mode of transformation, which is crucial for alleviating facial attribute entanglement. Additionally, we introduce a W<sup>+</sup> space age regularization term to prevent excessive manipulation of the style vector and ensure it remains within the W<sup>+</sup> space during transformation, thereby improving generation quality and aging accuracy. Extensive qualitative and quantitative experiments demonstrate that the proposed DL-Aging outperforms state-of-the-art methods regarding aging accuracy, image quality, attribute disentanglement, and identity preservation, especially for large age gaps.</p>\",\"PeriodicalId\":10687,\"journal\":{\"name\":\"Computer Graphics Forum\",\"volume\":\"43 7\",\"pages\":\"\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2024-10-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer Graphics Forum\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1111/cgf.15229\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Graphics Forum","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/cgf.15229","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Disentangled Lifespan Synthesis via Transformer-Based Nonlinear Regression
Lifespan face age transformation aims to generate facial images that accurately depict an individual's appearance at different age stages. This task is highly challenging due to the need for reasonable changes in facial features while preserving identity characteristics. Existing methods tend to synthesize unsatisfactory results, such as entangled facial attributes and low identity preservation, especially when dealing with large age gaps. Furthermore, over-manipulating the style vector may deviate it from the latent space and damage image quality. To address these issues, this paper introduces a novel nonlinear regression model-Disentangled Lifespan face Aging (DL-Aging) to achieve high-quality age transformation images. Specifically, we propose an age modulation encoder to extract age-related multi-scale facial features as key and value, and use the reconstructed style vector of the image as the query. The multi-head cross-attention in the W+ space is utilized to update the query for aging image reconstruction iteratively. This nonlinear transformation enables the model to learn a more disentangled mode of transformation, which is crucial for alleviating facial attribute entanglement. Additionally, we introduce a W+ space age regularization term to prevent excessive manipulation of the style vector and ensure it remains within the W+ space during transformation, thereby improving generation quality and aging accuracy. Extensive qualitative and quantitative experiments demonstrate that the proposed DL-Aging outperforms state-of-the-art methods regarding aging accuracy, image quality, attribute disentanglement, and identity preservation, especially for large age gaps.
期刊介绍:
Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.