PR3D: Precise and realistic 3D face reconstruction from a single image
Authors: Zhangjin Huang, Xing Wu
Journal: Computer Animation and Virtual Worlds, vol. 35, no. 3
Published: 2024-05-30
DOI: 10.1002/cav.2254 (https://onlinelibrary.wiley.com/doi/10.1002/cav.2254)
Reconstructing the three-dimensional (3D) shape and texture of the face from a single image is a significant and challenging task in computer vision and graphics. In recent years, learning-based reconstruction methods have exhibited outstanding performance, but their effectiveness is severely constrained by the scarcity of available training data with 3D annotations. To address this issue, we present the PR3D (Precise and Realistic 3D face reconstruction) method, which consists of high-precision shape reconstruction based on semi-supervised learning and high-fidelity texture reconstruction based on StyleGAN2. In shape reconstruction, we use in-the-wild face images and 3D annotated datasets to train the auxiliary encoder and the identity encoder, encoding the input image into parameters of FLAME (a parametric 3D face model). Simultaneously, a novel semi-supervised hybrid landmark loss is designed to more effectively learn from in-the-wild face images and 3D annotated datasets. Furthermore, to meet the real-time requirements in practical applications, a lightweight shape reconstruction model called fast-PR3D is distilled through teacher–student learning. In texture reconstruction, we propose a texture extraction method based on face reenactment in StyleGAN2 style space, extracting texture from the source and reenacted face images to constitute a facial texture map. Extensive experiments have demonstrated the state-of-the-art performance of our method.
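The abstract does not give the form of the semi-supervised hybrid landmark loss, so the following is only an illustrative sketch of the general idea it describes: every sample contributes a 2D landmark term, and samples that also carry 3D annotations contribute an additional 3D term. The function names, the orthographic projection, the mean Euclidean distance, and the weights `w_2d` and `w_3d` are all assumptions made for this example and are not taken from the paper.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


def project_orthographic(points: Sequence[Point3], scale: float = 1.0) -> List[Point2]:
    """Drop z and apply a uniform scale (an assumed, simplified camera model)."""
    return [(scale * x, scale * y) for x, y, _ in points]


def mean_distance(pred: Sequence[Sequence[float]], target: Sequence[Sequence[float]]) -> float:
    """Mean Euclidean distance between corresponding landmarks."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("landmark sets must be non-empty and of equal length")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


def hybrid_landmark_loss(
    pred_3d: Sequence[Point3],
    gt_2d: Sequence[Point2],
    gt_3d: Optional[Sequence[Point3]] = None,
    w_2d: float = 1.0,
    w_3d: float = 1.0,
) -> float:
    """Hypothetical hybrid loss: 2D term always, 3D term only when annotated."""
    # In-the-wild images only have 2D landmarks, so compare after projection.
    loss = w_2d * mean_distance(project_orthographic(pred_3d), gt_2d)
    # Samples from a 3D annotated dataset add direct 3D supervision.
    if gt_3d is not None:
        loss += w_3d * mean_distance(pred_3d, gt_3d)
    return loss
```

With this shape, a mixed training batch can be handled by one function: in-the-wild samples pass `gt_3d=None`, and 3D annotated samples pass both sets of landmarks.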
About the journal:
With the advent of very powerful PCs and high-end graphics cards, there has been incredible development in Virtual Worlds, real-time computer animation and simulation, and games. At the same time, new and cheaper Virtual Reality devices have appeared, allowing interaction with these real-time Virtual Worlds and even with real worlds through Augmented Reality. Three-dimensional characters, especially Virtual Humans, are now of exceptional quality, which allows them to be used in the movie industry. But this is only the beginning: with the development of Artificial Intelligence and Agent technology, these characters will become more and more autonomous and even intelligent. They will inhabit the Virtual Worlds in a Virtual Life together with animals and plants.