Enhancing eyeglasses removal in facial images: a novel approach using translation models for eyeglasses mask completion
Authors: Zahra Esmaily, Hossein Ebrahimpour-Komleh
Journal: Multimedia Tools and Applications
Published: 2024-09-11
DOI: https://doi.org/10.1007/s11042-024-20101-5
Accurately removing eyeglasses from facial images is crucial for improving the performance of face-related tasks such as verification, identification, and reconstruction. This paper presents a novel approach to enhancing eyeglasses removal by integrating a mask completion technique into the existing framework. Our method focuses on improving the accuracy of eyeglasses masks, which is essential for the subsequent eyeglasses and shadow removal steps. We introduce a dataset designed specifically for eyeglasses mask completion. It is generated by applying Top-Hat morphological operations to existing eyeglasses mask datasets, creating a collection of images containing eyeglasses masks in two states: damaged (incomplete) and complete (ground truth). A Pix2Pix image-to-image translation model is trained on this dataset to restore incomplete eyeglasses mask predictions. This restoration step significantly improves the accuracy of eyeglasses frame extraction and leads to more realistic results in the subsequent eyeglasses and shadow removal. A post-processing step then refines the completed mask, preventing artifacts from forming in the background or outside the eyeglasses frame box and further improving the quality of the processed image. Experimental results on the CelebA, FFHQ, and MeGlass datasets show that our method outperforms state-of-the-art approaches on quantitative metrics (FID, KID, MOS) and in qualitative evaluations.
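The abstract does not give implementation details, so the following is only a minimal sketch of two of the steps it names: deriving a damaged mask from a complete one with a white top-hat (the mask minus its morphological opening), and clipping a completed mask to a frame box. The function names, the 3x3 square structuring element, the toy mask, and the way the box is obtained are all assumptions made for illustration, not the authors' procedure. It uses pure Python on small 0/1 grids; in practice a library routine such as OpenCV's `cv2.morphologyEx` with `cv2.MORPH_TOPHAT` would be the usual choice.

```python
# Illustrative sketch only; every name below is hypothetical, not from the paper.
# Masks are lists of rows holding 0 (background) or 1 (eyeglasses).

def _morph(mask, k, op):
    """Slide a k x k square window over the mask, reducing with op.

    op=min gives erosion, op=max gives dilation. Pixels outside the
    image are simply ignored.
    """
    h, w, r = len(mask), len(mask[0]), k // 2
    return [
        [
            op(
                mask[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def white_tophat(mask, k=3):
    """White top-hat: the mask minus its opening (erosion, then dilation)."""
    opened = _morph(_morph(mask, k, min), k, max)
    return [[m - o for m, o in zip(m_row, o_row)]
            for m_row, o_row in zip(mask, opened)]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the nonzero pixels, inclusive."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return min(ys), min(xs), max(ys), max(xs)


def clip_to_box(mask, box):
    """Zero every pixel that falls outside the given box."""
    top, left, bottom, right = box
    return [[v if top <= y <= bottom and left <= x <= right else 0
             for x, v in enumerate(row)]
            for y, row in enumerate(mask)]


# Toy "complete" mask: a thick 3x3 blob joined to a one-pixel-wide bar.
complete = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

# Assumed reading of the dataset step: the top-hat output serves as the
# damaged mask. Only the thin bar survives; the thick blob is removed.
damaged = white_tophat(complete, k=3)

# Assumed reading of the post-processing step: a completed mask with a stray
# pixel is clipped to a frame box (taken here from the toy ground truth).
completed = [row[:] for row in complete]
completed[4][8] = 1  # simulated artifact outside the frame box
clipped = clip_to_box(completed, bounding_box(complete))
```

Under these assumptions, each (`damaged`, `complete`) pair would play the role of one input/target example for the image-to-image translation model, and `clip_to_box` stands in for whatever rule the authors use to suppress pixels outside the eyeglasses frame box.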
About the journal:
Multimedia Tools and Applications publishes original research articles on multimedia development and system support tools as well as case studies of multimedia applications. It also features experimental and survey articles. The journal is intended for academics, practitioners, scientists and engineers who are involved in multimedia system research, design and applications. All papers are peer reviewed.
Specific areas of interest include:
- Multimedia Tools:
- Multimedia Applications:
- Prototype multimedia systems and platforms