文本引导下的屏蔽人脸图像合成

IF 5.2 3区 计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS ACM Transactions on Multimedia Computing Communications and Applications Pub Date : 2024-03-30 DOI:10.1145/3654667
Anjali T, Masilamani V
{"title":"文本引导下的屏蔽人脸图像合成","authors":"Anjali T, Masilamani V","doi":"10.1145/3654667","DOIUrl":null,"url":null,"abstract":"<p>The COVID-19 pandemic has made us all understand that wearing a face mask protects us from the spread of respiratory viruses. The face authentication systems, which are trained on the basis of facial key points such as the eyes, nose, and mouth, found it difficult to identify the person when the majority of the face is covered by the face mask. Removing the mask for authentication will cause the infection to spread. The possible solutions are: (a) to train the face recognition systems to identify the person with the upper face features (b) Reconstruct the complete face of the person with a generative model. (c) train the model with a dataset of the masked faces of the people. In this paper, we explore the scope of generative models for image synthesis. We used stable diffusion to generate masked face images of popular celebrities on various text prompts. A realistic dataset of 15K masked face images of 100 celebrities is generated and is called the Realistic Synthetic Masked Face Dataset (RSMFD). The model and the generated dataset will be made public so that researchers can augment the dataset. According to our knowledge, this is the largest masked face recognition dataset with realistic images. The generated images were tested on popular deep face recognition models and achieved significant results. The dataset is also trained and tested on some of the famous image classification models, and the results are competitive. The dataset is available on this link:- https://drive.google.com/drive/folders/1yetcgUOL1TOP4rod1geGsOkIrIJHtcEw?usp=sharing\n</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":"1 1","pages":""},"PeriodicalIF":5.2000,"publicationDate":"2024-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Text-Guided Synthesis of Masked Face Images\",\"authors\":\"Anjali T, Masilamani V\",\"doi\":\"10.1145/3654667\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>The COVID-19 pandemic has made us all understand that wearing a face mask protects us from the spread of respiratory viruses. The face authentication systems, which are trained on the basis of facial key points such as the eyes, nose, and mouth, found it difficult to identify the person when the majority of the face is covered by the face mask. Removing the mask for authentication will cause the infection to spread. The possible solutions are: (a) to train the face recognition systems to identify the person with the upper face features (b) Reconstruct the complete face of the person with a generative model. (c) train the model with a dataset of the masked faces of the people. In this paper, we explore the scope of generative models for image synthesis. We used stable diffusion to generate masked face images of popular celebrities on various text prompts. A realistic dataset of 15K masked face images of 100 celebrities is generated and is called the Realistic Synthetic Masked Face Dataset (RSMFD). The model and the generated dataset will be made public so that researchers can augment the dataset. According to our knowledge, this is the largest masked face recognition dataset with realistic images. The generated images were tested on popular deep face recognition models and achieved significant results. The dataset is also trained and tested on some of the famous image classification models, and the results are competitive. The dataset is available on this link:- https://drive.google.com/drive/folders/1yetcgUOL1TOP4rod1geGsOkIrIJHtcEw?usp=sharing\\n</p>\",\"PeriodicalId\":50937,\"journal\":{\"name\":\"ACM Transactions on Multimedia Computing Communications and Applications\",\"volume\":\"1 1\",\"pages\":\"\"},\"PeriodicalIF\":5.2000,\"publicationDate\":\"2024-03-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Multimedia Computing Communications and Applications\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/3654667\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Multimedia Computing Communications and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3654667","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

摘要

COVID-19 大流行让我们明白,戴口罩可以防止呼吸道病毒传播。人脸识别系统是根据眼睛、鼻子和嘴巴等面部关键部位进行训练的,但当面部大部分被口罩遮住时,系统就很难识别人的身份。摘下面具进行身份验证会导致感染扩散。可能的解决办法有(a) 对人脸识别系统进行训练,使其能够根据人脸上部特征识别人的身份 (b) 利用生成模型重建人的完整面部。(c) 使用蒙面人脸数据集训练模型。在本文中,我们探索了生成模型在图像合成中的应用范围。我们利用稳定扩散生成了各种文本提示下的流行名人的面具人脸图像。生成的真实数据集包含 100 位名人的 15K 张面具人脸图像,被称为真实合成面具人脸数据集(RSMFD)。模型和生成的数据集将公开,以便研究人员扩充数据集。据我们所知,这是最大的具有真实图像的面具人脸识别数据集。生成的图像在流行的深度人脸识别模型上进行了测试,取得了显著的效果。该数据集还在一些著名的图像分类模型上进行了训练和测试,结果也很有竞争力。该数据集可从以下链接获取:- https://drive.google.com/drive/folders/1yetcgUOL1TOP4rod1geGsOkIrIJHtcEw?usp=sharing
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Text-Guided Synthesis of Masked Face Images

The COVID-19 pandemic has made us all understand that wearing a face mask protects us from the spread of respiratory viruses. The face authentication systems, which are trained on the basis of facial key points such as the eyes, nose, and mouth, found it difficult to identify the person when the majority of the face is covered by the face mask. Removing the mask for authentication will cause the infection to spread. The possible solutions are: (a) to train the face recognition systems to identify the person with the upper face features (b) Reconstruct the complete face of the person with a generative model. (c) train the model with a dataset of the masked faces of the people. In this paper, we explore the scope of generative models for image synthesis. We used stable diffusion to generate masked face images of popular celebrities on various text prompts. A realistic dataset of 15K masked face images of 100 celebrities is generated and is called the Realistic Synthetic Masked Face Dataset (RSMFD). The model and the generated dataset will be made public so that researchers can augment the dataset. According to our knowledge, this is the largest masked face recognition dataset with realistic images. The generated images were tested on popular deep face recognition models and achieved significant results. The dataset is also trained and tested on some of the famous image classification models, and the results are competitive. The dataset is available on this link:- https://drive.google.com/drive/folders/1yetcgUOL1TOP4rod1geGsOkIrIJHtcEw?usp=sharing

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
8.50
自引率
5.90%
发文量
285
审稿时长
7.5 months
期刊介绍: The ACM Transactions on Multimedia Computing, Communications, and Applications is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It is soliciting paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome. TOMM is a peer-reviewed, archival journal, available in both print form and digital form. The Journal is published quarterly; with roughly 7 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure a timely publication. The transactions consists primarily of research papers. This is an archival journal and it is intended that the papers will have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.
期刊最新文献
TA-Detector: A GNN-based Anomaly Detector via Trust Relationship KF-VTON: Keypoints-Driven Flow Based Virtual Try-On Network Unified View Empirical Study for Large Pretrained Model on Cross-Domain Few-Shot Learning Multimodal Fusion for Talking Face Generation Utilizing Speech-related Facial Action Units Compressed Point Cloud Quality Index by Combining Global Appearance and Local Details
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1