Generating and Improving a Dataset of Masked Faces Using Data Augmentation

Journal of Biomolecular Techniques (Q4, Biochemistry, Genetics and Molecular Biology) | Pub Date: 2023-06-10 | DOI: 10.51173/jt.v5i2.1140
Waleed Ayad, Siraj Qays, Ali Al-Naji

Abstract

Before the spread of COVID-19 in 2020, modern face recognition systems performed excellently. Countries then required their populations to wear masks, which noticeably reduced the discriminative ability of those systems: they had been trained on large-scale datasets of unmasked faces, and no large-scale masked-face datasets were available at the time. To help address this shortage, this work presents a two-step method for building a dataset of people wearing masks. The first step detects, aligns, and crops the faces in a dataset of unmasked faces and then overlays simulated masks on them using the dlib-ml library. This step was used to generate a masked-face dataset (CASIA-mask). The second step applies five data augmentation techniques to the generated dataset. To evaluate the masked dataset and the data augmentation, the FaceNet face recognition system was trained on the masked dataset and achieved an accuracy of 96.4%. The same system achieved 97.71% when trained on CASIA-mask together with the augmented data.
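The two steps described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the landmark indices assume the common 68-point layout used by dlib's shape predictor (0-16 for the jawline, 27-30 for the nose bridge), the choice of jaw points 1-15 plus point 29 as the mask outline is an assumption, and horizontal flipping stands in for the five augmentation techniques, which the abstract does not name. All function names and the synthetic landmarks are hypothetical.

```python
# Hypothetical sketch: none of these names or parameters come from the paper.

# Indices follow the common 68-point landmark convention:
# 0-16 trace the jawline, 27-30 the nose bridge.
JAW_IDS = range(1, 16)   # jawline without its two topmost points (assumed)
NOSE_BRIDGE_ID = 29      # assumed upper edge of the simulated mask


def mask_polygon(landmarks):
    """Return the mask outline: jawline points closed over the nose bridge."""
    return [landmarks[i] for i in JAW_IDS] + [landmarks[NOSE_BRIDGE_ID]]


def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test for a point against a closed polygon."""
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def apply_mask(image, landmarks, value=255):
    """Return a copy of a grayscale image (list of rows) with the mask filled in."""
    polygon = mask_polygon(landmarks)
    return [
        [value if point_in_polygon(c + 0.5, r + 0.5, polygon) else pixel
         for c, pixel in enumerate(row)]
        for r, row in enumerate(image)
    ]


def horizontal_flip(image):
    """One example augmentation (assumed): mirror the image left to right."""
    return [row[::-1] for row in image]


def toy_landmarks():
    """Build 68 synthetic points for a 20x20 image; only jaw and nose bridge are used."""
    points = [(0, 0)] * 68
    for i in range(17):
        points[i] = (2 + i, 17 - abs(i - 8))  # V-shaped stand-in for a jawline
    points[NOSE_BRIDGE_ID] = (10, 8)
    return points


face = [[0] * 20 for _ in range(20)]          # blank stand-in for a cropped face
masked = apply_mask(face, toy_landmarks())    # step 1: overlay the simulated mask
flipped = horizontal_flip(masked)             # step 2: one augmentation
```

In a real pipeline the landmarks would come from a face detector and landmark predictor run on each aligned, cropped face, and the filled region would typically be a textured mask image rather than a constant value.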