Authors: Chuying Guan, Jiaxuan Jiang, Zhong Wang
Journal: Journal of Real-Time Image Processing
Published: 2024-05-19
DOI: 10.1007/s11554-024-01476-y (https://doi.org/10.1007/s11554-024-01476-y)
Fast detection of face masks in public places using QARepVGG-YOLOv7
The COVID-19 pandemic has caused substantial global losses. In the post-epidemic era, public health guidance still recommends the correct use of medical masks in confined spaces such as hospitals and other indoor settings. Correct mask use can effectively block the droplet transmission of infectious diseases, protect personal and public health, and improve the environmental sustainability and social resilience of cities. Detecting whether masks are worn correctly is therefore crucial. This study proposes a three-class mask detection model based on the QARepVGG-YOLOv7 algorithm. The model replaces the convolution module in the backbone network with the QARepVGG module and exploits that module's quantization-friendly structure and re-parameterization properties to achieve high-precision, high-efficiency object detection. To validate the proposed method, we created a mask dataset of 5095 images in three categories: masks worn correctly, masks worn incorrectly, and no mask. We also employed data augmentation to further balance the dataset categories. We tested the YOLOv5s, YOLOv6, YOLOv7, and YOLOv8s models on our self-built dataset. The results show that the QARepVGG-YOLOv7 model achieves the best accuracy among the compared state-of-the-art YOLO models. Our model achieves an mAP of 0.946 and a speed of 263.2 FPS, which is 90.8 FPS faster than YOLOv7 with a 0.5% higher mAP. It is a high-precision and high-efficiency mask detection model.
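The re-parameterization property mentioned in the abstract can be illustrated with a minimal sketch. It assumes a generic RepVGG-style block with three parallel branches (3x3 convolution, 1x1 convolution, identity) and no batch normalization. It is not the paper's exact QARepVGG block, and the helper names `conv2d_same` and `merge_branches` are hypothetical. The point is only that the three training-time branches can be folded into one 3x3 kernel for inference without changing the output.

```python
import random

def conv2d_same(x, w):
    """Naive stride-1, zero-padded convolution (cross-correlation).
    x[c][i][j] is the input; w[o][c][a][b] is a k x k kernel with odd k."""
    k = len(w[0][0])
    p = k // 2
    h, wd = len(x[0]), len(x[0][0])
    out = [[[0.0] * wd for _ in range(h)] for _ in w]
    for o, w_o in enumerate(w):
        for c, w_oc in enumerate(w_o):
            for i in range(h):
                for j in range(wd):
                    acc = 0.0
                    for a in range(k):
                        for b in range(k):
                            ii, jj = i + a - p, j + b - p
                            if 0 <= ii < h and 0 <= jj < wd:
                                acc += w_oc[a][b] * x[c][ii][jj]
                    out[o][i][j] += acc
    return out

def merge_branches(w3, w1):
    """Fold the 3x3, 1x1 and identity branches into a single 3x3 kernel."""
    n = len(w3)
    assert all(len(w_o) == n for w_o in w3), "identity needs C_in == C_out"
    merged = [[[row[:] for row in w_oc] for w_oc in w_o] for w_o in w3]
    for o in range(n):
        for c in range(n):
            merged[o][c][1][1] += w1[o][c][0][0]  # 1x1 kernel sits on the centre tap
        merged[o][o][1][1] += 1.0                 # identity: centre tap, channel o -> o
    return merged

# Illustrative random data (assumed sizes, not from the paper).
random.seed(0)
C, H, W = 2, 5, 5
x = [[[random.gauss(0, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
w3 = [[[[random.gauss(0, 1) for _ in range(3)] for _ in range(3)]
       for _ in range(C)] for _ in range(C)]
w1 = [[[[random.gauss(0, 1)]] for _ in range(C)] for _ in range(C)]

# Training-time form: three branches summed.
y3, y1 = conv2d_same(x, w3), conv2d_same(x, w1)
multi = [[[y3[c][i][j] + y1[c][i][j] + x[c][i][j] for j in range(W)]
          for i in range(H)] for c in range(C)]

# Inference-time form: one merged 3x3 convolution.
single = conv2d_same(x, merge_branches(w3, w1))

max_abs_diff = max(abs(multi[c][i][j] - single[c][i][j])
                   for c in range(C) for i in range(H) for j in range(W))
print(max_abs_diff)  # difference between the two forms; only rounding error remains
```

Because convolution is linear, summing the branch outputs is equivalent to summing the branch kernels once they are padded to a common size, which is why a single-branch network can reproduce the multi-branch output at inference time.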
About the journal:
Due to rapid advancements in integrated circuit technology, the rich theoretical results that have been developed by the image and video processing research community are now being increasingly applied in practical systems to solve real-world image and video processing problems. Such systems involve constraints placed not only on their size, cost, and power consumption, but also on the timeliness of the image data processed.
Examples of such systems are mobile phones, digital still/video/cell-phone cameras, portable media players, personal digital assistants, high-definition television, video surveillance systems, industrial visual inspection systems, medical imaging devices, vision-guided autonomous robots, spectral imaging systems, and many other real-time embedded systems. In these real-time systems, strict timing requirements demand that results are available within a certain interval of time as imposed by the application.
It is often the case that an image processing algorithm is developed and proven theoretically sound, presumably with a specific application in mind, but its practical applications and the detailed steps, methodology, and trade-off analysis required to achieve its real-time performance are not fully explored, leaving these critical and usually non-trivial issues for those wishing to employ the algorithm in a real-time system.
The Journal of Real-Time Image Processing is intended to bridge the gap between the theory and practice of image processing, serving the greater community of researchers, practicing engineers, and industrial professionals who deal with designing, implementing or utilizing image processing systems which must satisfy real-time design constraints.