Lang Zhang, Zhan Ao Huang, Canghong Shi, Hongjiang Ma, Xiaojie Li, Xi Wu
{"title":"MFPIDet: improved YOLOV7 architecture based on multi-scale feature fusion for prohibited item detection in complex environment","authors":"Lang Zhang, Zhan Ao Huang, Canghong Shi, Hongjiang Ma, Xiaojie Li, Xi Wu","doi":"10.1007/s40747-024-01580-3","DOIUrl":null,"url":null,"abstract":"<p>Prohibited item detection is crucial for the safety of public places. Deep learning, one of the mainstream methods in prohibited item detection tasks, has shown superior performance far beyond traditional prohibited item detection methods. However, most neural network architectures in deep learning still lack sufficient local feature representation ability for overlapping and small targets, and ignore the problem of semantic conflicts caused by direct feature fusion. In this paper, we propose MFPIDet, a novel prohibited item detection neural network architecture based on improved YOLOV7 to achieve reliable prohibited item detection in complex environments. Specifically, a multi-scale attention module (MAM) backbone is proposed to filter the redundant information of target regions and further applied to enhance the local feature representation ability of overlapping objects. Here, to reduce the redundant information of target regions, a squeeze-excitation (SE) block is used to filter the background. Then, aiming at enhancing the feature expression ability of overlapping objects, a multi-scale feature extraction module (MFEM) is designed for local feature representation. In addition, to obtain richer context information, We design an adaptive fusion feature pyramid network (AF-FPN) to combine the adaptive context information fusion module (ACIFM) with the feature fusion module (FFM) to improve the neck structure of YOLOV7. The proposed method is validated on the PIDray dataset, and the tested results showed that our method obtained the highest <i>mAP</i> (68.7%), which is improved by 3.5% than YOLOV7 methods. Our approach provides a new design pattern for prohibited item detection in complex environments and shows the development potential of deep learning in related fields.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"16 1","pages":""},"PeriodicalIF":5.0000,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Complex & Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s40747-024-01580-3","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Prohibited item detection is crucial for the safety of public places. Deep learning, one of the mainstream methods in prohibited item detection tasks, has shown superior performance far beyond traditional prohibited item detection methods. However, most neural network architectures in deep learning still lack sufficient local feature representation ability for overlapping and small targets, and ignore the problem of semantic conflicts caused by direct feature fusion. In this paper, we propose MFPIDet, a novel prohibited item detection neural network architecture based on improved YOLOV7 to achieve reliable prohibited item detection in complex environments. Specifically, a multi-scale attention module (MAM) backbone is proposed to filter the redundant information of target regions and further applied to enhance the local feature representation ability of overlapping objects. Here, to reduce the redundant information of target regions, a squeeze-excitation (SE) block is used to filter the background. Then, aiming at enhancing the feature expression ability of overlapping objects, a multi-scale feature extraction module (MFEM) is designed for local feature representation. In addition, to obtain richer context information, We design an adaptive fusion feature pyramid network (AF-FPN) to combine the adaptive context information fusion module (ACIFM) with the feature fusion module (FFM) to improve the neck structure of YOLOV7. The proposed method is validated on the PIDray dataset, and the tested results showed that our method obtained the highest mAP (68.7%), which is improved by 3.5% than YOLOV7 methods. Our approach provides a new design pattern for prohibited item detection in complex environments and shows the development potential of deep learning in related fields.
期刊介绍:
Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.