MFPIDet: improved YOLOV7 architecture based on multi-scale feature fusion for prohibited item detection in complex environment

IF 4.6 2区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Complex & Intelligent Systems Pub Date : 2024-08-14 DOI:10.1007/s40747-024-01580-3

Lang Zhang, Zhan Ao Huang, Canghong Shi, Hongjiang Ma, Xiaojie Li, Xi Wu

{"title":"MFPIDet: improved YOLOV7 architecture based on multi-scale feature fusion for prohibited item detection in complex environment","authors":"Lang Zhang, Zhan Ao Huang, Canghong Shi, Hongjiang Ma, Xiaojie Li, Xi Wu","doi":"10.1007/s40747-024-01580-3","DOIUrl":null,"url":null,"abstract":"<p>Prohibited item detection is crucial for the safety of public places. Deep learning, one of the mainstream methods in prohibited item detection tasks, has shown superior performance far beyond traditional prohibited item detection methods. However, most neural network architectures in deep learning still lack sufficient local feature representation ability for overlapping and small targets, and ignore the problem of semantic conflicts caused by direct feature fusion. In this paper, we propose MFPIDet, a novel prohibited item detection neural network architecture based on improved YOLOV7 to achieve reliable prohibited item detection in complex environments. Specifically, a multi-scale attention module (MAM) backbone is proposed to filter the redundant information of target regions and further applied to enhance the local feature representation ability of overlapping objects. Here, to reduce the redundant information of target regions, a squeeze-excitation (SE) block is used to filter the background. Then, aiming at enhancing the feature expression ability of overlapping objects, a multi-scale feature extraction module (MFEM) is designed for local feature representation. In addition, to obtain richer context information, We design an adaptive fusion feature pyramid network (AF-FPN) to combine the adaptive context information fusion module (ACIFM) with the feature fusion module (FFM) to improve the neck structure of YOLOV7. The proposed method is validated on the PIDray dataset, and the tested results showed that our method obtained the highest <i>mAP</i> (68.7%), which is improved by 3.5% than YOLOV7 methods. Our approach provides a new design pattern for prohibited item detection in complex environments and shows the development potential of deep learning in related fields.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"16 1","pages":""},"PeriodicalIF":4.6000,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Complex & Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s40747-024-01580-3","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Prohibited item detection is crucial for the safety of public places. Deep learning, one of the mainstream methods in prohibited item detection tasks, has shown superior performance far beyond traditional prohibited item detection methods. However, most neural network architectures in deep learning still lack sufficient local feature representation ability for overlapping and small targets, and ignore the problem of semantic conflicts caused by direct feature fusion. In this paper, we propose MFPIDet, a novel prohibited item detection neural network architecture based on improved YOLOV7 to achieve reliable prohibited item detection in complex environments. Specifically, a multi-scale attention module (MAM) backbone is proposed to filter the redundant information of target regions and further applied to enhance the local feature representation ability of overlapping objects. Here, to reduce the redundant information of target regions, a squeeze-excitation (SE) block is used to filter the background. Then, aiming at enhancing the feature expression ability of overlapping objects, a multi-scale feature extraction module (MFEM) is designed for local feature representation. In addition, to obtain richer context information, We design an adaptive fusion feature pyramid network (AF-FPN) to combine the adaptive context information fusion module (ACIFM) with the feature fusion module (FFM) to improve the neck structure of YOLOV7. The proposed method is validated on the PIDray dataset, and the tested results showed that our method obtained the highest mAP (68.7%), which is improved by 3.5% than YOLOV7 methods. Our approach provides a new design pattern for prohibited item detection in complex environments and shows the development potential of deep learning in related fields.

Abstract Image

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

MFPIDet：基于多尺度特征融合的改进型 YOLOV7 架构，用于在复杂环境中检测违禁物品

违禁物品检测对公共场所的安全至关重要。深度学习是违禁物品检测任务的主流方法之一，其性能远远超过传统的违禁物品检测方法。然而，深度学习中的大多数神经网络架构对于重叠和小目标仍然缺乏足够的局部特征表示能力，并且忽略了直接特征融合所导致的语义冲突问题。本文提出了基于改进型 YOLOV7 的新型违禁品检测神经网络架构 MFPIDet，以实现复杂环境下可靠的违禁品检测。具体来说，我们提出了一个多尺度注意力模块（MAM）骨干来过滤目标区域的冗余信息，并进一步应用于增强重叠对象的局部特征表示能力。在这里，为了减少目标区域的冗余信息，使用了挤压激励（SE）块来过滤背景。然后，为了增强重叠对象的特征表达能力，设计了一个多尺度特征提取模块（MFEM）来进行局部特征表示。此外，为了获得更丰富的上下文信息，我们设计了一个自适应融合特征金字塔网络（AF-FPN），将自适应上下文信息融合模块（ACIFM）与特征融合模块（FFM）结合起来，以改善 YOLOV7 的颈部结构。测试结果表明，我们的方法获得了最高的 mAP（68.7%），比 YOLOV7 方法提高了 3.5%。我们的方法为复杂环境中的违禁物品检测提供了一种新的设计模式，并展示了深度学习在相关领域的发展潜力。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Complex & Intelligent Systems COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-

CiteScore

9.60

自引率

10.30%

发文量

297

期刊介绍： Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.