EMPViT: Efficient multi-path vision transformer for security risks detection in power distribution network

Neurocomputing (IF 5.5; CAS Zone 2, Computer Science; JCR Q1, Computer Science, Artificial Intelligence) | Pub Date: 2024-11-22 | DOI: 10.1016/j.neucom.2024.128967
Pan Li, Xiaofang Yuan, Haozhi Xu, Jinlei Wang, Yaonan Wang
Neurocomputing, volume 617, Article 128967. Full text: https://www.sciencedirect.com/science/article/pii/S0925231224017387

Abstract

To maintain the safe operation of power distribution network (PDN) equipment, it is important to identify security risks accurately and promptly. However, conventional drone-based object detection methods face challenges due to noise and similar features among risk targets, as well as the limited computing resources of unmanned aerial vehicles (UAVs). To address these challenges, an efficient embedding-based multi-path fusion architecture is proposed. This architecture uses a re-parameterized depthwise block to embed local context information at different scales, enhancing the extraction of tiny features while preserving inference speed. Additionally, a coordinated self-attention module is proposed to reduce computational complexity while preserving the ability to capture global information. By fusing fine and coarse feature representations without heavy computation, this module efficiently learns both local and global image features. The goal is to create an efficient multi-path vision transformer (EMPViT) architecture that balances accuracy and efficiency. The proposed EMPViT has been evaluated on two different drone image datasets, demonstrating better performance than other architectures. Specifically, EMPViT-S improves detection mAP by 1.2% and increases inference speed by a factor of 1.24 on average on the Drone-PDN dataset. It achieves a similar improvement on the VisDrone-DET2019 dataset, with a 1.3% gain in detection performance and a 1.2-times speedup on average.
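The abstract does not spell out how the re-parameterized depthwise block is built. As a rough illustration of the general idea of structural re-parameterization only, and not the authors' design, the sketch below assumes a single-channel block with three parallel branches (a 3x3 depthwise kernel, a 1x1 per-channel scale, and an identity shortcut) and folds them into one 3x3 kernel for inference. The function names `depthwise_conv3x3`, `multi_branch`, and `fuse`, and the branch layout itself, are hypothetical.

```python
import random


def depthwise_conv3x3(x, k):
    """Zero-padded 3x3 cross-correlation on one channel (stride 1)."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += k[di + 1][dj + 1] * x[ii][jj]
            out[i][j] = acc
    return out


def multi_branch(x, k3, s1):
    """Assumed training-time form: 3x3 branch + 1x1 scale branch + identity."""
    y3 = depthwise_conv3x3(x, k3)
    return [
        [y3[i][j] + s1 * x[i][j] + x[i][j] for j in range(len(x[0]))]
        for i in range(len(x))
    ]


def fuse(k3, s1):
    """Inference-time form: fold the 1x1 and identity branches into the centre tap."""
    fused = [row[:] for row in k3]
    fused[1][1] += s1 + 1.0
    return fused


random.seed(0)
x = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(5)]
k3 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
s1 = 0.37  # illustrative value for the 1x1 depthwise weight

y_train = multi_branch(x, k3, s1)
y_infer = depthwise_conv3x3(x, fuse(k3, s1))
max_err = max(
    abs(a - b) for ra, rb in zip(y_train, y_infer) for a, b in zip(ra, rb)
)
print("max abs difference:", max_err)
```

Because every branch is linear in the input, the fused kernel reproduces the multi-branch output up to floating-point rounding, which is why this kind of block can add capacity during training without adding cost at inference. Normalization layers, which real re-parameterized blocks usually fold in as well, are left out here for brevity.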
