Underwater target detection network based on differential routing assistance and bilateral attention synergy

Impact factor: 3.7 · Zone 2, Engineering & Technology · JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Journal: Displays · Pub Date: 2024-09-13 · DOI: 10.1016/j.displa.2024.102836
Zhiwei Chen, Suting Chen
Displays, volume 85, Article 102836. Journal article, not open access. Full text: https://www.sciencedirect.com/science/article/pii/S0141938224002002
Citation count: 0

Abstract

Underwater target detection is important in both military and civilian ocean exploration. However, because the underwater environment is complex, most targets are small and often occluded, which leads to low detection accuracy and missed detections in existing target detection algorithms. To address these issues, we propose an underwater target detection algorithm that balances accuracy and speed. First, we propose the Differentiable Routing Assistance Sampling Network (DRASN), in which differentiable routing participates in training the sampling network but not in inference. DRASN replaces the down-sampling network in the backbone, which is composed of fused Maxpool and convolution, and reduces the feature loss of small and occluded targets. Second, we propose the Bilateral Attention Synergistic Network (BASN), which connects the backbone and neck using fine-grained information from both the channel and spatial perspectives, further improving the detection of targets in complex backgrounds. Finally, considering the characteristics of real frames, we propose a scale approximation auxiliary loss function (Aux-Loss) and modify the allocation strategy for positive and negative samples so that the network selectively learns from high-quality anchors, which improves its convergence. Compared with mainstream algorithms, our detection network achieves 82.9% mAP@0.5 on the URPC2021 dataset, which is 9.5%, 5.7%, and 2.8% higher than YOLOv8s, RT-DETR, and SDBB, respectively. The speed reaches 75 FPS, which meets the requirements for real-time performance.
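The accuracy figure in the abstract is a mean average precision at an IoU threshold of 0.5, under which a predicted box usually counts as correct only if its intersection over union with a ground-truth box is at least 0.5. The sketch below illustrates that matching criterion and why small targets are sensitive to it. It is not the authors' evaluation code: the function names `iou` and `is_true_positive`, the corner box format, and the example boxes are all assumptions made for illustration.

```python
from typing import Sequence

# Assumed box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Sequence[float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Hypothetical helper: a prediction matches if IoU >= threshold."""
    return iou(pred, truth) >= threshold


# The same one-pixel localisation error on a small and a large target.
small_truth, small_pred = (0, 0, 4, 4), (1, 1, 5, 5)
large_truth, large_pred = (0, 0, 40, 40), (1, 1, 41, 41)

print(round(iou(small_pred, small_truth), 3), is_true_positive(small_pred, small_truth))
print(round(iou(large_pred, large_truth), 3), is_true_positive(large_pred, large_truth))
```

In this example the one-pixel shift drops the 4x4 box below the 0.5 threshold while the 40x40 box remains a clear match, which is one reason small targets tend to be counted as missed detections under this metric.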

Source journal
Displays (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 4.60
Self-citation rate: 25.60%
Articles published: 138
Review time: 92 days
About the journal: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface. Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.