Title: Underwater target detection network based on differential routing assistance and bilateral attention synergy
Authors: Zhiwei Chen, Suting Chen
Journal: Displays, volume 85, Article 102836
Published: 2024-09-13
DOI: 10.1016/j.displa.2024.102836
URL: https://www.sciencedirect.com/science/article/pii/S0141938224002002
Impact factor: 3.7; JCR: Q1 (Computer Science, Hardware & Architecture)
Citation count: 0
Abstract
Underwater target detection is important in both military and civilian applications of ocean exploration. However, in the complex underwater environment most targets are small and often occluded, so existing detection algorithms suffer from low accuracy and missed detections. To address these issues, we propose an underwater target detection algorithm that balances accuracy and speed. First, we propose the Differentiable Routing Assistance Sampling Network (DRASN), in which differentiable routing participates in training the sampling network but not in inference. DRASN replaces the down-sampling network in the backbone, which is composed of Maxpool and convolution fusion, and thereby reduces the feature loss for small and occluded targets. Second, we propose the Bilateral Attention Synergistic Network (BASN), which connects the backbone and neck using fine-grained information from both channel and spatial perspectives, further improving the detection of targets in complex backgrounds. Finally, considering the characteristics of real (ground-truth) boxes, we propose a scale approximation auxiliary loss function (Aux-Loss) and modify the assignment strategy for positive and negative samples so that the network selectively learns from high-quality anchors, which improves convergence. Compared with mainstream algorithms, our detection network achieves 82.9% mAP@0.5 on the URPC2021 dataset, which is 9.5%, 5.7%, and 2.8% higher than YOLOv8s, RT-DETR, and SDBB, respectively. It runs at 75 FPS, which meets the requirements for real-time performance.
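The accuracy figure above is a mean average precision at an intersection-over-union (IoU) threshold of 0.5, under which a predicted box counts as a correct detection only if its IoU with a ground-truth box is at least 0.5. The sketch below is not taken from the paper; it is a minimal, generic illustration of that matching rule, and the names `iou` and `is_true_positive` and the example boxes are hypothetical. It also shows why small targets are harder: the same 4-pixel localization error leaves a large box above the threshold but pushes a small box well below it.

```python
from typing import Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """A prediction matches a ground-truth box if IoU >= threshold."""
    return iou(pred, truth) >= threshold


# Hypothetical boxes: both predictions are shifted by 4 pixels in x and y.
large_truth, large_pred = (0, 0, 40, 40), (4, 4, 44, 44)
small_truth, small_pred = (0, 0, 8, 8), (4, 4, 12, 12)

print(round(iou(large_pred, large_truth), 3), is_true_positive(large_pred, large_truth))
print(round(iou(small_pred, small_truth), 3), is_true_positive(small_pred, small_truth))
```

For the large box the overlap is 1296 / 1904, about 0.681, so the detection still counts; for the small box it is 16 / 112, about 0.143, so the same offset turns it into a miss.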
About the journal:
Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface.
Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.