Side-Scan Sonar Underwater Target Detection: Combining the Diffusion Model With an Improved YOLOv7 Model

IF 3.8 | Zone 2, Engineering and Technology | Q1, ENGINEERING, CIVIL | IEEE Journal of Oceanic Engineering | Pub Date: 2024-03-20 | DOI: 10.1109/JOE.2024.3379481
Xin Wen;Feihu Zhang;Chensheng Cheng;Xujia Hou;Guang Pan
IEEE Journal of Oceanic Engineering, vol. 49, no. 3, pp. 976-991. https://ieeexplore.ieee.org/document/10534346/
Citations: 0

Abstract

Side-scan sonar (SSS) plays a crucial role in underwater exploration, and autonomous analysis of SSS images is vital for detecting unknown targets in underwater environments. However, high-precision autonomous target recognition in SSS images is challenging because the underwater environment is complex, targets show few highlighted areas, feature details are blurred, and SSS data are difficult to collect. This article addresses the problem by improving the You Only Look Once v7 (YOLOv7) model to achieve high-precision object detection in SSS images. First, we use a denoising diffusion model to enhance and augment real and experimental SSS images that contain the detection targets, establishing a self-built SSS image data set. Because SSS images contain large areas without targets, we also introduce a vision transformer (ViT) for dynamic attention and global modeling, which increases the weight the model assigns to target regions. Second, the convolutional block attention module (CBAM) is adopted to further improve feature expression and reduce floating-point operations. Finally, Scylla-Intersection over Union (SIoU) is used as the loss function to increase the accuracy of the model's inference. Experiments on the SSS image data set show that the improved YOLOv7 model outperforms the other methods compared, reaching a mean average precision of 78.00% (mAP0.5) and 48.11% (mAP0.5:0.95), which is 3.47% and 2.9% higher, respectively, than the original YOLOv7 model. The improved YOLOv7 algorithm proposed in this article has great potential for object detection and recognition in SSS images.
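
Both reported metrics rest on the Intersection over Union (IoU) between a predicted box and a ground-truth box. The following is a minimal sketch of that quantity and of the IoU thresholds the two metric names usually imply, assuming axis-aligned boxes given as (x1, y1, x2, y2) and the common COCO-style convention that mAP0.5:0.95 averages over thresholds from 0.50 to 0.95 in steps of 0.05. The function and constant names are hypothetical and are not taken from the article; the sketch does not reproduce the authors' evaluation code or the SIoU loss.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed convention: mAP0.5 uses a single IoU threshold, while
# mAP0.5:0.95 averages over ten thresholds 0.50, 0.55, ..., 0.95.
MAP50_THRESHOLDS = [0.5]
MAP50_95_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box, thresholds):
    """Return the thresholds at which the prediction counts as a match."""
    score = iou(pred_box, gt_box)
    return [t for t in thresholds if score >= t]
```

For example, a ground-truth box (0, 0, 10, 10) and a prediction (2, 0, 12, 10) overlap with an IoU of about 0.667. That prediction counts as a match for mAP0.5 but clears only four of the ten thresholds behind mAP0.5:0.95, which is one reason the stricter metric is typically much lower than mAP0.5.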

Source journal
IEEE Journal of Oceanic Engineering (Engineering and Technology - Engineering: Ocean)
CiteScore: 9.60
Self-citation rate: 12.20%
Articles published: 86
Review time: 12 months
Journal description: The IEEE Journal of Oceanic Engineering (ISSN 0364-9059) is the online-only quarterly publication of the IEEE Oceanic Engineering Society (IEEE OES). The scope of the Journal is the field of interest of the IEEE OES, which encompasses all aspects of science, engineering, and technology that address research, development, and operations pertaining to all bodies of water. This includes the creation of new capabilities and technologies from concept design through prototypes, testing, and operational systems to sense, explore, understand, develop, use, and responsibly manage natural resources.