UAMFDet: Acoustic‐Optical Fusion for Underwater Multi‐Modal Object Detection

IF 4.2 · Zone 2, Computer Science · JCR Q2, Robotics · Journal of Field Robotics · Pub Date: 2024-09-05 · DOI: 10.1002/rob.22432
Haojie Chen, Zhuo Wang, Hongde Qin, Xiaokai Mu
Citation count: 0

Abstract

Underwater object detection is a crucial means by which autonomous underwater vehicles (AUVs) gain awareness of their surroundings. AUVs currently depend predominantly on underwater optical cameras or sonar to supply information for downstream tasks such as underwater rescue and mining exploration. However, underwater light attenuation or strong background noise often causes either the optical or the acoustic sensor to fail. Consequently, traditional single‐modal object detection networks, which rely exclusively on either the optical or the acoustic modality, struggle to adapt to the varying complexity of underwater environments. To address this challenge, this paper proposes UAMFDet, a novel acoustic‐optical fusion paradigm for underwater multi‐modal object detection, which fuses highly misaligned acoustic‐optical features in the spatial dimension at both the fine‐grained level and the instance level. First, we propose a multi‐modal deformable self‐aligned feature fusion module that adaptively captures feature dependencies between multi‐modal targets and performs self‐aligned, fine‐grained multi‐modal feature fusion through differential fusion. Second, we design a multi‐modal instance‐level feature matching network that matches multi‐modal instance features with a lightweight cross‐attention mechanism and applies differential fusion to achieve instance‐level feature fusion. In addition, we establish UAOF, a data set dedicated to underwater acoustic‐optical fusion object detection, and conduct extensive experiments on it to verify the effectiveness of UAMFDet.
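To make the instance-level step more concrete, the following is a minimal NumPy sketch of matching instance features across two modalities with scaled dot-product cross-attention and then fusing them. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not define "differential fusion", so the gated-difference form used here, along with the function names `cross_attention_match` and `differential_fusion`, is hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_match(optical, acoustic):
    """Match acoustic instance features to optical ones.

    optical:  (N, D) array, used as queries.
    acoustic: (M, D) array, used as keys and values.
    Returns the (N, D) attention-weighted acoustic features and the
    (N, M) attention weights. No learned projections are used here;
    a real network would normally add them.
    """
    d = optical.shape[-1]
    weights = softmax(optical @ acoustic.T / np.sqrt(d), axis=-1)
    return weights @ acoustic, weights


def differential_fusion(optical, matched_acoustic):
    """Hypothetical differential fusion (an assumption, not the paper's formula).

    The difference between the modalities is gated by a sigmoid and
    added back to the optical feature, so identical inputs are left
    unchanged and only complementary information is injected.
    """
    diff = matched_acoustic - optical
    gate = 1.0 / (1.0 + np.exp(-diff))
    return optical + gate * diff


rng = np.random.default_rng(0)
optical_feats = rng.normal(size=(3, 8))   # 3 optical instances, 8-dim features
acoustic_feats = rng.normal(size=(5, 8))  # 5 acoustic instances, 8-dim features

matched, attn = cross_attention_match(optical_feats, acoustic_feats)
fused = differential_fusion(optical_feats, matched)
print(fused.shape)
```

Each row of `attn` sums to 1, so every optical instance receives a convex combination of acoustic instance features before fusion. The number of instances per modality does not need to agree, which is one reason attention-based matching suits detections that are not spatially aligned.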
Source journal
Journal of Field Robotics (Engineering & Technology: Robotics)
CiteScore: 15.00
Self-citation rate: 3.60%
Articles published: 80
Review time: 6 months
About the journal: The Journal of Field Robotics seeks to promote scholarly publications dealing with the fundamentals of robotics in unstructured and dynamic environments. The Journal focuses on experimental robotics and encourages publication of work that has both theoretical and practical significance.