
Latest Publications in IEEE Journal of Oceanic Engineering

Human-in-the-Loop Segmentation of Multispecies Coral Imagery
IF 5.3 · CAS Tier 2 (Engineering & Technology) · Q1 ENGINEERING, CIVIL · Pub Date: 2025-12-22 · DOI: 10.1109/JOE.2025.3625691
Scarlett Raine;Ross Marchant;Brano Kusy;Frederic Maire;Niko Sünderhauf;Tobias Fischer
Marine surveys by robotic underwater and surface vehicles result in substantial quantities of coral reef imagery; however, labeling these images is expensive and time-consuming for domain experts. Point label propagation is a technique that uses existing images labeled with sparse points to create augmented ground truth data, which can be used to train a semantic segmentation model. In this work, we show that recent advances in large foundation models facilitate the creation of augmented ground truth masks using only features extracted by the denoised version of the DIstillation of knowledge with NO labels version 2 (DINOv2) foundation model and K-nearest neighbors (KNN), without any pretraining. For images with extremely sparse labels, we use human-in-the-loop principles to enhance annotation efficiency: with five point labels per image, our method outperforms the prior state of the art by 19.7% in mean intersection over union (mIoU). When human-in-the-loop labeling is not available, using the denoised DINOv2 features with a KNN still improves on the prior state of the art by 5.8% mIoU (five grid points). On the semantic segmentation task, we outperform the prior state of the art by 13.5% mIoU when only five point labels are used for point label propagation. In addition, we perform a comprehensive study into the number and placement of point labels and make several recommendations for improving the efficiency of labeling images with points.
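The feature-space KNN step can be sketched as follows. This is a minimal illustration only: synthetic features stand in for DINOv2 embeddings (extracting real DINOv2 features is out of scope here), and it is not the authors' implementation.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def propagate_point_labels(features, points, labels, k=3):
    """Propagate sparse point labels to every pixel via KNN in feature space.

    features: (H, W, D) per-pixel feature map (e.g., from a DINOv2 backbone).
    points:   (N, 2) array of (row, col) annotated pixel locations.
    labels:   (N,) class id for each annotated point.
    Returns an (H, W) dense label mask.
    """
    h, w, d = features.shape
    k = min(k, len(labels))  # cannot use more neighbours than labelled points
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(features[points[:, 0], points[:, 1]], labels)
    return knn.predict(features.reshape(-1, d)).reshape(h, w)

# Toy demo with synthetic features standing in for DINOv2 embeddings:
# left half and right half of the "image" have distinct feature vectors.
rng = np.random.default_rng(0)
feats = np.concatenate([np.zeros((8, 8, 4)), np.ones((8, 8, 4))], axis=1)
feats += 0.05 * rng.standard_normal(feats.shape)
pts = np.array([[4, 2], [4, 13]])   # one annotated point per region
lbl = np.array([0, 1])              # two hypothetical classes
mask = propagate_point_labels(feats, pts, lbl, k=1)
```

With only two labeled points, the KNN assigns every pixel the label of the nearest annotated feature, producing a dense mask from extremely sparse supervision.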
IEEE Journal of Oceanic Engineering, vol. 51, no. 1, pp. 762–779.
Citations: 0
Cross-Scale Attention Feature Pyramid Network for Challenged Underwater Object Detection
IF 5.3 · CAS Tier 2 (Engineering & Technology) · Q1 ENGINEERING, CIVIL · Pub Date: 2025-12-22 · DOI: 10.1109/JOE.2024.3450532
Miao Yang;Jinyang Zhong;Hansen Zhang;Can Pan;Xinmiao Gao;Chenglong Gong
Underwater object detection (UOD) is more difficult than object detection in air because of noise from irrelevant objects and textures, and because of scale variation. These difficulties place higher demands on the feature extraction capability of detectors. A feature pyramid network (FPN) enhances a detector's multiscale detection capability, while attention mechanisms effectively suppress irrelevant features. We present a cross-scale attention feature pyramid network (CSAFPN) for UOD. A feature fusion guided (FFG) module is incorporated in the CSAFPN; it constructs cross-scale context information and simultaneously guides the enhancement of all feature maps. Compared to existing FPN-like architectures, CSAFPN excels not only at capturing cross-scale long-range dependencies but also at producing compact multiscale feature maps that specifically emphasize target regions. Extensive experiments on the Brackish2019 data set show that CSAFPN achieves consistent improvements across various backbones and detectors. Moreover, FFG can be seamlessly integrated into any FPN-like architecture, offering a cost-effective improvement in UOD: a 1.4% average precision (AP) increase for FPN, a 1.3% AP increase for PANet, and a 1.4% AP increase for neural architecture search FPN (NAS-FPN).
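The cross-scale fusion that all FPN-like architectures share can be illustrated with a minimal numpy sketch of the generic top-down pathway; this shows only the shared backbone idea, not the proposed FFG module or its attention components.

```python
import numpy as np

def fpn_top_down(laterals):
    """Generic FPN top-down pathway: the coarsest map is upsampled
    (nearest neighbour, x2) and summed with the next finer lateral map,
    repeatedly, so coarse semantic context flows into fine resolutions.

    laterals: list of (C, H_i, W_i) maps ordered fine -> coarse, each
    level half the resolution of the previous one.  Returns the fused
    maps with the same shapes and ordering.
    """
    fused = [laterals[-1]]  # start from the coarsest level
    for lat in reversed(laterals[:-1]):
        up = fused[-1].repeat(2, axis=1).repeat(2, axis=2)  # x2 upsample
        fused.append(lat + up)
    return fused[::-1]  # restore fine -> coarse order

# Toy three-level pyramid (channels=8), values chosen to make the
# accumulation visible:
c3 = np.ones((8, 16, 16))   # fine level
c4 = np.ones((8, 8, 8))     # middle level
c5 = np.ones((8, 4, 4))     # coarse level
p3, p4, p5 = fpn_top_down([c3, c4, c5])
```

Each finer output accumulates the upsampled coarser context, which is the baseline behavior that modules such as FFG then reweight with attention.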
IEEE Journal of Oceanic Engineering, vol. 51, no. 1, pp. 826–835.
Citations: 0
NA-UICDE: A Novel Adaptive Algorithm for Underwater Image Color Correction and Detail Enhancement
IF 5.3 · CAS Tier 2 (Engineering & Technology) · Q1 ENGINEERING, CIVIL · Pub Date: 2025-12-09 · DOI: 10.1109/JOE.2025.3617906
Yuyun Chen;Wenguang He;Gangqiang Xiong;Junwu Li;Yaomin Wang
Underwater images often suffer from color distortion and detail loss due to light absorption and scattering, which degrades visual quality and limits practical applications. To address these issues, a novel adaptive algorithm for underwater image color correction and detail enhancement is proposed. The algorithm first applies threshold stretching to adjust the grayscale range, enhancing contrast while mitigating the risk of localized overcompensation. Based on the color distribution, images are categorized into bluish and greenish tones, providing the foundation for the adaptive color compensation method (ACCM). The ACCM is designed to separately compensate different color channels, using the green channel as a reference to restore the most degraded channels while maintaining overall color balance. The compensation process is further constrained by the minimum color loss criterion to ensure consistent color fidelity. Furthermore, an edge detail enhancement method is formulated to recover fine details by amplifying intensity differences between the original image and its smoothed version. Extensive experiments on multiple underwater image data sets demonstrate that the proposed algorithm consistently outperforms state-of-the-art methods, achieving average improvements of 0.0021, 0.0646, 0.3677, and 0.0800 in underwater color image quality evaluation, underwater image quality metric, fog aware density evaluator, and colorfulness contrast fog density index metrics, respectively, underscoring its effectiveness and robustness across diverse underwater environments.
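A minimal sketch of green-referenced channel compensation follows. It illustrates only the general idea of using the green channel to lift an attenuated channel; the paper's ACCM additionally applies tone-dependent handling (bluish vs. greenish) and the minimum color loss constraint, which are not reproduced here.

```python
import numpy as np

def compensate_channel(ch, green, alpha=1.0):
    """Compensate a degraded colour channel using the green channel as
    reference (one common formulation of this idea, not necessarily the
    paper's exact ACCM).  All inputs are float arrays in [0, 1].

    The correction is proportional to how far the channel's mean lags
    the green mean, weighted so already-bright pixels change little.
    """
    diff = green.mean() - ch.mean()          # attenuation relative to green
    return np.clip(ch + alpha * diff * (1.0 - ch) * green, 0.0, 1.0)

# Toy bluish underwater frame: red is strongly absorbed relative to green.
rng = np.random.default_rng(1)
g = rng.uniform(0.4, 0.8, size=(32, 32))
r = 0.3 * g                                  # heavily attenuated red channel
r_comp = compensate_channel(r, g)
```

The compensated red channel moves toward the green channel's brightness while staying inside [0, 1], restoring overall color balance before any further enhancement.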
IEEE Journal of Oceanic Engineering, vol. 51, no. 1, pp. 794–806.
Citations: 0
UMono: Physical-Model-Informed Hybrid CNN–Transformer Framework for Underwater Monocular Depth Estimation
IF 5.3 · CAS Tier 2 (Engineering & Technology) · Q1 ENGINEERING, CIVIL · Pub Date: 2025-11-10 · DOI: 10.1109/JOE.2025.3606045
Xupeng Wu;Jian Wang;Jing Wang;Shenghui Rong;Bo He
Underwater monocular depth estimation serves as the foundation for tasks such as 3-D reconstruction of underwater scenes. However, because of the water medium and the absorption and scattering of light in water, the underwater environment undergoes a distinctive imaging process, which makes it challenging to estimate depth accurately from a single image. Existing methods fail to consider the unique characteristics of underwater environments, leading to inadequate estimation results and limited generalization performance. Furthermore, underwater depth estimation requires extracting and fusing both local and global features, which is not fully explored in existing methods. In this article, an end-to-end learning framework for underwater monocular depth estimation called UMono is presented, which incorporates the characteristics of the underwater image formation model into the network architecture and effectively utilizes both local and global features of an underwater image. Specifically, UMono consists of an encoder with a hybrid architecture of a convolutional neural network (CNN) and Transformer, and a decoder guided by a medium transmission map. First, we develop an underwater deep feature extraction (UDFE) block, which leverages the CNN and Transformer in parallel to achieve comprehensive extraction of both local and global features. These features are effectively integrated via the proposed local–global feature fusion (LGFF) module. By stacking the UDFE block as the basic unit, we construct a hybrid encoder that generates four-stage hierarchical features. Subsequently, the medium transmission map is incorporated into the network as underwater domain knowledge and, together with the encoded hierarchical features, is fed into the underwater depth information aggregation (UDIA) module, which aggregates depth information from the physical model and the neural network through a proposed cross-attention mechanism. The aggregated features then serve as guiding information for each decoding stage, enabling the model to achieve comprehensive scene understanding and precise depth estimation. The final estimated depth map is obtained through consecutive upsampling. Experimental results demonstrate that the proposed method is effective for underwater monocular depth estimation and outperforms existing methods in both quantitative and qualitative analyses.
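The medium transmission map enters through the standard simplified underwater image formation model. A toy numpy sketch of that forward model and its inversion is given below; it illustrates only the physical prior, not UMono's network, and all quantities are synthetic.

```python
import numpy as np

# Simplified underwater image formation model (per colour channel):
#   I(x) = J(x) * t(x) + B * (1 - t(x))
# where J is the clear scene radiance, t the medium transmission (which
# decays with distance as light is absorbed and scattered), and B the
# background (veiling) light.  This is the kind of domain knowledge a
# transmission map supplies to a depth decoder.

def invert_formation_model(I, t, B, t_min=0.1):
    """Recover scene radiance J from observation I, transmission t, and
    background light B; t is floored to avoid division blow-up."""
    return (I - B) / np.maximum(t, t_min) + B

J = np.full((4, 4), 0.8)          # true scene radiance (toy, one channel)
t = np.full((4, 4), 0.5)          # transmission after absorption/scattering
B = 0.2                           # background light
I = J * t + B * (1 - t)           # forward model: the degraded observation
J_hat = invert_formation_model(I, t, B)
```

Because transmission is a monotone function of scene distance, a transmission map carries depth information directly, which is why fusing it with learned features is a natural design.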
IEEE Journal of Oceanic Engineering, vol. 51, no. 1, pp. 780–793.
Citations: 0
2025 Index IEEE Journal of Oceanic Engineering
IF 5.3 · CAS Tier 2 (Engineering & Technology) · Q1 ENGINEERING, CIVIL · Pub Date: 2025-11-05 · DOI: 10.1109/JOE.2025.3628413
IEEE Journal of Oceanic Engineering, vol. 50, no. 4, pp. 1–57. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11230041
Citations: 0
Joint Image Enhancement for Underwater Object Detection in Various Domains
IF 5.3 · CAS Tier 2 (Engineering & Technology) · Q1 ENGINEERING, CIVIL · Pub Date: 2025-10-30 · DOI: 10.1109/JOE.2025.3604170
Junjie Wen;Guidong Yang;Benyun Zhao;Lei Lei;Zhi Gao;Xi Chen;Ben M. Chen
Underwater environments present significant challenges, such as image degradation and domain discrepancies, that severely degrade object detection performance. Traditional approaches often use image enhancement as a preprocessing step, but this adds computational overhead and latency and can even reduce detection accuracy. To address these issues, we propose a novel underwater object detection framework that jointly trains image enhancement within a multitask architecture. The framework employs a progressive training strategy to iteratively improve detection performance through enhancement and introduces a domain-adaptation mechanism to align features across domains at both the image and object levels. Experimental results demonstrate that our method achieves state-of-the-art performance across diverse data sets, with real-time detection at 105.93 frames per second and an absolute improvement of over 15% in mean average precision in unseen environments, underscoring its potential for real-world underwater applications.
IEEE Journal of Oceanic Engineering, vol. 51, no. 1, pp. 807–825.
Citations: 0
Research on Ship Radiated Noise Separation Method Based on Neural Network Autoencoder Combined With Time–Frequency Masking
IF 5.3 · CAS Tier 2 (Engineering & Technology) · Q1 ENGINEERING, CIVIL · Pub Date: 2025-09-16 · DOI: 10.1109/JOE.2025.3590072
Zhi Xia;Ziang Zhang;Jie Shi;Zihao Zhao
Strong narrowband signals mixed with weak broadband signals can interfere with the direction estimation of broadband targets when using the cross-correlation method with a two-hydrophone array. To tackle this problem, this article proposes a method for separating narrowband signals from sonar-received signals. The proposed method integrates neural network autoencoders with time–frequency masking techniques. We design a lightweight neural network autoencoder that can be trained using purely simulated data. This autoencoder extracts narrowband line-spectrum features from the time–frequency distribution of the sonar-received signals and generates time–frequency masks. Subsequently, time–frequency masking is employed to isolate the narrowband components from the sonar-received signals. The proposed method was validated using data from the SWellEx-96 experiment. In the cross-correlation results of the original data from two hydrophones, strong narrowband signals dominated and obscured the weaker broadband signals. By removing the narrowband components from the two-hydrophone signals using the method proposed in this article, the cross-correlation of the processed data clearly revealed the time history of the broadband signal characteristics. This result confirms the effectiveness of the proposed method.
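The masking step can be sketched with SciPy's STFT utilities. In the paper the mask comes from an autoencoder; here a simple power-threshold detector stands in for it, so this is an illustration of time–frequency masking only, not the proposed network.

```python
import numpy as np
from scipy.signal import stft, istft

def remove_narrowband(x, fs, ratio=5.0):
    """Suppress narrowband line components via time-frequency masking.

    A crude line-spectrum detector replaces the paper's autoencoder:
    frequency bins whose time-averaged power exceeds `ratio` times the
    median bin power are treated as tonal and zeroed out.
    """
    f, t, Z = stft(x, fs=fs, nperseg=256)
    band_power = (np.abs(Z) ** 2).mean(axis=1)           # average over time
    mask = band_power < ratio * np.median(band_power)    # keep broadband bins
    _, x_clean = istft(Z * mask[:, None], fs=fs, nperseg=256)
    return x_clean[: len(x)]

# One second of synthetic data: a weak broadband target buried under a
# strong 500-Hz tonal (ship line-spectrum) component.
fs = 8000
n = np.arange(fs)
rng = np.random.default_rng(2)
broadband = 0.1 * rng.standard_normal(fs)   # weak broadband signal
tonal = np.sin(2 * np.pi * 500 * n / fs)    # strong narrowband line
cleaned = remove_narrowband(broadband + tonal, fs)
```

After masking, the 500-Hz line is suppressed while the broadband energy is largely retained, which is what lets the subsequent cross-correlation see the broadband target again.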
IEEE Journal of Oceanic Engineering, vol. 50, no. 4, pp. 3248–3263.
Citations: 0
A New Region Ocean Sound Velocity Field Model Considering Variation Mechanisms of Temperature and Salt
IF 5.3 · CAS Tier 2 (Engineering & Technology) · Q1 ENGINEERING, CIVIL · Pub Date: 2025-09-15 · DOI: 10.1109/JOE.2025.3582350
Xi Zhao;Qiangqiang Yuan;Hongzhou Chai;Zhongyang Yuan
The limited availability of in situ ocean observations poses significant challenges for real-time oceanographic applications, particularly in hydroacoustic measurements, where accuracy depends critically on the spatiotemporal variability of sound speed. To address the sparsity of sound-speed profiles (SSPs), this study proposes an advanced modeling framework for constructing regional sound speed fields by integrating temporal and spatial dynamics. Specifically, the variation mechanisms of temperature and salinity are analyzed, and empirical orthogonal function decomposition is used to extract compact SSP representations. A novel multimodel temporal prediction architecture, combining seasonal-trend decomposition using LOESS, long short-term memory, multivariate unsupervised domain adaptation, and an inverted Transformer, captures complex seasonal and adaptive patterns. Meanwhile, spatial modeling adopts a particle swarm optimization least squares support vector machine approach to enhance interpolation across diverse marine environments. Experiments show that the model outperforms existing methods, achieving a root-mean-square error of 0.812 m/s and a mean absolute percentage error of 0.037%. Its robust prediction capability supports accurate multibeam bathymetric processing even without direct SSP observations, confirming its practical value for real-time ocean mapping.
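The empirical-orthogonal-function compression step can be sketched with a plain SVD: subtract the mean profile, take the leading singular vectors as EOF modes, and represent each profile by a few mode coefficients. The synthetic profiles below are illustrative, not real SSP data.

```python
import numpy as np

def eof_compress(ssp_matrix, k=3):
    """Compress sound-speed profiles with empirical orthogonal functions.

    ssp_matrix: (n_profiles, n_depths) sound speeds in m/s.
    Returns the mean profile, the k leading EOF modes (rows), and each
    profile's k mode coefficients.
    """
    mean = ssp_matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(ssp_matrix - mean, full_matrices=False)
    modes = vt[:k]                               # (k, n_depths) EOFs
    coeffs = (ssp_matrix - mean) @ modes.T       # (n_profiles, k)
    return mean, modes, coeffs

# Synthetic ensemble: a base profile perturbed along two depth patterns,
# so two EOF modes capture all the variability exactly.
rng = np.random.default_rng(3)
depths = np.linspace(0, 1000, 50)
base = 1500 + 0.016 * depths                     # crude sound-speed gradient
a = rng.standard_normal((200, 1))
b = rng.standard_normal((200, 1))
profiles = base + a * np.sin(depths / 300) + b * np.cos(depths / 500)
mean, modes, coeffs = eof_compress(profiles, k=2)
recon = mean + coeffs @ modes
```

Predicting a few EOF coefficients instead of the full 50-point profile is what makes the downstream temporal models tractable.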
IEEE Journal of Oceanic Engineering, vol. 50, no. 4, pp. 3218–3234.
Citations: 0
Channel–Spatial Aligned Global Knowledge Distillation for Underwater Acoustic Target Recognition
IF 5.3 Zone 2, Engineering & Technology, Q1 ENGINEERING, CIVIL Pub Date: 2025-08-28 DOI: 10.1109/JOE.2025.3586648
Xiaohui Chu;Zhenzhe Hou;Haoran Duan;Lijun Xu;Runze Hu
Knowledge distillation (KD) is a predominant technique to streamline deep-learning-based recognition models for practical underwater deployments. However, existing KD methods for underwater acoustic target recognition face two problems: 1) the knowledge learning paradigm is not very consistent with the characteristics of underwater acoustics and 2) the complexity of acoustic signals in ocean environments leads to different prediction capacities in teacher and student models. This induces feature misalignment in the knowledge transfer, rendering suboptimal results. To address these problems, we propose a new distillation paradigm, i.e., channel–spatial aligned global knowledge distillation (CSGKD). Considering that the channel features (indicating the loudness of signals) and spatial features (indicating the propagation patterns of signals) in Mel spectrograms are discriminative for acoustic signal recognition, we design the knowledge-transferring scheme from “channel–spatial” aspects for effective feature extraction. Furthermore, CSGKD introduces a global multilayer alignment strategy, where all student layers collectively correspond to a single teacher layer. This allows the student model to dissect acoustic signals at a granular level, thereby capturing intricate patterns and nuances. CSGKD achieves a seamless blend of richness and efficiency, ensuring swift processing while being detail oriented. Extensive experiments on two real-world oceanic data sets confirm the superior performance of CSGKD compared to existing KD methods, i.e., achieving an accuracy (ACC) of 82.37% (↑2.49% versus 79.88%). Notably, CSGKD showcases an 8.87% improvement in the ACC of the lightweight student model.
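The channel-versus-spatial distinction the abstract draws — per-channel energy tracking signal loudness, per-location energy tracking propagation patterns — can be illustrated with a minimal alignment loss between teacher and student feature maps. This is a hedged sketch, not the paper's CSGKD: the descriptor definitions, KL-based loss, and synthetic feature maps below are generic choices of my own.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def channel_spatial_descriptors(feat):
    """feat: (C, H, W) feature map, e.g., from a Mel-spectrogram encoder."""
    channel = softmax(feat.mean(axis=(1, 2)))     # per-channel energy (~signal loudness)
    spatial = softmax(feat.mean(axis=0).ravel())  # per-location energy (~propagation pattern)
    return channel, spatial

def kl(p, q, eps=1e-8):
    # KL divergence between two normalized descriptor distributions.
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def alignment_loss(teacher_feat, student_feat):
    tc, tsp = channel_spatial_descriptors(teacher_feat)
    sc, ssp = channel_spatial_descriptors(student_feat)
    return kl(tc, sc) + kl(tsp, ssp)

rng = np.random.default_rng(1)
teacher = rng.normal(size=(64, 16, 16))
student = 0.9 * teacher + rng.normal(0.0, 0.05, size=(64, 16, 16))
print(f"alignment loss (perturbed student): {alignment_loss(teacher, student):.4f}")
print(f"alignment loss (identical student): {alignment_loss(teacher, teacher):.4f}")
```

In a real KD setup this loss would be added to the usual task and logit-distillation losses; the paper's global multilayer strategy additionally matches every student layer against a single teacher layer, which this sketch does not attempt.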
IEEE Journal of Oceanic Engineering, vol. 50, no. 4, pp. 3145–3159
Citations: 0
Ocean Sound Speed Profile Measurement Using a Pulse–Echo Technique
IF 5.3 Zone 2, Engineering & Technology, Q1 ENGINEERING, CIVIL Pub Date: 2025-08-20 DOI: 10.1109/JOE.2025.3583780
Mohammad Reza Mousavi;Len Zedel
This article presents an acoustic method for remotely measuring the ocean sound speed profile using a single directional transmitter and at least two receivers. By employing cross-correlation techniques to estimate the time of flight of received echo signals, the proposed approach calculates both the average sound speeds and the depths of ocean reflectors, resulting in the estimation of the sound speed profile. To validate the method, simulations are conducted using a ray acoustic propagation model that includes both time-invariant conditions and time-varying statistical effects. Key system parameters, including pulse characteristics, transducer geometry, signal-to-noise ratio, and reflector density, are analyzed. The accuracy of the estimated sound speed profiles is assessed by comparing them with the input profiles used in the simulation model. Using the proposed approach, a nonuniform average sound speed profile is measured with a root-mean-square error of 0.67 m/s up to 125 m, using a 22 m transducer array, highlighting its practicality for ocean sound speed monitoring. Experimental evaluation was conducted with a 7.39-m array in the National Research Council of Canada towing tank (200 m × 12 m × 7 m). The tank results showed a standard deviation below 2.5 m/s up to 20-m range, increasing to 12 m/s at 40 m due to reduced signal-to-clutter ratio in the high-interference environment.
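The core estimation step — matched-filter cross-correlation to recover echo times of flight, then inverting a two-receiver geometry for average sound speed and reflector depth — can be sketched as follows. The geometry (one receiver co-located with the transmitter, one at a known 22 m horizontal offset, a single reflector directly below) and all signal parameters are simplifying assumptions of this sketch, not the paper's configuration.

```python
import numpy as np

fs = 50_000.0                            # sample rate, Hz
c_true, d_true, L = 1485.0, 60.0, 22.0   # sound speed (m/s), reflector depth (m), receiver offset (m)

# Transmit pulse: 10-kHz carrier under a Gaussian envelope.
t = np.arange(int(0.12 * fs)) / fs
pulse = np.exp(-0.5 * ((t - 2e-3) / 2e-4) ** 2) * np.cos(2 * np.pi * 10e3 * t)

# Echo delays: receiver 1 is co-located with the transmitter (round trip 2d);
# receiver 2 sits L metres away horizontally (path d + sqrt(d^2 + L^2)).
t1 = 2.0 * d_true / c_true
t2 = (d_true + np.hypot(d_true, L)) / c_true

rng = np.random.default_rng(2)

def received(delay):
    # Delay the pulse by a whole number of samples and add measurement noise.
    n = int(round(delay * fs))
    sig = np.zeros_like(pulse)
    sig[n:] = pulse[: len(pulse) - n]
    return sig + rng.normal(0.0, 0.01, len(pulse))

def tof(rx):
    # Matched filter: the peak lag of the cross-correlation is the echo delay.
    corr = np.correlate(rx, pulse, mode="full")
    lag = int(np.argmax(corr)) - (len(pulse) - 1)
    return lag / fs

t1_hat, t2_hat = tof(received(t1)), tof(received(t2))

# Invert the two-path geometry: r = sqrt(d^2 + L^2) / d, so d = L / sqrt(r^2 - 1).
r = 2.0 * t2_hat / t1_hat - 1.0
d_hat = L / np.sqrt(r * r - 1.0)
c_hat = 2.0 * d_hat / t1_hat
print(f"estimated sound speed: {c_hat:.1f} m/s, reflector depth: {d_hat:.1f} m")
```

With two receivers the system of arrival-time equations has exactly two unknowns (c, d); the paper's use of many natural reflectors at different depths is what turns these single-layer estimates into a full profile.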
IEEE Journal of Oceanic Engineering, vol. 50, no. 4, pp. 3184–3200
Citations: 0