
Digital Signal Processing: Latest Articles

Capturing HDR video in challenging light conditions by beam-splitting ratio variable multi-sensor system
IF 3.0 | CAS Tier 3, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-04-15 | Epub Date: 2026-01-25 | DOI: 10.1016/j.dsp.2026.105956
Zhangchi Qiao, Hongwei Yi, Desheng Wen, Yong Han
Recording video in HDR scenes is challenging because it is always limited by the potential well capacity and sampling rate of the imaging sensor. At its core, the problem is one of balancing temporal resolution, spatial resolution, and dynamic range. To solve this, we design a beam-splitting-ratio-variable multi-sensor system (BRVMS) to capture both long- and short-exposure frames; the system supports a variety of splitting configurations to accommodate changing light conditions. In addition, we account for motion blur from the long exposures before synthesising the HDR frames: we propose a method that estimates the blur kernel under short-exposure-frame constraints and applies a mask to remove outliers in overexposed areas. Finally, we propose a match-fusion method based on the two-layer 3D patch (2L3DP) to generate high-quality, detail-rich HDR frames. Extensive experiments and ablation studies demonstrate the effectiveness of the system. By combining the BRVMS with the 2L3DP match-fusion method, we enhance the adaptability and performance of the vision system in high-speed, high-dynamic-range scenes, meeting the growing demands of vision applications.
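The paper's 2L3DP match-fusion is considerably more elaborate, but the underlying idea of merging a long/short exposure pair with an overexposure mask can be sketched in a few lines. This is a minimal illustration under simplifying assumptions (linear sensor response, a hand-picked saturation threshold, naive averaging instead of patch matching); all names here are illustrative, not the paper's:

```python
import numpy as np

def fuse_exposure_pair(long_exp, short_exp, t_long, t_short, sat=0.95):
    """Fuse a long/short exposure pair into a radiance estimate.

    Images are linear-response, normalized to [0, 1]; t_* are exposure times.
    Pixels clipped in the long frame (>= sat) are masked out, mirroring the
    paper's outlier mask, and recovered from the short frame alone.
    """
    valid = long_exp < sat                 # mask: long frame not saturated
    rad_long = long_exp / t_long           # radiance estimate from long frame
    rad_short = short_exp / t_short        # radiance estimate from short frame
    return np.where(valid, 0.5 * (rad_long + rad_short), rad_short)

# One pixel seen correctly by both frames, one clipped in the long frame:
frames_long = np.array([0.8, 1.0])
frames_short = np.array([0.2, 0.5])
hdr = fuse_exposure_pair(frames_long, frames_short, t_long=4.0, t_short=1.0)
```

The mask keeps the short frame as the sole source of radiance wherever the long frame has clipped, which is exactly why the BRVMS captures both frames simultaneously.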
Citations: 0
A relocated augmented coprime array for balancing low mutual coupling and high degrees of freedom
IF 3.0 | CAS Tier 3, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-04-15 | Epub Date: 2026-01-31 | DOI: 10.1016/j.dsp.2026.105972
Zifan Tao, Jun Yang, Xinping Chen
As a sparse array structure that has attracted much attention, the coprime array can provide high degrees of freedom and reduce the mutual coupling effect. However, the difference co-array of the coprime array contains holes, so the sensor information cannot be fully utilized. This paper proposes a relocated augmented coprime array (RACA) structure to fill the holes in the difference co-array. We determine the sensor positions by analyzing the redundancy of the coprime array and combining it with a two-dimensional hole model. This design significantly increases the degrees of freedom and uniform degrees of freedom (uDOF) while greatly reducing the mutual coupling effect. We derive closed-form expressions for the degrees of freedom, uDOF, and weight function of the proposed array. Simulation results demonstrate that, in comparison with current mainstream sparse array structures such as SNA, ICNA, SDNA and SSACA, the proposed RACA achieves uDOF comparable to those of SNA, exhibits excellent performance under both strong and weak mutual coupling scenarios, and strikes a balance between uDOF and mutual coupling effects, fully verifying its effectiveness.
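The holes that motivate this work are easy to reproduce numerically. The sketch below builds a textbook prototype coprime array (not the proposed RACA geometry), forms its difference co-array, and lists the missing lags:

```python
import numpy as np

def coprime_positions(M, N):
    # Prototype coprime array for coprime (M, N):
    # subarray 1 at N*{0..M-1}, subarray 2 at M*{0..N-1} (shared origin).
    return np.unique(np.concatenate([N * np.arange(M), M * np.arange(N)]))

def coarray_holes(pos):
    # Difference co-array = set of all pairwise lags n_i - n_j;
    # holes = integer lags missing from the contiguous range it spans.
    lags = np.unique((pos[:, None] - pos[None, :]).ravel())
    full = np.arange(lags.min(), lags.max() + 1)
    return np.setdiff1d(full, lags)

print(coarray_holes(coprime_positions(4, 5)))   # holes at +/-9, +/-13, +/-14
```

For (M, N) = (4, 5) the co-array covers lags up to ±16 but misses ±9, ±13 and ±14, which is precisely the information loss the relocated augmented design aims to repair.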
Citations: 0
Artifact-suppressed style transfer for Chinese ink paintings via enhanced CycleGAN
IF 3.0 | CAS Tier 3, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-04-15 | Epub Date: 2026-01-29 | DOI: 10.1016/j.dsp.2026.105965
Shuo Zhang, Shengwen Wang, Hongrui Liu, Yonghua Zhang, Ziqing Huang
Style transfer, a pivotal domain in machine vision, has achieved remarkable success in generating Western-style paintings. However, due to the unique “void” (Liubai) aesthetic of Chinese ink painting, the direct application of existing methods often yields irregular artifacts in blank areas and washes out details of brush strokes. To mitigate these limitations, this paper proposes a physically-guided hierarchical attention framework based on CycleGAN. Specifically, we introduce a coarse-to-fine algorithmic design where an inverted brightness-based masking mechanism is first constructed to serve as a spatial prior, explicitly suppressing high-frequency artifacts in void regions based on physical domain characteristics. Building upon this spatial prior, the Convolutional Block Attention Module (CBAM) is integrated into the generator as an adaptive feature modulator, recalibrating weights to adaptively concentrate computational resources on refining semantic foreground textures. Additionally, we incorporate the Learned Perceptual Image Patch Similarity (LPIPS) metric into the cyclic consistency constraint. This perceptually aligned objective resolves the “texture smoothing” issue inherent in pixel-wise losses. Experiments on our curated L2I (Landscape-to-Ink) benchmark dataset show that the model effectively suppresses artifacts and enhances artistic effects, outperforming existing methods. This work offers a robust algorithmic solution for the preservation and innovation of traditional Chinese art. The dataset is available at https://github.com/ww02711/L2I.git.
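The inverted brightness-based mask can be sketched simply: bright "Liubai" (void) regions get weight near zero, ink regions weight near one, and the generator output is damped toward a flat background where the mask is low. The exponent and the blending rule below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def void_prior(gray, gamma=2.0):
    # Inverted-brightness spatial prior: bright (void) pixels -> weight ~ 0,
    # dark (ink) pixels -> weight ~ 1. `gamma` is an illustrative choice.
    g = gray.astype(np.float64) / 255.0
    return (1.0 - g) ** gamma

def suppress_void_artifacts(stylized, gray):
    # Attenuate generator output toward a flat background in void regions,
    # leaving it untouched where the prior says "ink".
    m = void_prior(gray)
    background = stylized.mean()
    return m * stylized + (1.0 - m) * background
```

Because the mask is computed from the physical brightness of the input rather than learned, it acts as the spatial prior described above, with the CBAM then refining attention only inside the foreground.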
Citations: 0
AmBC-NOMA with physical-layer network coding for mutualistic two-way relay cellular IoT
IF 3.0 | CAS Tier 3, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-04-15 | Epub Date: 2026-01-29 | DOI: 10.1016/j.dsp.2026.105959
Youtao Jiang, Yao Xu, Shaobo Jia, Peng Lin, Xiaoxu Guo, Jianyue Zhu, Zhizhong Zhang
Non-orthogonal multiple access (NOMA)-based two-way relay (TWR) systems can enhance communication coverage and spectral efficiency, but they face challenges in supporting future cellular Internet of Things (IoT) due to the coexistence of heterogeneous rate signals. This paper proposes a mutualistic ambient backscatter communication-aided NOMA scheme for TWR-based cellular IoT, where two cellular users and a relaying user exchange information via physical-layer network coding and NOMA, while IoT devices transmit data using backscatter modulation and cellular radio frequency signals. However, the multi-type interference and complex composite channels in the proposed scheme result in complicated signal-to-interference-plus-noise ratio expressions, which complicate accurate performance characterization. To address this, we derive closed-form expressions for the ergodic sum rate (ESR) using an equivalent transformation of squared generalized-K random variables, and characterize the asymptotic ESR at high signal-to-noise ratio. Simulation results validate the theoretical analysis and demonstrate the ESR gains over conventional orthogonal multiple access, NOMA-based TWR, and symbiotic NOMA-based TWR, while revealing the impacts of the IoT device count, node distance, and power allocation on the ESR.
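Closed-form ergodic-rate expressions like the ESR derived here are routinely cross-checked by Monte Carlo simulation. The template below estimates an ergodic rate under a deliberately simplified assumption (plain Rayleigh fading, i.e. |h|² ~ Exp(1)) rather than the paper's composite generalized-K channels, so it illustrates the validation method, not the paper's model:

```python
import numpy as np

def ergodic_rate_mc(snr_db, trials=200_000, seed=0):
    # Monte Carlo estimate of E[log2(1 + |h|^2 * SNR)] over Rayleigh fading.
    # Assumption: |h|^2 is exponential with unit mean; the paper's links are
    # composite generalized-K, so this is a cross-check template only.
    rng = np.random.default_rng(seed)
    snr = 10.0 ** (snr_db / 10.0)
    h2 = rng.exponential(1.0, trials)          # channel power gains
    return float(np.mean(np.log2(1.0 + h2 * snr)))
```

At 10 dB this converges to the known closed form e^{1/ρ}E₁(1/ρ)/ln 2 ≈ 2.91 bit/s/Hz, which is how one would sanity-check the derived ESR expressions against simulation.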
Citations: 0
DiVOT: Differentiated interaction-guided video-level object tracking
IF 3.0 | CAS Tier 3, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-04-15 | Epub Date: 2026-01-27 | DOI: 10.1016/j.dsp.2026.105955
Zhixi Wu, Si Chen, Da-Han Wang, Shunzhi Zhu
Video-level methods have recently made significant strides in the object tracking field. These methods leverage multiple online templates to capture rich temporal information. However, most existing methods treat online templates as equally important as the initial template, overlooking the inherent instability of online templates during updating, which consequently degrades tracking performance. To alleviate this issue, we propose a novel differentiated interaction-guided video-level object tracking method, termed DiVOT, aimed at mitigating the impact of template instability and boosting tracking performance. Our feature extraction network consists of a differentiated encoder block, which differentially guides the interaction between the search region and the various templates, enabling the tracker to balance stability and adaptability. Additionally, we design an auxiliary module, the memory decoder, to compensate for a deficiency of the differentiated interaction: the latency of online templates hinders the acquisition of the most recent target appearance information.
Citations: 0
Multimodal enhanced underwater image generation method using flow matching model
IF 3.0 | CAS Tier 3, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-04-15 | Epub Date: 2026-01-28 | DOI: 10.1016/j.dsp.2026.105964
Haifeng Yu, Changxu Zhu, Ruicheng Zhang, Yankai Feng, Xinbin Li
Underwater Image Enhancement (UIE) methods and Underwater Object Detection (UOD) algorithms are used to monitor the growth of marine aquaculture organisms. However, compared to detection on the original underwater image, image enhancement affects the accuracy of object detection. This paper proposes a Multimodal Enhanced Underwater Image Generation method based on flow matching (MEUIG) to generate enhanced underwater images containing object feature information. First, a dual-branch flow matching model is designed, comprising a feature extraction branch and an image enhancement branch. The feature extraction branch extracts the object feature information in the original underwater images; the image enhancement branch produces the enhanced underwater image via the color-line method. Then, a fusion module is proposed to combine the different modalities: the image generated by flow matching, the feature information, and the enhanced image. Additionally, we construct a feature extraction module to extract the object features in the original image. Finally, a new loss function is designed that considers the pixel movement path, the feature difference between the condition image and the output image, and the reconstruction loss. Qualitative and quantitative evaluations show that MEUIG improves image quality while retaining the original information. Our method achieves significantly higher detection accuracy on YOLOv11 than existing underwater enhancement methods; in echinus detection, MEUIG scores 18.8% and 9.7% higher than the compared contrast-enhancement methods, respectively.
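The flow matching backbone referenced here rests on a simple training construction: sample a point on a linear path between a source sample and a data sample, and regress a velocity field toward the constant displacement. A minimal sketch of that standard (rectified-flow style) construction, independent of the paper's dual-branch specifics:

```python
import numpy as np

def flow_matching_target(x0, x1, t):
    # Linear probability path x_t = (1 - t) * x0 + t * x1.
    # The regression target for the learned velocity field at (x_t, t)
    # is the constant displacement v = x1 - x0.
    xt = (1.0 - t) * x0 + t * x1
    v = x1 - x0
    return xt, v

# Example: moving a source ("noise") sample toward a data sample.
x0 = np.zeros(4)
x1 = np.array([1.0, 2.0, 3.0, 4.0])
xt, v = flow_matching_target(x0, x1, t=0.25)
```

Training minimizes the mean-squared error between the network's predicted velocity at (x_t, t) and v; generation then integrates the learned field from t = 0 to 1.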
Citations: 0
Hybrid transfer semantic segmentation architecture for hyperspectral image classification
IF 3.0 | CAS Tier 3, Engineering & Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-04-15 | Epub Date: 2026-01-22 | DOI: 10.1016/j.dsp.2025.105852
Huaiping Yan, Yupeng Hou, Chengcai Leng, Yilin Li, Yang Li
Hyperspectral image (HSI) classification is a research hotspot in the field of remote sensing image processing. Deep learning-based methods have gradually become one of the mainstream approaches in HSI classification, yet they still face the challenge of insufficient training samples. Transfer learning is regarded as an effective way to alleviate this problem, but hyperspectral image data is scarce, lacking the foundation for pre-training high-quality models. In this paper, a Hybrid Transfer Semantic Segmentation Architecture (HTSSA) is proposed, which transfers knowledge from different datasets by adopting different network structures. The proposed model adopts a triple-branch network architecture: the three branches respectively use a vision transformer (ViT) classification model pre-trained on ImageNet, a Deeplabv3 semantic segmentation model pre-trained on the PASCAL VOC 2012 dataset, and a convolutional neural network (CNN) model pre-trained on a source hyperspectral image dataset. The three branch networks are fine-tuned on the target hyperspectral image dataset, with mapping modules designed to handle heterogeneous data transfer. The ViT branch uses the Transformer to extract spatial global context features; the Deeplabv3 branch uses the feature pyramid to extract spatial local multi-scale features; the CNN branch uses a 3D-CNN to extract the spectral features of hyperspectral images. Finally, the classification result is obtained from the fused features of the three branches. Extensive experiments on public datasets verify that the proposed HTSSA alleviates the negative impact of sample scarcity to a certain extent, enhances the representation ability of the model, and improves the final classification performance.
引用次数: 0
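As a rough illustration of the triple-branch late fusion described in this abstract, the sketch below projects three heterogeneous feature vectors to a common width and classifies their concatenation; all dimensions, the ReLU mapping modules, and the random weights are illustrative assumptions, not the paper's trained components.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes (not from the paper): each pretrained branch emits
# features of a different width; mapping modules project all three to a
# shared width before fusion.
d_vit, d_seg, d_cnn, d_common, n_classes = 768, 256, 128, 64, 9

W_map = {name: rng.standard_normal((d, d_common)) * 0.01
         for name, d in [("vit", d_vit), ("deeplab", d_seg), ("cnn3d", d_cnn)]}
W_cls = rng.standard_normal((3 * d_common, n_classes)) * 0.01

def fuse_and_classify(f_vit, f_seg, f_cnn):
    # Mapping module per branch: linear projection + ReLU.
    mapped = [np.maximum(f @ W_map[k], 0.0)
              for k, f in (("vit", f_vit), ("deeplab", f_seg), ("cnn3d", f_cnn))]
    fused = np.concatenate(mapped, axis=-1)   # late fusion by concatenation
    logits = fused @ W_cls
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)  # softmax over classes

probs = fuse_and_classify(rng.standard_normal(d_vit),
                          rng.standard_normal(d_seg),
                          rng.standard_normal(d_cnn))
```

The point of the sketch is only the data flow: three differently-shaped feature vectors become one class distribution after mapping and concatenation.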
KADNet: Low SNR automatic modulation classification via SNR aware deformable convolution and Kolmogorov-Arnold networks
IF 3.0 | CAS Zone 3, Engineering & Technology | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-04-15 | Epub Date: 2026-01-21 | DOI: 10.1016/j.dsp.2026.105942 | Digital Signal Processing 174, Article 105942
Run Wang, Jizhe Li, Youze Yang, Shasha Wang, Bing Zheng
The proliferation of modern communication technologies has precipitated increasingly sophisticated electromagnetic environments, demanding more rigorous performance from Automatic Modulation Classification (AMC) systems, especially in low signal-to-noise ratio (SNR) scenarios where conventional approaches struggle with feature extraction and classification fidelity. In response, we propose KADNet, a novel architecture tailored for AMC in low-SNR scenarios. KADNet comprises two key components: a Signal Enhancement Module (SEM) and an SNR-Aware Deformable Convolutional Network (SADCN). In the SEM, time-domain I/Q samples are first projected into the frequency domain via the fast Fourier transform (FFT). A spectral weighting mask is then generated by a Kolmogorov-Arnold Network (KAN), enabling precise attenuation of noise and amplification of decision-relevant signal components. Subsequently, the SADCN employs a lightweight subnetwork to estimate a soft SNR map, which is then fused into deformable convolution operations via a Signal Quality Spatial Attention (SQSA) mechanism. This fusion produces secondary spatial offsets and modulation-adaptive weights, allowing sampling grids to adjust dynamically in response to local signal quality. Extensive experiments on the RADIOML 2016.10A/B benchmarks demonstrate the effectiveness of our design: KADNet achieves mean classification accuracies of 64.66 percent and 65.58 percent, corresponding to improvements of 2.04 percent and 0.56 percent over baseline methods. Moreover, within the extremely low-SNR range of -20 dB to -2 dB, KADNet attains average accuracies of 36.86 percent and 37.92 percent, surpassing the current state of the art by 3.0 percent to 3.8 percent. This gain under the most challenging SNR conditions confirms that KADNet is a superior AMC method in low-SNR settings.
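The SEM's enhance-in-frequency idea (FFT, per-bin weighting mask, inverse FFT) can be sketched as follows; the hand-written median-threshold mask below is only an illustrative stand-in for the learned KAN mask, and the tone-plus-noise test signal is an assumption for demonstration.

```python
import numpy as np

def spectral_mask_enhance(iq, mask_fn):
    """Frequency-domain enhancement of a complex I/Q sequence:
    project to the spectrum, weight each bin, transform back."""
    spec = np.fft.fft(iq)
    weights = mask_fn(np.abs(spec))   # per-bin weights in [0, 1]
    return np.fft.ifft(spec * weights)

def soft_threshold_mask(mag):
    """Stand-in for the learned KAN mask: softly attenuate bins below
    the median magnitude, pass stronger bins unchanged."""
    thr = np.median(mag)
    return np.clip(mag / (thr + 1e-12), 0.0, 1.0)

rng = np.random.default_rng(0)
n = 1024
t = np.arange(n)
clean = np.exp(2j * np.pi * (50 / n) * t)   # complex tone on an exact FFT bin
noisy = clean + 0.5 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
enhanced = spectral_mask_enhance(noisy, soft_threshold_mask)

mse_before = np.mean(np.abs(noisy - clean) ** 2)
mse_after = np.mean(np.abs(enhanced - clean) ** 2)
```

Because the tone occupies a single strong bin while noise spreads over all bins, even this crude mask lowers the reconstruction error, which is the intuition behind letting a network learn the per-bin weights.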
WDCGAN-GSMR: A more accurate framework for small-sample radar signal modulation recognition
IF 3.0 | CAS Zone 3, Engineering & Technology | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-04-15 | Epub Date: 2026-01-31 | DOI: 10.1016/j.dsp.2026.105971 | Digital Signal Processing 174, Article 105971
Qinghui Zhang, Wenzheng Li, Chenxia Wan
Low Probability of Interception (LPI) radars feature strong anti-detection capabilities, rendering the acquisition of real signal samples extremely challenging. This severely restricts the performance of LPI radar signal modulation recognition under small-sample conditions. To address this issue, this paper proposes a novel Wasserstein Deep Convolutional Generative Adversarial Network integrated with Generative Spatial-Channel Synergistic Attention and Multi-Scale Asymmetric Convolutional Residual blocks (WDCGAN-GSMR) to enhance recognition accuracy under small-sample conditions. The radar signals are first transformed into Time-Frequency Images (TFIs) using the Smoothed Pseudo Wigner–Ville Distribution (SPWVD). These limited TFIs are then augmented using WDCGAN-GSMR by combining real-world and simulated samples, and are finally fed into a convolutional neural network for model training and modulation recognition. Experimental results demonstrate that incorporating the Multi-Scale Asymmetric Convolutional Residual (MCR) block into the WDCGAN-GSMR model significantly reduces computational complexity. When only 50 samples per class are available, combining the proposed WDCGAN-GSMR with MobileNetV1 improves recognition accuracy by 6.2%; when integrated with the ResNet18 model, WDCGAN-GSMR achieves recognition accuracy 6.4% higher than the conventional DCGAN model. The proposed model effectively mitigates data scarcity and significantly enhances LPI radar signal modulation recognition under small-sample conditions, providing a novel and effective solution for radar signal modulation recognition.
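As a minimal sketch of the time-frequency front end, the function below computes a pseudo Wigner–Ville distribution with a Hanning lag window; the SPWVD used in the paper additionally applies a time-smoothing window, which is omitted here for brevity, and the test tone is an illustrative assumption.

```python
import numpy as np

def pwvd(x, win_len=127):
    """Pseudo Wigner-Ville distribution of a complex signal.
    A Hanning window over the lag axis provides frequency smoothing;
    the SPWVD additionally smooths over time."""
    x = np.asarray(x, dtype=complex)
    n_samp = len(x)
    half = win_len // 2
    h = np.hanning(win_len)
    tfr = np.zeros((win_len, n_samp))
    for n in range(n_samp):
        m_max = min(half, n, n_samp - 1 - n)
        m = np.arange(-m_max, m_max + 1)
        # Instantaneous autocorrelation kernel x[n+m] * conj(x[n-m]).
        kern = x[n + m] * np.conj(x[n - m]) * h[half + m]
        buf = np.zeros(win_len, dtype=complex)
        buf[m % win_len] = kern           # circular placement of the lag axis
        tfr[:, n] = np.fft.fft(buf).real  # real by conjugate symmetry of kern
    return tfr

# A complex tone at normalized frequency f0 concentrates near FFT bin
# 2 * f0 * win_len, since the lag kernel doubles the oscillation rate.
f0, n_samp = 0.1, 512
sig = np.exp(2j * np.pi * f0 * np.arange(n_samp))
tfr = pwvd(sig)
peak_bin = int(np.argmax(tfr[:, n_samp // 2]))
```

Plotting `tfr` as an image is what yields the TFIs fed to the generative model; a linear FM chirp would trace a tilted ridge in the same picture.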
Enhanced underwater object tracking via adaptive image enhancement and multi-regularized correlation filters
IF 3.0 | CAS Zone 3, Engineering & Technology | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-04-15 | Epub Date: 2026-01-28 | DOI: 10.1016/j.dsp.2026.105958 | Digital Signal Processing 174, Article 105958
Endong Liu, Lihui Wang
Underwater Object Tracking (UOT) is essential for underwater ecological monitoring, marine resource exploration, and autonomous underwater robotics, yet it remains challenging due to low visibility, illumination variations, visual aberrations, and severe color distortions. To address these issues, this paper proposes a task-driven underwater object tracking framework that tightly integrates selective image enhancement with a multi-regularized correlation filter. Specifically, an adaptive image enhancement strategy derived from the generalized Dark Channel Prior (DCP) is selectively activated using CCF indicators (colorfulness, contrast, and fog density), enabling effective visual enhancement while preserving real-time performance. On this basis, a multi-regularized correlation filter incorporating Gaussian-shaped spatial constraints and channel reliability weighting is formulated to improve robustness and localization accuracy under complex underwater conditions. The resulting optimization problem is efficiently solved within an ADMM framework. Extensive experiments on the UOT100 and UTB180 datasets demonstrate that the proposed method consistently outperforms state-of-the-art trackers, achieving superior precision and success rates in challenging underwater scenarios.
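The dark channel prior underlying the enhancement step can be computed as below; the patch size and toy images are illustrative assumptions, and neither the paper's generalized DCP nor its CCF-based gating is reproduced here.

```python
import numpy as np

def dark_channel(img, patch=15):
    """Dark channel of an RGB image in [0, 1]: the per-pixel minimum over
    channels, followed by a min-filter over a patch x patch neighborhood.
    Clear regions tend toward 0; veiling light (haze/water) lifts it."""
    h, w, _ = img.shape
    chan_min = img.min(axis=2)
    r = patch // 2
    padded = np.pad(chan_min, r, mode="edge")
    out = np.empty_like(chan_min)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

rng = np.random.default_rng(0)
clear = rng.random((32, 32, 3))
clear[..., 2] *= 0.05            # each pixel has one dark channel -> low DCP
hazy = 0.6 + 0.4 * clear         # additive veiling light lifts all channels

dc_clear = dark_channel(clear)
dc_hazy = dark_channel(hazy)
```

The gap between `dc_clear` and `dc_hazy` is what a fog-density indicator exploits when deciding whether to run the enhancement before tracking.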