Digital Signal Processing: Latest Articles
Multiparameter estimation for bistatic EMVS-FDA-MIMO radar with arbitrarily configured arrays
IF 3 | Zone 3 (Engineering & Technology) | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-04-15 | Epub Date: 2026-01-22 | DOI: 10.1016/j.dsp.2026.105928
Huihui Ma, Haihong Tao, Yaxing Yue, Tiantian Zhong, Yunfei Fang, Le Wang
This study addresses multiparameter estimation in a bistatic frequency diverse array multiple-input multiple-output (FDA-MIMO) radar system that employs arbitrarily configured electromagnetic vector sensor (EMVS) arrays. The signal reception model for this radar architecture is established. Building on this foundation, a subspace-based algorithm is proposed to estimate the spatial-polarization angles and ranges. First, rotation-invariant structures in the spatial domain are formed by constructing several virtual steering matrices, from which the normalized electromagnetic field vectors are derived. Then the two-dimensional direction-of-departure (2D-DOD) and two-dimensional direction-of-arrival (2D-DOA) estimates are computed through a vector cross-product operation. Thereafter, the polarization angles are determined using a least squares (LS) approach. Finally, the ranges are estimated by compensating the steering matrix with the obtained 2D-DOD. Furthermore, the developed framework is evaluated in terms of identifiability, flexibility, computational complexity, and the Cramér-Rao bound (CRB). It estimates the targets' spatial-polarization angles and ranges while also achieving automatic parameter pairing. Simulation results demonstrate the validity of the developed approach.
Digital Signal Processing, vol. 174, Article 105928. Citations: 0
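The cross-product and least-squares steps named in the abstract above can be illustrated for a single EMVS. This is only a sketch under stated assumptions, not the authors' algorithm: it uses a common textbook EMVS response convention (elevation `theta`, azimuth `phi`, polarization angles `gamma` and `eta`), assumes the six-component field vector is already known up to a complex scale, and ignores noise, the FDA range term, and the subspace stage. All function names are hypothetical.

```python
import cmath
import math


def unit_vectors(theta, phi):
    """Spherical unit vectors theta_hat and phi_hat for elevation theta, azimuth phi."""
    th = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta))
    ph = (-math.sin(phi), math.cos(phi), 0.0)
    return th, ph


def emvs_response(theta, phi, gamma, eta):
    """Ideal six-component response [e; h] of one EMVS (assumed convention)."""
    th, ph = unit_vectors(theta, phi)
    g1 = math.sin(gamma) * cmath.exp(1j * eta)
    g2 = math.cos(gamma)
    e = [g1 * t + g2 * p for t, p in zip(th, ph)]   # electric field part
    h = [g1 * p - g2 * t for t, p in zip(th, ph)]   # magnetic field part
    return e + h


def cross(a, b):
    """Cross product of two 3-vectors (entries may be complex)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def estimate_angles(a):
    """Recover (theta, phi, gamma, eta) from a response known up to a complex scale."""
    e, h = a[:3], a[3:]
    # e x conj(h) is parallel to [sin(theta)cos(phi), sin(theta)sin(phi), cos(theta)];
    # a common complex scale only changes its length, so normalize afterwards.
    u = [x.real for x in cross(e, [x.conjugate() for x in h])]
    norm = math.sqrt(sum(x * x for x in u))
    u = [x / norm for x in u]
    theta = math.acos(max(-1.0, min(1.0, u[2])))
    phi = math.atan2(u[1], u[0])
    # Least squares for polarization: the 6x2 angle matrix has orthogonal columns
    # [th; ph] and [ph; -th], so the LS solution reduces to two inner products
    # (up to a common factor that cancels in gamma and eta).
    th, ph = unit_vectors(theta, phi)
    g1 = sum(t * x for t, x in zip(th, e)) + sum(p * x for p, x in zip(ph, h))
    g2 = sum(p * x for p, x in zip(ph, e)) - sum(t * x for t, x in zip(th, h))
    gamma = math.atan2(abs(g1), abs(g2))
    eta = cmath.phase(g1 * g2.conjugate())
    return theta, phi, gamma, eta
```

Because the direction comes from a normalized cross product and the polarization from a ratio, an unknown complex gain on the estimated field vector does not affect the recovered angles in this noise-free setting.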
S2MFormer: Spatial-spectral Mamba-transformer complementary network for hyperspectral image classification
IF 3 | Zone 3 (Engineering & Technology) | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-04-15 | Epub Date: 2026-02-02 | DOI: 10.1016/j.dsp.2026.105951
Zewen Han, Qiong Huang, Liantao Lan
Hyperspectral image (HSI) classification is a fundamental task in remote sensing. The main challenge lies in effectively capturing both spatial and spectral features. Recently, the Mamba model has received increasing attention for its ability to model long-range dependencies with linear complexity. However, when applied to HSI, it still suffers from the loss of spatial structural information and the neglect of spectral locality. In addition, compared with Transformers, Mamba remains limited in capturing global contextual dependencies. To address these challenges, we propose S2MFormer, a novel Mamba-Transformer hybrid network. It adopts a dual-branch architecture to capture both spatial and spectral features from HSI. In the spatial branch, a snake-like spatial scanning strategy is designed to preserve locality during data transformation. To compensate for Mamba's limitations in global feature capture, we introduce a spatial intra-scale Transformer module, which uses a multi-head attention mechanism to enhance the extraction of global spatial information. For the spectral branch, we employ a spectral grouping strategy for efficient local modeling. This is combined with a spectral intra-scale Transformer to capture multi-dimensional global spectral context. Finally, a spatial-spectral fusion module fuses the spatial and spectral features using learnable weights. Extensive experiments on several public datasets demonstrate that S2MFormer significantly outperforms existing methods in classification accuracy, validating the proposed approach. Code is available at https://github.com/hzw123456663592/S2MFormer/tree/master.
Digital Signal Processing, vol. 174, Article 105951. Citations: 0
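To make the idea of a snake-like scan concrete, the sketch below flattens a 2-D patch in boustrophedon order (left to right on even rows, right to left on odd rows) and compares it with an ordinary raster scan. It is an illustrative assumption about what "snake-like" means here; the scan actually used in S2MFormer may differ, and the function names are hypothetical.

```python
def raster_scan(patch):
    """Flatten a 2-D grid row by row, always left to right."""
    return [x for row in patch for x in row]


def snake_scan(patch):
    """Flatten a 2-D grid row by row, reversing the direction on every other row."""
    seq = []
    for r, row in enumerate(patch):
        seq.extend(row if r % 2 == 0 else reversed(row))
    return seq


def max_step(coords):
    """Largest Manhattan distance between consecutive (row, col) positions in a sequence."""
    return max(abs(r1 - r0) + abs(c1 - c0)
               for (r0, c0), (r1, c1) in zip(coords, coords[1:]))


if __name__ == "__main__":
    grid = [[(r, c) for c in range(4)] for r in range(3)]
    print(snake_scan([[0, 1, 2], [3, 4, 5], [6, 7, 8]]))
    print(max_step(raster_scan(grid)), max_step(snake_scan(grid)))
```

In the snake order every pair of consecutive tokens is spatially adjacent, whereas a raster scan jumps across the full patch width at each row boundary. That adjacency is the locality property such a scan is meant to preserve for a sequence model.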
ASCF-RTDETR: Adaptive scale collaborative feature learning for epithelial cell detection in multichannel fluorescence images
IF 3 | Zone 3 (Engineering & Technology) | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-04-15 | Epub Date: 2026-01-30 | DOI: 10.1016/j.dsp.2026.105954
Zhimin Lu, Qing Zhang, Boheng Tian, Fuhua Ge, Chenxi Mo, Rui Guo, Xianbin Duan, Chunming Guo, Pengfei Yu
Multichannel fluorescence imaging plays a pivotal role in cell type identification and pathological diagnosis. However, manual analysis of fluorescence images is prone to misdiagnoses and missed diagnoses. Although AI algorithms hold promise, current methods struggle to extract discriminative features, which compromises the accuracy of pathological analysis. This study proposes ASCF-RTDETR, a novel model for detecting epithelial cells in multichannel fluorescence images. ASCF-RTDETR incorporates an Adaptive Multi-Scale Collaborative Feature Fusion (AMFF) module, enabling comprehensive feature interaction through horizontal and vertical dual-path parallel propagation. This is complemented by a High-Efficiency Feature Upsampling Convolution (HFUC) and a Multi-Scale Convolution Block (MSCB), which enhance feature representation. Furthermore, a Dynamic Histogram Attention-based Intra-scale Feature Interaction (DHIFI) module is introduced, leveraging bin-wise and frequency-wise dual-path reconstruction to enhance cell boundary features. In addition, a lightweight Dual Convolution (DualConv) structure is integrated to reduce computational complexity and provide implicit regularization against imaging noise. Experiments on a self-constructed multichannel fluorescence-labeled epithelial cell dataset demonstrate ASCF-RTDETR's superior detection performance: it achieves 93.5% mAP50 and a 90.7% F1-score at nearly 50% lower computational cost than baseline models. The model also generalizes well across multiple public datasets, offering a reliable solution for automated epithelial cell detection and analysis.
Digital Signal Processing, vol. 174, Article 105954. Citations: 0
Exploring feature pyramid networks and feature fusion for generalized Deepfake detection
IF 3 | Zone 3 (Engineering & Technology) | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-04-01 | Epub Date: 2026-01-19 | DOI: 10.1016/j.dsp.2026.105945
Gaoming Yang, Biaohu Sun, Xiujun Wang
The accelerated progression of deepfake technologies has triggered a serious trust crisis and motivated numerous scholars to pursue effective methods for detecting such forgeries. However, current detection methods rely heavily on limited forgery cues and irrelevant information to boost intra-dataset performance, and they struggle with generalization and robustness in real-world scenarios. To tackle these problems, we design a Multi-Scale Feature Pyramid Network (MS-FPN) that focuses on forgery regions, and an altered-trace enhancement strategy to reveal more tampering artifacts. Specifically, the MS-FPN performs forgery-region segmentation during feature extraction, which counteracts the detector's reliance on forgery-irrelevant information and allows it to concentrate on the altered areas. Furthermore, a plug-and-play Cross-Feature Spatial Attention (CFSA) module is proposed to strengthen the constraints on high-level features. In addition, we develop a falsified-image re-mixing method that highlights more generalized artifacts by blending two augmented forgery images, while a Multi-level Feature Fusion (MLFF) module integrates multi-scale features, enabling the network to capture fine-grained local features. Extensive experiments on multiple public benchmarks demonstrate that the proposed method achieves superior cross-dataset and cross-manipulation generalization, with AUC scores of 93.22% on CDF2, 96.88% on UADFV, and 92.67% on DFD. Visualization results further confirm that our approach produces interpretable and reliable evidence for face forgery forensics. The code is available at https://github.com/Sun-researcher/SD-Net-main.
Digital Signal Processing, vol. 173, Article 105945. Citations: 0
Bit-level quad-block shuffling and sequential summing dispersing image encryption based on hyperchaotic 2D Euler Pi Crossed Sine Map
IF 3 | Zone 3 (Engineering & Technology) | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-04-01 | Epub Date: 2026-01-19 | DOI: 10.1016/j.dsp.2026.105941
Omer Kocak, Uğur Erkan, Ismail Babaoglu
Chaos-based image encryption methods strongly depend on the complexity and dynamic behavior of chaotic maps to achieve effective permutation and diffusion. In this study, a novel two-dimensional Euler Pi Crossed Sine (2D-EPICS) chaotic map is introduced, which exhibits hyperchaotic dynamics, wide chaotic ranges, and high sensitivity to initial conditions. The chaotic properties of the proposed map are rigorously analyzed using bifurcation diagrams, phase trajectories, Lyapunov exponents, and multiple entropy measures, including sample entropy, permutation entropy, Kolmogorov entropy, and C0 complexity, confirming its strong nonlinear behavior and unpredictability. Building upon this chaotic foundation, the Bit-Level Quad-Block Shuffling and Sequential Summing Dispersing Image Encryption (BQSSSD-IE) scheme is then developed. The encryption process consists of a bit-level permutation stage based on quadruple pixel blocks, followed by a bidirectional diffusion stage achieved through cumulative row-wise and column-wise summations, both driven by sequences generated from the 2D-EPICS map. Extensive security analyses and comparative evaluations demonstrate that the proposed method provides high entropy, low pixel correlation, strong resistance against statistical, differential, noise, and cropping attacks, and competitive computational efficiency. The enhanced dynamic behavior of the 2D-EPICS map significantly strengthens the overall confusion and diffusion capabilities of the encryption scheme, making BQSSSD-IE suitable for secure and real-time image protection applications.
Digital Signal Processing, vol. 173, Article 105941. Citations: 0
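The diffusion stage described above (cumulative row-wise and column-wise summation driven by a chaotic sequence) can be sketched as a pair of invertible running sums modulo 256. This is a simplified illustration, not the BQSSSD-IE scheme itself: the abstract does not give the 2D-EPICS equations, so a logistic map is used as a clearly labeled stand-in keystream, the bit-level permutation stage is omitted, and all function names and parameter values are hypothetical.

```python
def logistic_keystream(n, x0=0.3141, r=3.99):
    """Stand-in keystream of n bytes from the logistic map (NOT the 2D-EPICS map)."""
    x, out = x0, []
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)
    return out


def make_keys(height, width):
    """Split one keystream into per-row keys (height x width) and per-column keys (width x height)."""
    ks = logistic_keystream(2 * height * width)
    k_rows = [ks[r * width:(r + 1) * width] for r in range(height)]
    offset = height * width
    k_cols = [ks[offset + c * height:offset + (c + 1) * height] for c in range(width)]
    return k_rows, k_cols


def cumsum_mod(seq, key):
    """Running sum c[i] = (c[i-1] + seq[i] + key[i]) mod 256, starting from 0."""
    out, acc = [], 0
    for s, k in zip(seq, key):
        acc = (acc + s + k) % 256
        out.append(acc)
    return out


def cumsum_mod_inv(seq, key):
    """Invert cumsum_mod: seq[i] = (c[i] - c[i-1] - key[i]) mod 256."""
    out, prev = [], 0
    for c, k in zip(seq, key):
        out.append((c - prev - k) % 256)
        prev = c
    return out


def transpose(m):
    return [list(col) for col in zip(*m)]


def diffuse(img, k_rows, k_cols):
    """Row-wise running sums followed by column-wise running sums."""
    rows = [cumsum_mod(r, k) for r, k in zip(img, k_rows)]
    cols = [cumsum_mod(c, k) for c, k in zip(transpose(rows), k_cols)]
    return transpose(cols)


def undiffuse(cipher, k_rows, k_cols):
    """Undo the column pass first, then the row pass."""
    cols = [cumsum_mod_inv(c, k) for c, k in zip(transpose(cipher), k_cols)]
    return [cumsum_mod_inv(r, k) for r, k in zip(transpose(cols), k_rows)]
```

In this sketch a change to the top-left pixel propagates along its row in the first pass and down every column in the second, so it alters every ciphertext pixel. A pixel elsewhere only influences positions to its right and below, which is one reason practical schemes typically add passes in the opposite direction or further rounds.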
HAIR-GLMB: Hybrid appearance-IoU reinforced GLMB filter for UAV-based multi-target tracking
IF 3 | Zone 3 (Engineering & Technology) | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-04-01 | Epub Date: 2026-01-13 | DOI: 10.1016/j.dsp.2026.105906
Haiyi Tong, Dekang Zhu, Zhou Zhang
This paper presents HAIR-GLMB, a Hybrid Appearance and IoU Reinforced Generalized Labeled Multi-Bernoulli (GLMB) filter tailored for multi-target tracking in challenging unmanned aerial vehicle (UAV) scenarios. To address frequent association ambiguities caused by dense target distributions, we propose an adaptive hybrid cost matrix that integrates Intersection-over-Union (IoU) spatial cues with appearance similarity. Specifically, an entropy-based adaptive weighting mechanism dynamically balances spatial and appearance information, thereby enhancing association reliability. We further develop a reinforced likelihood computation within the GLMB recursion, explicitly embedding spatial and appearance information into the update process. A motion-aware adaptive survival probability model is also proposed, effectively sustaining track continuity for inward-moving targets near the boundaries of the camera’s field of view. To improve efficiency, the Gibbs sampler is initialized with an assignment obtained by the Hungarian algorithm on the hybrid cost matrix, placing the Markov chain near high-probability regions and reducing sampling overhead under a limited computational budget. Experiments on challenging UAV benchmarks (VisDrone2019, UAVDT) show that HAIR-GLMB consistently outperforms a GLMB baseline relying only on IoU, yielding higher tracking accuracy, fewer identity switches, and reduced fragmentation.
Digital Signal Processing, vol. 173, Article 105906. Citations: 0
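The hybrid cost matrix and its entropy-based weighting can be sketched as follows. The abstract does not state the weighting formula, so the rule below is an assumption: each cue's weight is proportional to one minus the mean normalized row entropy of its similarity matrix, so a cue that cannot tell detections apart contributes little. For brevity the optimal assignment is found by brute force over permutations, a stand-in that yields the same optimum as the Hungarian algorithm on a small square matrix; in practice a solver such as SciPy's `linear_sum_assignment` would be the usual choice. All function names are hypothetical.

```python
import math
from itertools import permutations


def mean_row_entropy(sim):
    """Mean normalized Shannon entropy of the rows (0 = peaked, 1 = flat)."""
    total = 0.0
    for row in sim:
        s = sum(row)
        if s <= 0 or len(row) < 2:
            total += 1.0  # no usable information: treat the row as maximally ambiguous
            continue
        h = -sum((x / s) * math.log(x / s) for x in row if x > 0)
        total += h / math.log(len(row))
    return total / len(sim)


def hybrid_cost(iou, app):
    """Blend IoU and appearance similarity with entropy-derived weights (assumed rule)."""
    conf_iou = max(0.0, 1.0 - mean_row_entropy(iou))
    conf_app = max(0.0, 1.0 - mean_row_entropy(app))
    total = conf_iou + conf_app
    w_iou = conf_iou / total if total > 0 else 0.5
    cost = [[1.0 - (w_iou * i + (1.0 - w_iou) * a) for i, a in zip(ri, ra)]
            for ri, ra in zip(iou, app)]
    return cost, w_iou


def best_assignment(cost):
    """Minimum-cost track-to-detection assignment for a small square cost matrix."""
    n = len(cost)
    return list(min(permutations(range(n)),
                    key=lambda p: sum(cost[r][p[r]] for r in range(n))))


if __name__ == "__main__":
    # Two overlapping targets: IoU is nearly uninformative, appearance is discriminative.
    iou = [[0.50, 0.52], [0.51, 0.49]]
    app = [[0.9, 0.1], [0.2, 0.8]]
    cost, w_iou = hybrid_cost(iou, app)
    print(w_iou, best_assignment(cost))
```

In this toy case an IoU-only cost would swap the two tracks, while the entropy weighting shifts almost all weight to the appearance cue and keeps the identities consistent. Such an assignment is the kind of starting point the abstract describes for initializing the Gibbs sampler.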
Physical layer security analysis and resource optimization for satellite-terrestrial multi-antenna systems
IF 3 | Zone 3 (Engineering & Technology) | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-04-01 | Epub Date: 2026-01-20 | DOI: 10.1016/j.dsp.2026.105949
Kexin Wang, Jian Zhang, Gang Xin, Jun Gao, Chengxian Ge, Yan Li
This paper proposes a multi-antenna-based physical layer secure communication scheme and conducts a quantitative analysis of its secret key rate (SKR). First, to meet the secure communication needs of long-distance terminals, a Rician fading multi-antenna secure communication model that conforms to the characteristics of satellite broadcast channels is established. Second, the upper and lower bounds of the SKR, as well as its asymptotic limit under large-scale eavesdropping scenarios, are derived, and the quantitative impacts of the antenna number ratio and channel gain ratio on the SKR limit are elucidated. To address the problem of limited antenna resources at the satellite, an optimal strategy for allocating antennas among legitimate terminals is further proposed. Simulations verify that tilting the antenna allocation toward legitimate terminals with more prominent channel advantages can maximize the SKR. Compared with uniform antenna allocation, the optimal allocation strategy can improve the SKR by up to 4.6% under certain scenario conditions. In addition, this strategy can achieve a positive SKR with only half the number of antennas. This scheme addresses the insufficient model adaptability and low resource efficiency of existing studies on satellite scenarios, providing a theoretical basis for the design of satellite secure communication systems.
提出了一种基于多天线的物理层安全通信方案,并对其密钥率(SKR)进行了定量分析。首先,为了满足远距离终端的保密通信需求,建立了一种符合卫星广播信道特点的多天线衰落保密通信模型。其次,推导了大规模窃听场景下SKR的上下界及其渐近极限,并阐明了天线数比和信道增益比对SKR极限的定量影响;针对卫星天线资源有限的问题,进一步提出了合法终端间天线数的优化分配策略。仿真结果表明,将天线数分配向具有更突出信道优势的合法终端倾斜可以使SKR最大化。与天线数均匀分配相比,在一定场景条件下,该优化分配策略可使SKR提高4.6%。此外,该策略可以实现一个正的SKR,只有一半的天线数量。该方案有效地解决了现有卫星场景研究中存在的模型适应性不足、资源效率低等问题,为卫星保密通信系统的设计提供了关键的理论依据。
Citations: 0
A true target signal extraction method for defending against dense false target jamming in multistatic radar systems
IF 3 Zone 3 (Engineering Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-04-01 Epub Date: 2026-01-13 DOI: 10.1016/j.dsp.2026.105910
Dingli Lou, Tuo Fu, Defeng Chen, Huawei Cao
Dense false target jamming (DFTJ) is a typical form of active jamming that generates numerous false targets along the radar line of sight, significantly degrading the detection and tracking performance of radar systems. In multistatic radar systems with spatially separated receivers, jamming signals originating from the same source become highly correlated across various receivers after compensating for their delay and Doppler frequency differences, whereas true target echoes remain weakly correlated because of varying observation geometries. On the basis of these differences, we propose a method for extracting true target signals from jammed echoes. First, the jamming signals are aligned across different receivers by compensating for their amplitude, delay, and Doppler frequency differences. The compensated and pulse-compressed echoes are then stacked into a signal matrix, where the false targets remain nearly invariant across different columns and thus form a low-rank component, while the true targets exhibit amplitude, delay, and Doppler frequency variations, manifesting as sparse high-rank components. Based on this structural distinction, we formulate a robust principal component analysis problem for extracting the true target signals and solve it using the block coordinate descent approach. To satisfy real-time processing demands, we further develop a sequential processing-based version of the proposed method. The numerical simulation results demonstrate the effectiveness of the proposed method, which shows stable performance under different DFTJ strategies, jamming parameters and target characteristics.
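The low-rank plus sparse split described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the relaxed objective 0.5*||M - L - S||_F^2 + tau*||L||_* + lam*||S||_1 and alternates exact updates of the two blocks L and S, which is one simple form of block coordinate descent. The function names, the toy data (a jamming profile repeated across 8 receivers plus one displaced target per receiver), and the values of tau and lam are all hypothetical.

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: exact minimizer of the L block."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vh

def soft(X, lam):
    """Entrywise soft threshold (real or complex): exact minimizer of the S block."""
    mag = np.abs(X)
    return X * np.maximum(1.0 - lam / np.maximum(mag, 1e-12), 0.0)

def rpca_bcd(M, tau=1.0, lam=0.3, n_iter=100):
    """Alternate the two block updates for a fixed number of sweeps."""
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        L = svt(M - S, tau)   # low-rank block (assumed: aligned false targets)
        S = soft(M - L, lam)  # sparse block (assumed: true targets)
    return L, S

# Toy data (assumed): rows are range bins, columns are receivers.
n_bins, n_rx = 64, 8
jam = np.cos(0.3 * np.arange(n_bins))      # identical in every column
L0 = np.outer(jam, np.ones(n_rx))          # rank-1 jamming component
rows = 10 + 5 * np.arange(n_rx)            # target delay differs per receiver
cols = np.arange(n_rx)
S0 = np.zeros((n_bins, n_rx))
S0[rows, cols] = 6.0                       # one target return per receiver
M = L0 + S0

L, S = rpca_bcd(M)
target_mask = np.zeros(M.shape, dtype=bool)
target_mask[rows, cols] = True
```

In this toy case the large entries of `S` sit at the injected target positions and `L` keeps the repeated jamming profile. Real echoes are complex-valued and noisy, so the thresholds would need tuning.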
Citations: 0
MSFI Net Pro: A tiny slender crack detection model with stronger feature utilization capability
IF 3 Zone 3 (Engineering Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-04-01 Epub Date: 2026-01-20 DOI: 10.1016/j.dsp.2026.105931
Fan Yang, Junzhou Huo, Zhenxiang Guan, Hua Li, Zhang Cheng
To address the limitations of conventional object detection neck structures, particularly the insufficient utilization of deep features and inadequate multi-scale interaction in micro-crack detection, this paper introduces the Multi-scale Feature Stereo Interaction Network, MSFI Net (Pro). At its core lies the Multi-scale Stereo Feature Extraction (MSFE) Block, which constructs parallel spatial interaction pathways across three adjacent scales to integrate feature maps from shallow, intermediate, and deep layers. It also introduces a gradient enhancement mechanism through dual hybrid residual connections, preserving gradient flow and feature integrity. MSFI Net (Pro) further uses a dual-stage fusion pipeline that cascades two MSFE Blocks, the first for coarse refinement and the second for fine-tuning of features. This works together with dense cross-layer connectivity to strengthen information propagation. Moreover, the network incorporates shallower P2-layer feature maps, injecting less noisy geometric information that significantly improves the recognition of slender cracks. Validation on enhanced micro-crack datasets and the Severstal steel defect dataset shows that MSFI Net (Pro) consistently improves baseline models. Specifically, under micro-crack test conditions, it improves AP50-95 by 0.144 for YOLOv11n while also raising recall and prediction confidence for micro-crack targets. Compared with mainstream SOTA neck-optimized models, MSFI Net (Pro) maintains significant advantages in detection precision, classification accuracy, and localization efficacy.
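As a rough illustration of why slender targets are demanding under an AP50-95 style metric, the sketch below counts how many IoU thresholds a prediction clears when it is shifted by one pixel. It assumes the COCO-style convention that AP50-95 averages over IoU thresholds 0.50 to 0.95 in steps of 0.05, which the abstract does not spell out. The box sizes and helper names are hypothetical, and this is not the paper's evaluation code.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

# Thresholds in percent (50, 55, ..., 95) to avoid float-step rounding.
THRESHOLDS_PCT = range(50, 100, 5)

def thresholds_passed(a, b):
    """Number of IoU thresholds at which prediction b still matches box a."""
    score = 100.0 * iou(a, b)
    return sum(score >= t for t in THRESHOLDS_PCT)

# Hypothetical boxes: the same 1-pixel vertical shift on two shapes.
slender_gt, slender_pred = (0, 0, 100, 5), (0, 1, 100, 6)      # 100 x 5 crack
square_gt, square_pred = (0, 0, 100, 100), (0, 1, 100, 101)    # 100 x 100 object

print(thresholds_passed(slender_gt, slender_pred))  # 4 of 10
print(thresholds_passed(square_gt, square_pred))    # 10 of 10
```

The same one-pixel error costs the slender box six of the ten thresholds (IoU of 2/3), while the square box still clears all of them (IoU of about 0.98).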
Citations: 0
Research on infrared small target detection technology based on DCS-YOLO algorithm
IF 3 Zone 3 (Engineering Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-04-01 Epub Date: 2026-01-08 DOI: 10.1016/j.dsp.2026.105898
Meng Yin, Binghe Sun, Rugang Wang, Yuanyuan Wang, Feng Zhou, Xuesheng Bian
To address the weak features of infrared small targets, their susceptibility to complex background interference, and the high computational cost of existing specialized detection models, this paper proposes the Dual-Domain Fusion and Class-Aware Self-supervised YOLO (DCS-YOLO). The framework leverages dual-domain feature fusion and class-aware self-supervised learning for semantic enhancement. During feature extraction, a Class-aware Self-supervised Semantic Fusion Module (CSSFM) uses a class-aware self-supervised architecture as a deep semantic guide to generate discriminative semantic features, thereby enhancing the perception of faint target characteristics. Additionally, a Dual-domain Aware Enhancement Module (A2C2f_DDA) is designed, which analyzes the high-frequency components of small targets and employs a complementary spatial-frequency domain feature fusion strategy to sharpen feature capture while suppressing background clutter. For feature upsampling and fusion, a Multi-dimensional Selective Feature Pyramid Network (MSFPN) employs a cooperative selection mechanism over the frequency, spatial, and channel dimensions, integrated with deep semantic information, to enhance feature integration across dimensions and improve detection performance in complex scenes. Furthermore, lightweight components including GSConv, VoVGSCSP, and LSCD-Detect are incorporated to reduce computational complexity and model parameters. Comprehensive evaluations on the IRSTD-1K, RealScene-ISTD, and SIRST-v2 datasets demonstrate the effectiveness of the proposed algorithm, achieving scores of 80.7%, 90.2%, and 93.3%, respectively. The results indicate that the algorithm effectively uses frequency-domain analysis and semantic enhancement, providing an efficient solution for infrared small target detection in complex scenarios while maintaining a favorable balance between accuracy and computational cost.
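The idea that small targets live in the high-frequency part of the spectrum while smooth background lives in the low-frequency part can be shown with a plain FFT high-pass filter. This is only a sketch of that general principle, not the A2C2f_DDA module or any part of DCS-YOLO. The function name, the synthetic image, and the cutoff radius are assumptions made for illustration.

```python
import numpy as np

def fft_highpass(img, radius):
    """Zero all 2D frequency bins within `radius` of DC and invert."""
    n_rows, n_cols = img.shape
    fy = np.fft.fftfreq(n_rows) * n_rows   # integer frequency indices
    fx = np.fft.fftfreq(n_cols) * n_cols
    keep = (fy[:, None] ** 2 + fx[None, :] ** 2) > radius ** 2
    return np.real(np.fft.ifft2(np.fft.fft2(img) * keep))

# Synthetic scene (assumed): smooth background plus a one-pixel target.
N = 64
x = np.arange(N)
background = (10.0 + 5.0 * np.cos(2 * np.pi * x / N))[None, :] * np.ones((N, 1))
image = background.copy()
image[20, 40] += 3.0   # in the raw image this pixel is not the brightest one

filtered = fft_highpass(image, radius=3)
peak = np.unravel_index(np.argmax(filtered), filtered.shape)
```

After filtering, the slowly varying background is removed and the brightest pixel is the injected target at row 20, column 40. A learned spatial-frequency fusion would replace the fixed cutoff with trainable weights.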
Citations: 0