
Digital Signal Processing: Latest Articles

Enhancing secrecy performance in multi-user multi-antenna systems using rate-splitting multiple access
IF 3.0 · CAS Tier 3 (Engineering & Technology) · JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2026-01-20 · DOI: 10.1016/j.dsp.2026.105943
Su Nguyen Quoc, Phan Van Tri, Ba Cao Nguyen, Bui Vu Minh, Nguyen Huu Khanh Nhan
This paper proposes the use of multiple antennas (MA) and rate-splitting multiple access (RSMA) to enhance the secrecy performance of a multi-user (MU) wireless system under practical constraints, including imperfect successive interference cancellation (iSIC) and imperfect channel state information (iCSI). Closed-form and analytical expressions of the secrecy outage probability (SOP) and ergodic secrecy capacity (ESC) of both common and private messages are obtained and validated through extensive Monte-Carlo simulations. The results reveal several key insights into the influence of system parameters on secrecy performance. In particular, it is shown that iCSI at the eavesdropper can significantly enhance secrecy, especially when the estimation error is large, as it limits the eavesdropper’s decoding capability. Furthermore, increasing the number of transmit antennas substantially improves the SOP and ESC of private messages due to enhanced spatial diversity. However, the ESC of the common message does not always benefit from additional antennas; instead, it reaches an optimal value beyond which performance may degrade due to signal leakage or increased interference. Additionally, the impacts of imperfect SIC, power allocation, bandwidth, and operating frequency on the SOP and ESC are thoroughly examined. The findings highlight the importance of jointly tuning key system parameters, such as transmit power, bandwidth, frequency allocation, and antenna configuration, while accounting for CSI imperfections to ensure secure and reliable RSMA-based communication.
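The Monte-Carlo validation of the secrecy outage probability (SOP) described above can be illustrated with a drastically simplified sketch: a single-antenna Rayleigh wiretap channel with one legitimate receiver and one eavesdropper. The function name, the channel model, and all parameters are illustrative assumptions and do not reproduce the paper's MU-MIMO RSMA system with iSIC/iCSI.

```python
import numpy as np

def secrecy_outage_probability(snr_m_db, snr_e_db, rate_s, trials=200_000, seed=0):
    """Monte Carlo SOP estimate for a single-antenna Rayleigh wiretap channel.

    A hypothetical simplification of the paper's model: the instantaneous
    secrecy capacity is C_s = max(log2(1+SNR_main) - log2(1+SNR_eve), 0),
    and the SOP is the fraction of fading realizations with C_s < rate_s.
    """
    rng = np.random.default_rng(seed)
    # Rayleigh fading -> exponentially distributed channel power gains
    snr_m = 10 ** (snr_m_db / 10) * rng.exponential(size=trials)
    snr_e = 10 ** (snr_e_db / 10) * rng.exponential(size=trials)
    c_s = np.maximum(np.log2(1 + snr_m) - np.log2(1 + snr_e), 0.0)
    return float(np.mean(c_s < rate_s))
```

As expected, raising the legitimate link's average SNR drives the estimated SOP down for a fixed secrecy rate.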
Citations: 0
Wavelet-guided multi-scale edge fusion network for aerial object detection
IF 3.0 · CAS Tier 3 (Engineering & Technology) · JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2026-01-20 · DOI: 10.1016/j.dsp.2026.105946
Xiaoruo Li, Hongwei Ding, Yuanjing Zhu
A fundamental challenge in aerial image object detection lies in accurately identifying multi-scale features within complex backgrounds, characterized by substantial scale variations and inconsistent object distribution. However, existing approaches frequently fail to effectively incorporate edge information, which is critical for precise object localization and remains a major obstacle to improving detection accuracy in aerial imagery. To address this challenge, we propose the Wavelet-guided Multi-scale Edge Fusion Network (WMEFNet) for Aerial Object Detection. Our method begins with the construction of a Feature Edge Perception Backbone Network (FEPBN), in which an edge extractor is embedded into the shallow layers to enhance fine-grained feature representations through a cross-channel fusion strategy. Subsequently, we introduce the Wavelet-Context Fusion Pyramid Network (WCFPN), which integrates edge-aware cues and semantic features from diverse receptive fields, thereby improving the model’s contextual understanding and its adaptability to scale and resolution variations. Furthermore, we design the Wavelet Upsampling Feature Fusion Module (WUFF) and the Wavelet Downsampling Module (WDM), which minimize information loss during sampling operations, enhance the model’s sensitivity to small targets, and preserve crucial edge details. Collectively, the proposed architecture substantially enhances the model’s capability to capture and fuse multi-scale edge features. Extensive experiments show that WMEFNet improves mAP50 by 2.2% (39.1% vs. 36.9%) over RT-DETR on the VisDrone2019-test dataset while maintaining real-time performance. Further results on multiple benchmarks confirm its high accuracy, efficiency, and practical utility for aerial object detection.
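The Wavelet Downsampling Module (WDM) idea of minimizing information loss during sampling can be made concrete with a single-level 2D Haar transform: resolution is halved, but all four subbands, including the edge-carrying LH/HL/HH details, are kept as channels, so the operation is exactly invertible, unlike strided pooling. The helper names and the inverse below are assumptions, not the authors' implementation.

```python
import numpy as np

def haar_downsample(x):
    """Lossless Haar downsampling: (H, W) -> (4, H/2, W/2) subband stack."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]; c = x[1::2, 0::2]; d = x[1::2, 1::2]
    ll = (a + b + c + d) / 2   # approximation
    lh = (a - b + c - d) / 2   # horizontal detail (vertical edges)
    hl = (a + b - c - d) / 2   # vertical detail (horizontal edges)
    hh = (a - b - c + d) / 2   # diagonal detail
    return np.stack([ll, lh, hl, hh])

def haar_upsample(sub):
    """Exact inverse of haar_downsample: (4, H/2, W/2) -> (H, W)."""
    ll, lh, hl, hh = sub
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2
    return x
```

Because the round trip is exact, a network can trade spatial size for channels without discarding the fine edge detail the paper argues is critical for small aerial targets.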
Citations: 0
ME-DETR: A multi-scale enhanced detection Transformer with low-quality query filter denoising for aerial oriented object detection
IF 3.0 · CAS Tier 3 (Engineering & Technology) · JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2026-01-20 · DOI: 10.1016/j.dsp.2026.105932
Shuai Shi, Li Zhang
The challenge of oriented object detection in aerial images stems from the arbitrary orientations, dense distribution, and large scale variations of objects. Although recent models based on DEtection TRansformer (DETR) in an end-to-end manner have achieved excellent performance for oriented object detection, they suffer from slow inference speed. To address this issue, this study proposes a Multi-scale Enhanced DETR (ME-DETR) to achieve efficient and effective oriented object detection for aerial images. ME-DETR is an end-to-end detection model that consists of three parts: backbone, encoder and decoder. For the encoder part, we design a novel multi-scale enhanced (ME) encoder that can effectively and efficiently fuse multi-scale features. The ME encoder mainly contains three modules related to multi-scale information fusion: Fine-grained Enhanced Intra-scale Feature Interaction (FEIFI), Multi-scale Feature Fusion (MFF), and Multi-receptive Field Feature Extraction (MRFE). Specifically, the FEIFI module combines low-level features to enrich the intra-scale feature interaction process and then outputs features with abundant fine-grained information; the MFF module implements the multi-scale feature fusion, effectively enhancing the detailed information in high-level features and reducing background interference; the MRFE module effectively utilizes convolutions of different sizes to extract features with rich multi-scale information. To further enhance performance without affecting inference speed, we present a training scheme of Low-quality Query Filter DeNoising (LQFDN), which adaptively filters out low-quality denoised positive queries. Extensive experiments are conducted on three oriented object detection datasets (DOTA-v1.0, DOTA-v1.5 and DIOR-R). Specifically, when ResNet50 is used as the backbone, ME-DETR achieves 78.35% mAP on DOTA-v1.0 at a speed of 15.2 FPS, and 71.28% mAP on DIOR-R at a speed of 18.2 FPS.
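The multi-receptive-field idea behind MRFE can be sketched as parallel filters of different kernel sizes whose outputs are concatenated along the channel axis. The paper's module uses learned convolutions; the box (mean) filters below are a stand-in assumption chosen so the sketch stays dependency-free.

```python
import numpy as np

def mrfe(x, sizes=(1, 3, 5)):
    """Multi-receptive-field extraction sketch.

    x: (C, H, W) feature map -> (C * len(sizes), H, W), where each group of C
    channels is the input smoothed with a k x k box filter (k in `sizes`).
    Learned kernels are replaced by mean filters purely for illustration.
    """
    outs = []
    for k in sizes:
        p = k // 2
        xp = np.pad(x, ((0, 0), (p, p), (p, p)), mode="edge")
        out = np.empty(x.shape, dtype=float)
        for c in range(x.shape[0]):
            # all k x k windows of the padded channel, then their mean
            win = np.lib.stride_tricks.sliding_window_view(xp[c], (k, k))
            out[c] = win.mean(axis=(-1, -2))
        outs.append(out)
    return np.concatenate(outs, axis=0)
```

The k = 1 branch is the identity, so the original features survive alongside the progressively larger-context views.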
Citations: 0
Multi-scale dilated fusion attention for CLIP-based person re-identification
IF 3.0 · CAS Tier 3 (Engineering & Technology) · JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2026-01-20 · DOI: 10.1016/j.dsp.2026.105939
Zilong Li, Jing Zhang, Jiashuai Xiao
Cross-modal learning models like Contrastive Language-Image Pre-training (CLIP) have demonstrated remarkable performance in various downstream tasks. However, applying CLIP to person re-identification (ReID) reveals key limitations, particularly its emphasis on global semantic features while neglecting fine-grained local features and spatial relationships critical for distinguishing identities. To overcome these challenges, we propose Multi-Scale Dilated Fusion Attention (MDFA), a novel framework that enhances the CLIP visual encoder with spatial and channel attention mechanisms combined with global context modeling and multi-scale dilated convolutions. By integrating multiple dilation rates, MDFA effectively aggregates information across varied receptive fields, enabling the model to gather fine-grained local details alongside broader contextual information. This design allows the model to capture richer identity cues and better handle complex scenarios such as occlusion and background clutter, effectively addressing the lack of local discrimination and contextual awareness in CLIP-based ReID models. Extensive experiments demonstrate that MDFA achieves superior performance over existing methods, offering a robust and scalable solution for real-world ReID applications such as surveillance and autonomous driving.
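The spatial and channel attention mechanisms the abstract describes can be sketched as a minimal channel-then-spatial reweighting pass. This is a CBAM-style illustration under assumed pooling choices, not the MDFA design itself, and it omits the learned projections a real module would contain.

```python
import numpy as np

def channel_spatial_attention(x):
    """Sequential channel and spatial attention over a (C, H, W) feature map.

    Channel weights come from global average- and max-pooled descriptors;
    spatial weights from cross-channel average and max maps. All pooling
    choices here are illustrative assumptions.
    """
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # channel attention: one scalar gate per channel
    ca = sigmoid(x.mean(axis=(1, 2)) + x.max(axis=(1, 2)))   # (C,)
    x = x * ca[:, None, None]
    # spatial attention: one scalar gate per location
    sa = sigmoid(x.mean(axis=0) + x.max(axis=0))             # (H, W)
    return x * sa[None, :, :]
```

Because every gate lies in (0, 1), the pass can only suppress, never amplify, activations, which is why such modules are typically paired with residual connections in practice.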
Citations: 0
Bit-level quad-block shuffling and sequential summing dispersing image encryption based on hyperchaotic 2D Euler Pi Crossed Sine Map
IF 3.0 · CAS Tier 3 (Engineering & Technology) · JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2026-01-19 · DOI: 10.1016/j.dsp.2026.105941
Omer Kocak, Uğur Erkan, Ismail Babaoglu
Chaos-based image encryption methods strongly depend on the complexity and dynamic behavior of chaotic maps to achieve effective permutation and diffusion. In this study, a novel two-dimensional Euler Pi Crossed Sine (2D-EPICS) chaotic map is introduced, which exhibits hyperchaotic dynamics, wide chaotic ranges, and high sensitivity to initial conditions. The chaotic properties of the proposed map are rigorously analyzed using bifurcation diagrams, phase trajectories, Lyapunov exponents, and multiple entropy measures, including sample entropy, permutation entropy, Kolmogorov entropy, and C0 complexity, confirming its strong nonlinear behavior and unpredictability. Building upon this chaotic foundation, the Bit-Level Quad-Block Shuffling and Sequential Summing Dispersing Image Encryption (BQSSSD-IE) scheme is then developed. The encryption process consists of a bit-level permutation stage based on quadruple pixel blocks, followed by a bidirectional diffusion stage achieved through cumulative row-wise and column-wise summations, both driven by sequences generated from the 2D-EPICS map. Extensive security analyses and comparative evaluations demonstrate that the proposed method provides high entropy, low pixel correlation, strong resistance against statistical, differential, noise, and cropping attacks, and competitive computational efficiency. The enhanced dynamic behavior of the 2D-EPICS map significantly strengthens the overall confusion and diffusion capabilities of the encryption scheme, making BQSSSD-IE suitable for secure and real-time image protection applications.
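The chaotic keystream-plus-diffusion pattern underlying such schemes can be shown with a toy cipher. The 2D-EPICS map equations are not given in the abstract, so the classic logistic map stands in as the keystream source (an explicit assumption); the XOR diffusion stage is self-inverse, so applying it twice with the same key recovers the plaintext, mirroring the symmetric decryption path.

```python
import numpy as np

def logistic_keystream(n, x0, r=3.99):
    """Byte keystream from the logistic map x <- r*x*(1-x).

    Stand-in for the paper's 2D-EPICS map, whose equations are not
    reproduced here; x0 in (0, 1) acts as the secret key.
    """
    xs = np.empty(n)
    x = x0
    for i in range(n):
        x = r * x * (1 - x)
        xs[i] = x
    return (xs * 256).astype(np.uint8)   # quantize (0,1) to bytes

def xor_diffuse(img, key):
    """XOR the image bytes with the chaotic keystream (self-inverse)."""
    ks = logistic_keystream(img.size, x0=key)
    return (img.ravel() ^ ks).reshape(img.shape)
```

A real scheme would add the permutation stage and key-dependent summation diffusion described in the abstract; this sketch only demonstrates the keystream/diffusion skeleton.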
Citations: 0
Exploring feature pyramid networks and feature fusion for generalized Deepfake detection
IF 3.0 · CAS Tier 3 (Engineering & Technology) · JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2026-01-19 · DOI: 10.1016/j.dsp.2026.105945
Gaoming Yang, Biaohu Sun, Xiujun Wang
The accelerated progression of deepfake technologies has triggered a serious trust crisis and motivated numerous scholars to pursue effective methods for detecting such forgeries. However, current detection methods heavily rely on limited forgery cues and irrelevant information to boost intra-dataset performance, and they struggle with generalization and robustness in real-world scenarios. To tackle these problems, we design a Multi-Scale Feature Pyramid Network (MS-FPN) that focuses on forgery regions, and an altered-trace enhancement strategy to reveal more tampering artifacts. Specifically, the MS-FPN performs forgery-region segmentation during feature extraction, which counteracts the detector’s reliance on forgery-irrelevant information and allows it to concentrate on more altered areas. Furthermore, a plug-and-play Cross-Feature Spatial Attention (CFSA) module is proposed to strengthen the constraints on high-level features. In addition, we develop the falsified images re-mixing method to highlight more generalized artifacts by blending two augmented forgery images, while a Multi-level Feature Fusion (MLFF) module is utilized to integrate multi-scale features, enabling the network to capture fine-grained local features. Extensive experiments on multiple public benchmarks demonstrate that the proposed method achieves superior cross-dataset and cross-manipulation generalization, achieving AUC scores of 93.22% on CDF2, 96.88% on UADFV, and 92.67% on DFD. Visualization results further confirm that our approach produces interpretable and reliable evidence for face forgery forensics. The code is available at https://github.com/Sun-researcher/SD-Net-main
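The falsified-images re-mixing augmentation can be sketched as a soft random-mask blend of two forgery images. The exact mask construction used in the paper is not specified, so the low-frequency mask below (coarse noise upsampled by block repetition) is an assumption.

```python
import numpy as np

def remix_forgeries(img_a, img_b, block=8, seed=0):
    """Blend two augmented forgery images with a soft low-frequency mask.

    A sketch of the re-mixing idea only: the mask resolution, normalization,
    and blend rule are all illustrative choices, not the authors' recipe.
    img_a, img_b: float arrays of identical shape (H, W) or (H, W, C).
    """
    rng = np.random.default_rng(seed)
    h, w = img_a.shape[:2]
    # coarse noise grid, upsampled by repeating each cell into a block
    coarse = rng.random((h // block + 1, w // block + 1))
    mask = np.kron(coarse, np.ones((block, block)))[:h, :w]
    mask = (mask - mask.min()) / (mask.max() - mask.min() + 1e-8)
    if img_a.ndim == 3:
        mask = mask[..., None]
    # per-pixel convex combination of the two forgeries
    return mask * img_a + (1 - mask) * img_b
```

Since the mask lies in [0, 1], every output pixel stays between the two source pixels, so the blend introduces mixed tampering traces without leaving the valid intensity range.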
Citations: 0
Progressive Enhancement method for Low-light Images via Self-supervised Learning
IF 3.0 · CAS Tier 3 (Engineering & Technology) · JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2026-01-19 · DOI: 10.1016/j.dsp.2026.105933
Wenxia Bao, Wentao Guo, Zonghao Tian, Nian Wang
Self-supervised learning provides a data-efficient paradigm for Low-Light Image Enhancement (LLIE) by alleviating the dependence on labeled supervision. However, existing approaches commonly formulate enhancement as a coupled single-stage mapping, which limits their ability to disentangle illumination distortions from intrinsic structural features. This entanglement often leads to local overexposure and the loss of high-frequency details, undermining performance in precision-critical applications such as biometrics and forensics. To overcome this limitation, we propose a Local-Global Progressive Enhancement Network (LGPENet) driven by a Composite Self-Supervision Strategy with internal Statistical Anchors, which together enforce a principled Constraint-and-Restore mechanism. The enhancement process is spatially decoupled into two progressive stages. First, the Multi-scale Adaptive Illumination Enhancement (Ms-AIE) module acts as a local illumination injector, integrating Swin Transformer–based global context with Residual Block-based local representations to recover visibility in severely under-illuminated regions. Subsequently, the Adaptive Dynamic Luminance Enhancement (ADLE) module performs global exposure alignment by regulating the overall dynamic range through pixel-wise compensation, effectively preventing over-enhancement. Extensive experiments on both standard benchmarks and domain-specific datasets demonstrate that LGPENet achieves competitive or superior performance compared to state-of-the-art methods, particularly in preserving the structural fidelity required for robust semantic and forensic analysis.
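A loose, non-learned analogue of the global exposure alignment the ADLE module performs: choose a single gamma that maps the image's mean luminance toward a target, then clip. The target value and the closed-form gamma are illustrative assumptions; the paper's module instead learns pixel-wise compensation.

```python
import numpy as np

def adaptive_luminance(img, target=0.5):
    """Global gamma correction driven by mean luminance.

    img: float array in [0, 1]. Solves mean**gamma == target for gamma, so a
    uniformly dark image is brightened and an already-balanced one is left
    nearly unchanged. A sketch, not the learned ADLE compensation.
    """
    mean = float(img.mean())
    mean = min(max(mean, 1e-4), 1 - 1e-4)      # guard the logarithm
    gamma = np.log(target) / np.log(mean)       # mean**gamma == target
    return np.clip(img ** gamma, 0.0, 1.0)
```

For a dark input (mean near 0.1) the derived gamma is about 0.3, lifting shadows while leaving the ordering of pixel intensities, and hence structure, intact.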
"Progressive Enhancement method for Low-light Images via Self-supervised Learning" by Wenxia Bao, Wentao Guo, Zonghao Tian, Nian Wang. Digital Signal Processing, Vol. 173, Article 105933 (2026-01-19). DOI: 10.1016/j.dsp.2026.105933
Citations: 0
Blind seismic denoising via ensemble iterative data refinement with adaptive spectral-spatial feature fusion
IF 3 CAS Zone 3 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2026-01-19 DOI : 10.1016/j.dsp.2026.105936
Zhanzhan Shi, Guo Huang, Huailai Zhou, Su Pang, Yuanjun Wang
Attenuating random noise without access to clean training targets or precise noise models remains a significant challenge for seismic data processing. This paper proposes the ensemble iterative data refinement (EIDR) framework for robust blind noise suppression, featuring three methodological innovations: 1) J-invariant ensemble learning, which integrates intermediate denoised estimations with raw noisy inputs to enable training free from explicit noise distribution assumptions, thereby eliminating reliance on predefined noise models; 2) Trainable fractional Fourier transform (FrFT) embedding layers that replace conventional convolutional blocks, facilitating adaptive frequency-spatial feature fusion through learnable fractional orders; and 3) A structure-preserving U-shape architecture (ULite) utilizing dual-stream discrete wavelet transform (DWT) pooling to preserve critical high-frequency microstructural information during downsampling. EIDR was rigorously evaluated on synthetic Marmousi data contaminated with non-stationary noise and the Mobil AVO Viking Graben field dataset. On synthetic data, EIDR achieved an output SNR of 42.755 dB, surpassing state-of-the-art (SOTA) self-supervised benchmarks by up to 17.965 dB and closing 92.2% of the performance gap compared to fully supervised models. Field validation confirmed that EIDR effectively suppresses complex unknown noise while preserving structural fidelity and amplitude integrity. The framework demonstrates significant practical feasibility, achieving a processing speed of 0.193 ms per 64 × 64 patch on an NVIDIA RTX 3090 GPU. These results establish EIDR as a highly effective and practical solution for blind seismic denoising under realistic constraints.
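A minimal single-level Haar DWT in numpy can illustrate the dual-stream pooling idea from point 3: the low-pass band downsamples the feature map by 2, while the three high-frequency sub-bands are carried alongside rather than discarded (unlike max-pooling). This is a sketch of the concept only; the paper's ULite architecture and its trainable FrFT layers are not reproduced here.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2-D Haar DWT: returns (LL, (LH, HL, HH)).

    A minimal stand-in for the DWT pooling described for ULite.
    """
    a = (x[0::2, :] + x[1::2, :]) / 2.0      # vertical average
    d = (x[0::2, :] - x[1::2, :]) / 2.0      # vertical detail
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0     # low-pass band
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, (lh, hl, hh)

def dwt_pool(x):
    """Downsample by 2 while keeping the high-frequency bands as a
    second stream, so microstructural detail survives the pooling."""
    ll, highs = haar_dwt2(x)
    return ll, np.stack(highs)               # (H/2, W/2) and (3, H/2, W/2)

patch = np.arange(16.0).reshape(4, 4)
ll_band, hf_bands = dwt_pool(patch)
print(ll_band.shape, hf_bands.shape)         # prints (2, 2) (3, 2, 2)
```

Because the averaging is normalized, the LL band preserves the patch mean exactly, which is one reason DWT pooling degrades amplitude information less than decimation.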
Digital Signal Processing, Vol. 173, Article 105936 (2026-01-19).
Citations: 0
EMD-YOLOv8: A road pedestrian detection algorithm based on improved YOLOv8
IF 3 CAS Zone 3 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2026-01-19 DOI : 10.1016/j.dsp.2026.105940
Zhuangzhuang Bao, Wenhua Han, Yuchen Pan
As autonomous driving technology advances, pedestrian detection has become a critical task for ensuring road safety. However, in low-light and pedestrian-dense environments, current pedestrian detection algorithms often fail to meet the accuracy requirements for practical applications. To enhance detection accuracy, this paper presents the EMD-YOLOv8, an improved pedestrian detection algorithm. First, to enhance the detail representation of the input images, a Multi-Scale Retinex with Color Restoration algorithm is introduced to optimize the dataset. Next, an enhanced residual block is proposed as a replacement for the redundant BottleNeck structure in the original C2f module, which improves multi-scale object detection capability by integrating high-frequency information with local features. Additionally, a Multi-Scale Spatial Recalibration Network is proposed to dynamically adjust local details and global context features, with the goal of improving feature representation. Finally, a detail enhanced detection head is designed to improve small-object detection performance by shared convolutional parameters and integrating cross-layer feature fusion. Experiments show that the EMD-YOLOv8 algorithm reduces parameters by 47.3% compared to YOLOv8s, while increasing P, R, mAP50, and mAP50-95 by 2.2%, 5.7%, 7.5%, and 4.9%, respectively. The improved algorithm presented in this paper not only effectively addresses the issues of missed detections and false detections but also reduces the parameter count.
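Multi-Scale Retinex is a classical preprocessing step, and its single-channel core is compact enough to sketch: subtract the log of a Gaussian-blurred image from the log image, then average over several scales. The color-restoration term of full MSRCR and the paper's exact scale parameters are omitted here; the sigma values and image size are illustrative assumptions.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with numpy only (zero-padded edges)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    k /= k.sum()
    rows = np.apply_along_axis(np.convolve, 1, img, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="same")

def multi_scale_retinex(img, sigmas=(1, 2, 4)):
    """Single-channel Multi-Scale Retinex: average of
    log(I) - log(blur(I)) over several scales.

    Sketch only: the color-restoration factor of full MSRCR is omitted,
    and the sigmas are illustrative, not the paper's settings."""
    img = img + 1e-6                          # avoid log(0)
    msr = sum(np.log(img) - np.log(gaussian_blur(img, s)) for s in sigmas)
    return msr / len(sigmas)

rng = np.random.default_rng(1)
dark = rng.uniform(0.0, 0.1, size=(32, 32))  # simulated low-light channel
out = multi_scale_retinex(dark)
print(out.shape)
```

In practice the MSR output is rescaled back to the display range before being fed to the detector; that normalization step is left out of the sketch.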
Digital Signal Processing, Vol. 174, Article 105940 (2026-01-19).
Citations: 0
Bridging the sensor reality gap: Adaptive learning from implicit degradation priors for low-light image enhancement
IF 3 CAS Zone 3 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2026-01-19 DOI : 10.1016/j.dsp.2026.105934
Tao Cao, Baojian Ren, Zhengyang Zhang, Hongfei Cao, Xinglin Zhang, Shuchen Bai
The acquisition of signals via physical image sensors under low-light conditions constitutes a classic ill-posed inverse problem in digital signal processing. The severe signal-to-noise ratio (SNR) degradation, stemming from stochastic processes like photon shot noise and non-ideal characteristics of the sensor's signal processing pipeline, poses a significant challenge. Conventional supervised restoration algorithms are often constrained by the "sensor reality gap," where models trained on synthetic data fail to generalize to the complex, non-linear degradation profiles of real-world hardware. Meanwhile, unsupervised methods frequently suffer from unstable convergence due to the absence of reliable optimization constraints. To address this fundamental issue, we propose the Adaptive Reality Correction Network (ARC-Net), a novel self-guided refinement framework. Without requiring paired data, ARC-Net formulates the unknown physical sensor corruption as a degradation residual. This residual is iteratively estimated from real-world, unpaired samples and injected back into the training stream as a learned prior through a self-correction loop. This mechanism adaptively forces the network to learn the inverse mapping of authentic sensor artifacts. Furthermore, we introduce stochastic information occlusion as a robust regularization strategy, which enhances the network's ability to reconstruct signals from severely corrupted regions by emulating photon starvation. Extensive experiments demonstrate the state-of-the-art performance of ARC-Net. It not only surpasses the leading supervised method by over 1.4 dB in PSNR on a standard paired dataset but, more critically, it successfully restores fine-grained signal details and color fidelity in extreme real-world scenarios where most contemporary algorithms fail. 
This validates the framework's superiority in addressing complex, authentic signal processing challenges and highlights its significant potential for improving the reliability of sensor-based systems.
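Two ingredients of the evaluation above are easy to make concrete: the stochastic information occlusion regularizer (random zeroed patches emulating photon starvation) and the PSNR metric behind the reported 1.4 dB margin. The masking ratio and patch size below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def stochastic_occlusion(img, rng, ratio=0.25, patch=4):
    """Zero out random patches to emulate photon starvation.

    A sketch of the paper's stochastic information occlusion
    regularizer; `ratio` and `patch` are illustrative choices.
    """
    out = img.copy()
    h, w = img.shape
    n_patches = int(ratio * (h // patch) * (w // patch))
    for _ in range(n_patches):
        y = rng.integers(0, h - patch + 1)
        x = rng.integers(0, w - patch + 1)
        out[y:y + patch, x:x + patch] = 0.0
    return out

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB, the metric used for the
    reported gains (assumes test differs from ref, so MSE > 0)."""
    mse = np.mean((ref - test) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(42)
clean = rng.uniform(0.2, 0.8, size=(32, 32))
masked = stochastic_occlusion(clean, rng)
print(round(float(psnr(clean, masked)), 1), "dB")
```

Training a denoiser to reconstruct `clean` from `masked` inputs is the regularization idea: the network must infer signal in regions that carry no information, which mirrors severely photon-starved sensor readings.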
Digital Signal Processing, Vol. 173, Article 105934 (2026-01-19).
Citations: 0