Ocean information perception based on artificial intelligence is driving innovative advances in comprehensive sea observation. Underwater acoustic communication, the neural link for ocean information interconnection, is susceptible to interference from complex ocean environments and unstable communication links. To address the measurement errors caused by noisy hydroacoustic signals, this paper proposes a tensor low-rank sparse representation by nonconvex regularization (TLSRNR) model for intelligent hydroacoustic recovery. First, the original hydroacoustic tensor, mapped from multidimensional hydroacoustic data, is decomposed into a hydroacoustic sparse tensor and a hydroacoustic target tensor, the latter obtained as the t-product of a hydroacoustic dictionary tensor and a coefficient tensor. Second, a nonconvex penalty function is introduced to reduce the approximation error in the tubal rank of the coefficient tensor, while the inherent deviation of the hydroacoustic sparse tensor is addressed with the smoothly clipped absolute deviation (SCAD) penalty. Third, the alternating direction method of multipliers is employed to solve the proposed TLSRNR model efficiently and recover the hydroacoustic target tensor. The recovery of noisy hydroacoustic data is evaluated for different algorithms in simulation experiments and platform lake trials, which show that the proposed model achieves superior accuracy and robustness.
Title: Intelligent recovery of low-rank sparse tensor for noisy hydroacoustic with use of nonconvex regularization. Authors: Yuhang Mei, Chengming Luo, Jinqing Cao, Zizhuo Liu, Yongshuai Fei, Fantong Kong, Biao Wang. Journal: Digital Signal Processing, Volume 173, Article 105927. Publication date: 2026-01-18. DOI: 10.1016/j.dsp.2026.105927
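The abstract above names the smoothly clipped absolute deviation (SCAD) penalty as the remedy for the bias of the sparse term. The following is a minimal sketch of the scalar SCAD thresholding rule of Fan and Li next to ordinary soft thresholding, assuming the conventional shape parameter a = 3.7. The function names are illustrative, and the sketch is not the paper's tensor update, which the abstract does not spell out.

```python
import math


def soft_threshold(z: float, lam: float) -> float:
    """Proximal operator of the l1 penalty: every value is shrunk by lam."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def scad_threshold(z: float, lam: float, a: float = 3.7) -> float:
    """Scalar SCAD thresholding rule (requires a > 2).

    Small inputs are soft-thresholded, large inputs are returned unchanged,
    and a linear segment joins the two regimes continuously.
    """
    magnitude = abs(z)
    if magnitude <= 2.0 * lam:
        return soft_threshold(z, lam)
    if magnitude <= a * lam:
        return ((a - 1.0) * z - math.copysign(a * lam, z)) / (a - 2.0)
    return z


if __name__ == "__main__":
    for value in (0.5, 1.5, 3.0, 5.0):
        print(value, soft_threshold(value, 1.0), scad_threshold(value, 1.0))
```

With lam = 1, soft thresholding maps 5.0 to 4.0, whereas the SCAD rule leaves it at 5.0. This is the sense in which a nonconvex penalty avoids the constant shrinkage bias of the l1 norm on large entries.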
Pub Date : 2026-01-17 DOI: 10.1016/j.dsp.2026.105925
Wenjing Lu, Zhuang Fang, Liang Wu, Liming Tang, Hanxin Liu, Chuanjiang He
Low-rank matrix completion (LRMC) has achieved remarkable results in low-level vision tasks. LRMC rests on the underlying assumption that real-world matrix data are low-rank. However, when the data do not strictly satisfy the low-rank property, this assumption poses serious challenges for existing matrix recovery methods. Fortunately, there exist feasible schemes that devise appropriate and effective prior representations for describing the intrinsic information of real data. In this paper, we first model the matrix data Y as the sum of a low-rank approximation component X and an approximation error component E. This finer-grained decomposition allows each component to be characterized more precisely. To effectively characterize the structured error, we design an overlapping group error representation (OGER) function, which captures structured sparsity by modeling locally correlated errors. Finally, we develop an efficient optimization algorithm based on the alternating direction method of multipliers (ADMM), which integrates the majorization-minimization (MM) technique to ensure efficient convergence. We also provide a rigorous theoretical analysis, including a detailed proof of the convexity of the OGER function and the convergence guarantees of our algorithm. In addition, numerical experiments demonstrate that the proposed model consistently outperforms existing competing models.
Title: Generalized low-rank matrix completion model with overlapping group error representation. Journal: Digital Signal Processing, Volume 173, Article 105925.
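The abstract above says the error term is modeled with overlapping groups so that locally correlated errors are captured. The sketch below shows why an overlapping group penalty favors clustered errors. It uses a generic one-dimensional overlapping group l2 penalty, not the paper's OGER definition, which the abstract does not give. The group size, the circular boundary handling, and the function name are assumptions made for illustration.

```python
import math


def overlapping_group_penalty(errors, group_size=3):
    """Sum of l2 norms over every sliding window of length group_size.

    Windows wrap around the end of the vector (circular boundary), so each
    entry belongs to exactly group_size overlapping groups.
    """
    n = len(errors)
    total = 0.0
    for start in range(n):
        window = (errors[(start + k) % n] for k in range(group_size))
        total += math.sqrt(sum(v * v for v in window))
    return total


# Two error vectors with the same l1 norm (three unit errors each).
clustered = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # adjacent errors
scattered = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]  # isolated errors

if __name__ == "__main__":
    print(overlapping_group_penalty(clustered))
    print(overlapping_group_penalty(scattered))
```

Both vectors have the same l1 norm, yet the clustered one receives the smaller penalty because neighboring errors share groups. A minimizer of such a penalty therefore prefers error patterns that are locally correlated over isolated spikes.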
Pub Date : 2026-01-17 DOI: 10.1016/j.dsp.2026.105926
Zeyad A.H. Qasem, Xingbin Tu, Chunyi Song, Hamada Esmaiel, Waheb A. Jabbar, Fengzhong Qu
Although orthogonal signal division multiplexing (OSDM) offers improved performance for underwater acoustic communication (UWAC), it still faces two major challenges related to the high peak-to-average power ratio (PAPR) and increased sensitivity to inter-vector interference (IVI). This paper proposes a novel OSDM system, termed precoded unique word OSDM based on unitary neural network (UW-OSDM-UNN), to address these issues effectively. The proposed scheme embeds the guard interval within the fast Fourier transform duration to mitigate inter-symbol interference and employs a UNN-based precoder at the transmitter to reduce PAPR and significantly overcome the IVI sensitivity. The UNN-based transmitter is completely independent of the UWAC channel, eliminating the need for receiver-side training or additional testing-stage training. Furthermore, zero vectors and frequency-shifted Chu sequences are incorporated to enable robust Doppler shift estimation and multipath compensation, respectively. The Chu sequences are inserted in the frequency domain to generate deterministic sequences within the guard interval without introducing additional inter-symbol interference. The system is validated through both simulations and real-world sea trials over a 300-meter underwater connection. Results show that the proposed scheme achieves up to a 4 dB PAPR reduction, a 5 dB improvement in bit error rate (BER), and superior robustness against challenging UWAC channel conditions compared to state-of-the-art OSDM-based systems.
Title: Unique word orthogonal signal division multiplexing with complex unitary neural network for underwater acoustic communication. Journal: Digital Signal Processing, Volume 173, Article 105926.
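Two quantities from the abstract above lend themselves to a short illustration: the peak-to-average power ratio (PAPR) and the Chu sequences used in the guard interval. The sketch below generates a plain Zadoff-Chu sequence of odd length and checks its constant modulus and ideal periodic autocorrelation. It does not reproduce the frequency shifting or the frequency-domain insertion described in the paper, and the function names and the length 63 are illustrative assumptions.

```python
import cmath
import math


def chu_sequence(length: int, root: int = 1):
    """Zadoff-Chu sequence; this sketch assumes odd length and a coprime root."""
    if length % 2 == 0 or math.gcd(root, length) != 1:
        raise ValueError("length must be odd and coprime with root")
    return [
        cmath.exp(-1j * math.pi * root * n * (n + 1) / length)
        for n in range(length)
    ]


def papr_db(samples) -> float:
    """Peak-to-average power ratio of a sample block, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))


def periodic_autocorrelation(seq, lag: int) -> complex:
    """Circular autocorrelation of seq at the given lag."""
    n = len(seq)
    return sum(seq[(i + lag) % n] * seq[i].conjugate() for i in range(n))


if __name__ == "__main__":
    seq = chu_sequence(63)
    print("PAPR of Chu sequence [dB]:", papr_db(seq))
    print("PAPR of a single impulse [dB]:", papr_db([1, 0, 0, 0]))
    print("|autocorrelation| at lag 5:", abs(periodic_autocorrelation(seq, 5)))
```

The sequence has unit magnitude at every sample, so its own PAPR is 0 dB, and its circular autocorrelation vanishes at every nonzero lag. Those two properties are what make it a convenient deterministic sequence for synchronization and channel estimation.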
Pub Date : 2026-01-16 DOI: 10.1016/j.dsp.2026.105892
Nan Chen, Xia Liu, Huafeng Wu
This paper proposes a robust preamble detection algorithm for long-range (LoRa) radio signals. The method integrates the short-time Fourier transform (STFT) with a generalized likelihood ratio test (GLRT), detecting signals via coherent integration along an estimated time-frequency trajectory. First, a binary hypothesis testing framework is established based on the time-frequency characteristics of the LoRa preamble to discriminate between signal and noise. Then, an STFT with optimized window parameters is adopted to extract time-frequency features. To address the uncertainty of the preamble’s starting position, a discrete time-frequency path model is introduced. By exploiting the known linear frequency modulation structure and the optimized window parameters, a discretized grid path is constructed in the time-frequency domain to estimate the signal trajectory. Sliding coherent accumulation is then performed along these paths to form the GLRT statistic. Theoretical analysis shows that the STFT coefficients follow a chi-square distribution under noise-only conditions and a non-central chi-square distribution in the presence of a signal. Based on this, the probability distributions of the coherently accumulated value and the test statistic are derived. Finally, an adaptive threshold computation method is proposed to balance the detection probability against the false alarm rate. Simulations are conducted under various spreading factors, preamble lengths, and carrier frequency offsets. Results indicate that, compared with conventional methods, the proposed GLRT detector improves detection probability by about 25% and synchronization accuracy by about 17% in low-SNR scenarios.
Title: Robust time-frequency preamble detection for LoRa-modulated signals using optimized generalized likelihood ratio test. Journal: Digital Signal Processing, Volume 173, Article 105892.
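The abstract above links the noise-only distribution of the accumulated statistic to the choice of detection threshold. The sketch below shows that link in its simplest form. It assumes the accumulated coefficients are independent, zero-mean, circular complex Gaussian with variance noise_var each, so the squared magnitude of their coherent sum is exponentially distributed (a scaled chi-square with two degrees of freedom). Real STFT bins from overlapping windows are correlated, so this is not the paper's adaptive threshold, and the function names are illustrative.

```python
import math
import random


def noise_only_threshold(noise_var: float, num_bins: int, p_fa: float) -> float:
    """Threshold T such that P(|coherent sum|^2 > T | noise only) = p_fa.

    The sum of num_bins independent CN(0, noise_var) terms is
    CN(0, num_bins * noise_var), so |sum|^2 is exponential with that mean.
    """
    return -num_bins * noise_var * math.log(p_fa)


def empirical_false_alarm(noise_var, num_bins, p_fa, trials=20_000, seed=0):
    """Monte Carlo estimate of the false alarm rate under noise only."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_var / 2.0)  # std of each real/imaginary part
    threshold = noise_only_threshold(noise_var, num_bins, p_fa)
    hits = 0
    for _ in range(trials):
        acc = sum(
            complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(num_bins)
        )
        hits += abs(acc) ** 2 > threshold
    return hits / trials


if __name__ == "__main__":
    print(noise_only_threshold(1.0, 8, 0.05))
    print(empirical_false_alarm(1.0, 8, 0.05))
```

Under these assumptions the simulated false alarm rate should land close to the requested p_fa. Lowering p_fa raises the threshold logarithmically, which is the trade-off between detection probability and false alarms that the threshold design has to balance.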
Wideband DOA estimation has become a significant concern in communication, navigation, and radar systems. Previous approaches employed frequency-domain focusing to alleviate the wideband impact, but they were constrained by their reliance on prior DOA knowledge. Time-domain wideband DOA estimation methods have also been explored, but they often suffer from high-dimensional complexity. This work introduces a time-domain energy focusing (TDEF) scheme that leverages the known waveform, eliminates the reliance on prior DOA information, and reduces the high-dimensional complexity. TDEF consists of multi-channel matched filtering and joint power-peak detection. The former concentrates signal energy in the time domain, while the latter mitigates peak migration induced by the wideband scenario. Through this process, the wideband scenario is transformed into an equivalent narrowband counterpart, enabling the application of narrowband DOA estimation techniques. Using matrix-perturbation analysis, we establish the theoretical asymptotic MSE equivalence between the TDEF scheme and frequency-domain focusing. Numerical simulations show that the TDEF-based method achieves asymptotic performance approaching the CRLB without prior DOA information, improved resolution for closely spaced sources with different TOAs, and lower computational complexity, especially compared to time-domain sparsity-recovery methods.
Title: Wideband DOA estimation based on time-domain energy focusing. Authors: Yuxiang Jiang, Qing Shen, Kejiang Wu, Zexiang Zhang, Chenxi Liao, Shuyuan Xu. Journal: Digital Signal Processing, Volume 173, Article 105903. Publication date: 2026-01-15. DOI: 10.1016/j.dsp.2026.105903
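The matched-filtering step in the abstract above concentrates the energy of a known waveform into a narrow peak in time. The following noise-free sketch shows that step for two channels that receive the same linear FM pulse with different sample delays. The pulse parameters, the delays, and the function names are illustrative assumptions, and the joint power-peak detection across channels described in the paper is not reproduced here.

```python
import cmath
import math


def lfm_chirp(num_samples: int, bandwidth_frac: float = 0.5):
    """Baseband linear FM pulse sweeping bandwidth_frac cycles/sample in total."""
    rate = bandwidth_frac / num_samples
    return [
        cmath.exp(1j * math.pi * rate * (n - num_samples / 2) ** 2)
        for n in range(num_samples)
    ]


def matched_filter(received, waveform):
    """Correlate with the conjugated known waveform; out[d] is the delay-d output."""
    m = len(waveform)
    return [
        sum(received[d + i] * waveform[i].conjugate() for i in range(m))
        for d in range(len(received) - m + 1)
    ]


def peak_delay(received, waveform) -> int:
    """Index of the strongest matched-filter output power."""
    power = [abs(v) ** 2 for v in matched_filter(received, waveform)]
    return max(range(len(power)), key=power.__getitem__)


def delayed(waveform, delay: int, total: int):
    """Zero-padded copy of waveform starting at sample index delay."""
    return [0j] * delay + list(waveform) + [0j] * (total - delay - len(waveform))


if __name__ == "__main__":
    pulse = lfm_chirp(64)
    channels = [delayed(pulse, 17, 111), delayed(pulse, 19, 111)]
    print([peak_delay(ch, pulse) for ch in channels])
```

Each channel peaks at its own delay, so the peak position differs from sensor to sensor. Handling that per-channel shift is the job the abstract assigns to the joint power-peak detection stage.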
Pub Date : 2026-01-15 DOI: 10.1016/j.dsp.2025.105861
Panigrahi Srikanth, Chandan Kumar Behera
Audio-based diagnostics are rapidly emerging as non-invasive and accessible tools for identifying respiratory diseases. Medical acoustic signals such as coughs, breaths, and lung sounds carry clinically relevant information with strong potential for disease detection and monitoring. In this context, we introduce SWaRaA, a novel multi-modal deep learning framework that leverages the complementary characteristics of two distinct types of respiratory sound representations. The framework integrates Mel-spectrogram-based image features and Wav2Vec 2.0 embeddings of medical acoustic signals to enhance classification accuracy by capturing both spectral and contextual information. SWaRaA consists of two parallel processing pathways. The first extracts spectral-temporal features using a proposed lightweight CNN-Transformer model comprising Depth-Wise Separable Convolution (DSC), Parallel Convolution Series (PCS), Serial Convolution Series (SCS), and Transformer blocks (TR). The second processes raw acoustic signals through the Wav2Vec 2.0 model to capture deep contextual and temporal features. These representations are fused through a dedicated integration module and passed to a classification head for final prediction. The proposed framework effectively captures both local and long-range dependencies, enabling robust respiratory disease classification. Through extensive experiments across three benchmark datasets and 15 medical acoustic tasks, we establish SWaRaA as a state-of-the-art multi-modal acoustic classification model, offering a scalable and high-performance solution for real-world healthcare applications.
Title: SWaRaA: A multi-modal deep learning framework for the diagnosis and classification of respiratory diseases using medical acoustic representations. Journal: Digital Signal Processing, Volume 173, Article 105861.
Pub Date : 2026-01-15 DOI: 10.1016/j.dsp.2026.105911
Khalid Abdullah M. Salih, Ismail Amin Ali, Ramadhan J. Mstafa
Efficient video streaming over IP networks faces significant challenges due to packet loss and network congestion, particularly when using User Datagram Protocol (UDP), which lacks inherent error correction mechanisms. This study provides a comprehensive framework for selecting HEVC encoding configurations based on motion content and network conditions. The paper evaluates the packet loss resilience of various HEVC encoding configurations across video content with high-motion, intermediate-motion, and low-motion activity. Utilizing UDP streaming in conjunction with the MPEG Transport Stream (MPEG-TS) container, video quality was quantified using the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) under packet loss rates of up to 1.0%. Three HEVC encoding configurations (IPPP, periodic I, and periodic IDR) were assessed. The results indicate that periodic IDR, with its closed GOP structure, achieves the highest resilience to packet loss, rendering it ideal for unreliable networks. Specifically, for high-motion video content (Crowd_run), periodic IDR limited PSNR degradation to 6.97 dB (from 28.78 dB to 21.87 dB) under a 0.5% packet loss rate. For intermediate-motion content (Mobcal), PSNR decreased by 9.26 dB (from 34.85 dB to 25.23 dB), and for low-motion content (FourPeople), PSNR degraded by 6.96 dB (from 40.87 dB to 33.91 dB), consistently outperforming the other configurations. In contrast, periodic I demonstrated moderate resilience, with PSNR degradation of 9.6 dB for high-motion content, up to 14.36 dB for intermediate-motion content, and approximately 11.46 dB for low-motion content. The IPPP configuration exhibited the greatest vulnerability, with PSNR degradations of 12.66 dB, 18.7 dB, and 11.95 dB for Crowd_run, Mobcal, and FourPeople, respectively, due to extensive error propagation inherent in its open GOP structure. The findings advance the understanding of error resilience in video compression and offer practical guidelines for maximizing video quality in real-world streaming scenarios over lossy IP networks.
Title: Performance of HEVC video coding for delivery over IP networks. Journal: Digital Signal Processing, Volume 173, Article 105911.
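The abstract above reports every result as a PSNR value or a PSNR degradation. As a reminder of how those figures are formed, the sketch below computes PSNR from the mean squared error between a reference and a decoded frame, and takes the degradation as the difference between the loss-free and lossy PSNR. It assumes 8-bit samples (peak value 255) stored as flat sequences. The function names are illustrative and the study's actual measurement tooling is not described in the abstract.

```python
import math


def psnr_db(reference, distorted, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def psnr_degradation(psnr_clean_db: float, psnr_lossy_db: float) -> float:
    """Drop in PSNR caused by packet loss, in dB."""
    return psnr_clean_db - psnr_lossy_db


if __name__ == "__main__":
    print(psnr_db([0, 0, 0, 0], [1, 1, 1, 1]))
    print(round(psnr_degradation(40.87, 33.91), 2))
```

For the FourPeople figures quoted in the abstract, psnr_degradation(40.87, 33.91) gives 6.96 dB after rounding to two decimals. Because PSNR is logarithmic, a drop of about 7 dB corresponds to roughly a fivefold increase in mean squared error.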
Pub Date : 2026-01-14 DOI: 10.1016/j.dsp.2026.105908
Z.M. Kurdoshev, E.A. Pchelintsev
The paper considers the optimal filtering of square integrable signals in Gaussian noise of small intensity. The problem is studied under the condition that the observed process is available only at discrete time moments. This study aims to develop an automated and data-driven model selection procedure (MSP) based on sharp oracle inequalities for optimal estimation of an unknown signal by determining the combination of smoothness parameters that minimizes the mean square error. We propose a novel hybrid neural network architecture that combines statistical estimation theory with deep learning. A dedicated neural MSP layer is designed to generate a wide range of potential parameter combinations. For each combination, a weighted least squares estimate of the signal is calculated. A gating network, inspired by the mixture of experts paradigm, is then used to dynamically select the most accurate estimate from this set of candidates. The entire system is trained on a variety of synthetic datasets of clean and noisy signal pairs containing different waveforms, using the mean square error. The proposed MSP demonstrates high performance over a wide range of noise levels. The mean square error for elementary signals remained below 0.5 even in high-noise scenarios. The method also proved to be robust for complex signal combinations, hybrid waveforms, ECG and CWRU signals, successfully reconstructing them with satisfactory accuracy. The gating network effectively learned to set optimal parameters by continuously selecting values within stable ranges. The developed MSP-NN system provides a robust automated solution for nonparametric signal estimation from noisy discrete observations. It successfully bridges the gap between theoretical statistical efficiency and practical application by automating the important and previously manual step of parameter selection. This work paves the way for the development of intelligent data-driven signal processing systems that can operate reliably in the presence of noise uncertainty.
Title: Model selection method based on the neural networks for signal processing. Journal: Digital Signal Processing, Volume 173, Article 105908.
Pub Date : 2026-01-13 DOI: 10.1016/j.dsp.2026.105906
Haiyi Tong, Dekang Zhu, Zhou Zhang
This paper presents HAIR-GLMB, a Hybrid Appearance and IoU Reinforced Generalized Labeled Multi-Bernoulli (GLMB) filter tailored for multi-target tracking in challenging unmanned aerial vehicle (UAV) scenarios. To address frequent association ambiguities caused by dense target distributions, we propose an adaptive hybrid cost matrix that integrates Intersection-over-Union (IoU) spatial cues with appearance similarity. Specifically, an entropy-based adaptive weighting mechanism dynamically balances spatial and appearance information, thereby enhancing association reliability. We further develop a reinforced likelihood computation within the GLMB recursion, explicitly embedding spatial and appearance information into the update process. A motion-aware adaptive survival probability model is also proposed, effectively sustaining track continuity for inward-moving targets near the boundaries of the camera’s field of view. To improve efficiency, the Gibbs sampler is initialized with an assignment obtained by the Hungarian algorithm on the hybrid cost matrix, placing the Markov chain near high-probability regions and reducing sampling overhead under a limited computational budget. Experiments on challenging UAV benchmarks (VisDrone2019, UAVDT) show that HAIR-GLMB consistently outperforms a GLMB baseline relying only on IoU, yielding higher tracking accuracy, fewer identity switches, and reduced fragmentation.
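The hybrid cost idea can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the formulation used in HAIR-GLMB: the weighting rule (each cue is weighted by one minus the normalized entropy of its similarity row), the helper names `iou`, `row_entropy`, `hybrid_cost` and `best_assignment`, and the use of an exhaustive permutation search in place of the Hungarian algorithm, which keeps the example dependency-free but only scales to a handful of targets.

```python
import math
from itertools import permutations


def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def row_entropy(scores, eps=1e-12):
    """Normalized Shannon entropy in [0, 1] of one row of non-negative scores."""
    total = sum(scores)
    if total <= eps or len(scores) < 2:
        return 1.0  # no usable information: treat the cue as fully ambiguous
    h = -sum((s / total) * math.log(s / total) for s in scores if s > eps)
    return min(1.0, h / math.log(len(scores)))


def hybrid_cost(iou_sim, app_sim):
    """Blend spatial and appearance similarities row by row (hypothetical rule).

    The cue whose row is less ambiguous (lower entropy) receives more weight.
    Returns a cost matrix, where cost = 1 - blended similarity.
    """
    cost = []
    for iou_row, app_row in zip(iou_sim, app_sim):
        c_iou = max(0.0, 1.0 - row_entropy(iou_row))  # confidence of spatial cue
        c_app = max(0.0, 1.0 - row_entropy(app_row))  # confidence of appearance cue
        w = c_iou / (c_iou + c_app) if (c_iou + c_app) > 0 else 0.5
        cost.append([1.0 - (w * s + (1.0 - w) * a) for s, a in zip(iou_row, app_row)])
    return cost


def best_assignment(cost):
    """Minimum-cost track-to-detection assignment for a square cost matrix.

    Exhaustive search, used here as a stand-in for the Hungarian algorithm.
    """
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda p: sum(cost[r][p[r]] for r in range(n)))


# Two tracks whose IoU scores are ambiguous; appearance separates them.
iou_sim = [[0.5, 0.5], [0.5, 0.5]]
app_sim = [[0.9, 0.1], [0.2, 0.8]]
assignment = best_assignment(hybrid_cost(iou_sim, app_sim))
```

In this toy case the IoU rows are uniform, so almost all weight shifts to appearance and track 0 is matched to detection 0. For realistic problem sizes a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` would replace the exhaustive search.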
Title: HAIR-GLMB: Hybrid appearance-IoU reinforced GLMB filter for UAV-based multi-target tracking. Journal: Digital Signal Processing, vol. 173, Article 105906.
Pub Date : 2026-01-13 DOI: 10.1016/j.dsp.2026.105910
Dingli Lou, Tuo Fu, Defeng Chen, Huawei Cao
Dense false target jamming (DFTJ) is a typical form of active jamming that generates numerous false targets along the radar line of sight, significantly degrading the detection and tracking performance of radar systems. In multistatic radar systems with spatially separated receivers, jamming signals originating from the same source become highly correlated across various receivers after compensating for their delay and Doppler frequency differences, whereas true target echoes remain weakly correlated because of varying observation geometries. On the basis of these differences, we propose a method for extracting true target signals from jammed echoes. First, the jamming signals are aligned across different receivers by compensating for their amplitude, delay, and Doppler frequency differences. The compensated and pulse-compressed echoes are then stacked into a signal matrix, where the false targets remain nearly invariant across different columns and thus form a low-rank component, while the true targets exhibit amplitude, delay, and Doppler frequency variations, manifesting as sparse high-rank components. Based on this structural distinction, we formulate a robust principal component analysis problem for extracting the true target signals and solve it using the block coordinate descent approach. To satisfy real-time processing demands, we further develop a sequential processing-based version of the proposed method. The numerical simulation results demonstrate the effectiveness of the proposed method, which shows stable performance under different DFTJ strategies, jamming parameters and target characteristics.
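For orientation, the low-rank plus sparse split described above is commonly written as the generic robust principal component analysis program below. This is the textbook form, given here as an assumption; the abstract does not state the authors' exact objective or update rules. The symbols are illustrative: $\mathbf{X}$ is the stacked matrix of compensated, pulse-compressed echoes, $\mathbf{L}$ is the low-rank false-target component, $\mathbf{S}$ is the sparse true-target component, and $\lambda, \mu > 0$ are trade-off parameters.

```latex
% Constrained form (principal component pursuit)
\min_{\mathbf{L},\,\mathbf{S}} \; \|\mathbf{L}\|_{*} + \lambda \|\mathbf{S}\|_{1}
\quad \text{s.t.} \quad \mathbf{X} = \mathbf{L} + \mathbf{S}

% Penalized form, convenient for block coordinate descent
\min_{\mathbf{L},\,\mathbf{S}} \; \tfrac{1}{2}\,\|\mathbf{X} - \mathbf{L} - \mathbf{S}\|_{F}^{2}
  + \mu \|\mathbf{L}\|_{*} + \lambda \|\mathbf{S}\|_{1}

% Alternating block updates (each block is minimized exactly with the other fixed)
\mathbf{L}^{(k+1)} = \mathcal{D}_{\mu}\!\left(\mathbf{X} - \mathbf{S}^{(k)}\right), \qquad
\mathbf{S}^{(k+1)} = \mathcal{T}_{\lambda}\!\left(\mathbf{X} - \mathbf{L}^{(k+1)}\right)
```

Here $\mathcal{D}_{\mu}$ denotes singular value thresholding (soft-thresholding the singular values by $\mu$) and $\mathcal{T}_{\lambda}$ denotes entrywise soft-thresholding by $\lambda$. For complex-valued echoes the soft-threshold shrinks the modulus of each entry and keeps its phase.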
Title: A true target signal extraction method for defending against dense false target jamming in multistatic radar systems. Journal: Digital Signal Processing, vol. 173, Article 105910.