Pub Date: 2024-10-14 DOI: 10.1109/LSP.2024.3479918
T. Averty;A. O. Boudraa;D. Daré-Emzivat
The most natural matrices for encoding information about a graph are the adjacency and Laplacian matrices. These algebraic representations underpin the fundamental concepts and tools of graph signal processing, even though they reveal information in different ways. Furthermore, in spectral graph classification the problem of cospectrality may arise, and these matrices do not handle it well. Thus, the question of finding the best graph representation matrix remains open. In this letter, a new family of representation matrices is introduced that captures graph information well and allows the standard representation matrices to be recovered. This unified family extends recent work in the literature. Two properties are proven: the positive semidefiniteness of the matrices and the monotonicity of their eigenvalues. Experimental results on spectral graph classification highlight the potential and added value of this new family, and show that the best representation depends on the structure of the underlying graph.
A New Family of Graph Representation Matrices: Application to Graph and Signal Classification. IEEE Signal Processing Letters, vol. 31, pp. 2935-2939.
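The cospectrality problem mentioned in the abstract above can be made concrete with a small sketch. It does not implement the letter's new family of matrices, which the abstract does not define; it only compares the two standard representations on a classic textbook pair of graphs (the star on five vertices, and a 4-cycle plus an isolated vertex). The helper names `adjacency`, `laplacian`, and `spectrum` are illustrative assumptions.

```python
import numpy as np

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph on n vertices."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return A

def laplacian(A):
    """Combinatorial Laplacian L = D - A, with D the diagonal degree matrix."""
    return np.diag(A.sum(axis=1)) - A

def spectrum(M):
    """Eigenvalues of a symmetric matrix in ascending order."""
    return np.linalg.eigvalsh(M)

# Star graph K_{1,4}: vertex 0 is connected to every other vertex.
star = adjacency(5, [(0, 1), (0, 2), (0, 3), (0, 4)])
# 4-cycle on vertices 0..3, with vertex 4 left isolated.
cycle_plus_isolated = adjacency(5, [(0, 1), (1, 2), (2, 3), (3, 0)])

# The two graphs are not isomorphic, yet their adjacency spectra coincide.
adj_cospectral = np.allclose(spectrum(star), spectrum(cycle_plus_isolated))
# Their Laplacian spectra differ, so the Laplacian separates this pair.
lap_cospectral = np.allclose(
    spectrum(laplacian(star)), spectrum(laplacian(cycle_plus_isolated))
)
# The Laplacian is positive semidefinite: its smallest eigenvalue is 0.
lap_min_eig = spectrum(laplacian(star)).min()

print(adj_cospectral, lap_cospectral)
```

On this pair the adjacency matrix cannot tell the graphs apart while the Laplacian can, which is one way to see why the choice of representation matrix may depend on the graph structure.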
Pub Date: 2024-10-14 DOI: 10.1109/LSP.2024.3479934
Penghui Lai;Yaru Shan;Fanggang Wang;Shilian Wang;Peiguo Liu
Orthogonal time frequency space (OTFS) modulation is resilient to the Doppler effect and is therefore employed in high-mobility communications. Earlier work established equivalent representations of OTFS transmission, revealing the profound impact of fractional delay on channel sparsity. However, these representations tend to be distorted and cumbersome when accounting for digital reception and non-ideal filtering. In this letter, we propose an alternative low-distortion representation that uses a variable-fractional-delay filter to characterize the fractional delays in a conventional digital transceiver. The proposed method improves the approximation of the received signal and enhances the error performance compared with the original approaches. Simulations confirm the validity of the proposed representation.
Approximation of Non-Ideal Filtering in OTFS via Variable Fractional Delay. IEEE Signal Processing Letters, vol. 31, pp. 2990-2994.
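To give a feel for what a fractional-delay filter does, here is a minimal sketch using the generic Lagrange-interpolation FIR design. This is an assumption chosen for illustration: the abstract above does not say which variable-fractional-delay design the letter uses, and the function names `lagrange_fractional_delay` and `apply_delay` are hypothetical.

```python
import numpy as np

def lagrange_fractional_delay(delay, order):
    """FIR taps h[0..order] approximating a delay of `delay` samples.

    Standard Lagrange design: h[k] = prod_{m != k} (delay - m) / (k - m).
    The approximation is most accurate when delay is near order / 2.
    """
    taps = np.ones(order + 1)
    for k in range(order + 1):
        for m in range(order + 1):
            if m != k:
                taps[k] *= (delay - m) / (k - m)
    return taps

def apply_delay(signal, delay, order=3):
    """Filter `signal` with the fractional-delay taps, keeping the input length."""
    taps = lagrange_fractional_delay(delay, order)
    return np.convolve(signal, taps)[: len(signal)]

# A ramp delayed by 1.5 samples: once the filter is fully loaded (n >= order),
# the output equals the ramp shifted by exactly 1.5.
ramp = np.arange(10.0)
delayed = apply_delay(ramp, 1.5)
print(delayed[3:])
```

Because the taps depend on `delay` only through a closed-form polynomial, the same structure can be re-evaluated for each path delay, which is the property that makes a filter "variable".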
Pub Date: 2024-10-10 DOI: 10.1109/LSP.2024.3478211
Yiya Hao;Feifei Xiong;Bei Li;Nai Ding;Jinwei Feng
We present a neural speech quality assessment model with speaker embedding. This model, i.e., EMDSQA, can precisely predict the Mean Opinion Score (MOS) of speech quality during online communications. Intrusive speech quality assessment methods such as perceptual objective listening quality analysis (POLQA) are not practical for online communications because every piece of degraded speech requires a corresponding clean reference. Non-intrusive methods can assess the quality of online speech, but have not reached the accuracy and robustness required for real-world applications. EMDSQA extracts the speaker embedding using an independent pipeline and feeds it as a prior feature to a self-attention-based MOS prediction model. Since EMDSQA does not need the corresponding clean reference, it is practical for real-world communication applications. An open-source test corpus, featuring real-world data, was also developed. Experimental results show that EMDSQA achieves a 0.92 Pearson correlation coefficient with the MOS measured from humans, surpassing other state-of-the-art intrusive or non-intrusive methods.
EMDSQA: A Neural Speech Quality Assessment Model With Speaker Embedding. IEEE Signal Processing Letters, vol. 31, pp. 3064-3068. Open access: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10713506
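The headline figure in the abstract above is a Pearson correlation coefficient between predicted and human-rated MOS. The following sketch shows how that metric is computed; the function name `pearson` and the score values are made up for illustration and are not taken from the paper or its test corpus.

```python
import numpy as np

def pearson(predicted, human):
    """Pearson correlation coefficient between two equal-length score lists."""
    p = np.asarray(predicted, dtype=float)
    h = np.asarray(human, dtype=float)
    pc = p - p.mean()  # centre both series
    hc = h - h.mean()
    return float((pc @ hc) / np.sqrt((pc @ pc) * (hc @ hc)))

# Hypothetical MOS values on the usual 1-to-5 scale.
predicted_mos = [4.1, 3.2, 2.5, 4.6, 1.9]
human_mos = [4.3, 3.0, 2.2, 4.5, 2.1]
print(pearson(predicted_mos, human_mos))
```

A value near 1 means the predictions rank and scale with the human ratings almost linearly; the metric is insensitive to a constant offset or gain in the predictions.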
Pub Date: 2024-10-10 DOI: 10.1109/LSP.2024.3478109
Fan Wang;Zhangjie Fu;Xiang Zhang;Junjie Lu
Adversarial embedding for image steganography is a novel technique that effectively enhances the security of traditional steganographic algorithms. However, existing schemes leave room for improvement in the design of the optimization strategy and in the steganographic post-processing applied when optimization fails. In this paper, we design a progressive probability optimizing (PPO) strategy that dynamically selects more efficient gradients to guide the probability optimization in a progressive manner. Moreover, we propose a discarded stego recycling (DSR) mechanism that, after the optimization fails, re-selects the stego from the discarded stego set, whose members failed to deceive the target steganalyzer. In this way, the statistical distribution of the stego can still approximate that of the cover more closely, further improving steganographic security against re-trained steganalyzers in the adversary-aware scenario. Comprehensive experiments show that, compared with existing advanced schemes, the proposed method improves security against both re-trained hand-crafted feature-based and deep learning-based steganalysis models.
Adversarial Embedding Steganography via Progressive Probability Optimizing and Discarded Stego Recycling. IEEE Signal Processing Letters, vol. 31, pp. 2920-2924.
Within the domain of multimodal communication, the compression of audio, image, and video information is well-established, but compressing haptic signals, including vibrotactile signals, remains challenging. Particularly with the enhancement of haptic signal sampling rate and degrees of freedom, there is a substantial increase in data volume. While existing algorithms have made progress in vibrotactile codecs, there remains significant room for improvement in compression ratios. We propose an innovative Nbeats Network-based Vibrotactile Codec (NNVC) that leverages the statistical characteristics of vibrotactile data. This advanced codec integrates the Nbeats network for precise vibrotactile prediction, residual quantization, efficient Run-Length Encoding, and Huffman coding. The algorithm not only captures the intricate details of vibrotactile signals but also ensures high-efficiency data compression. It exhibits robust overall performance in terms of Signal-to-Noise Ratio (SNR) and Peak Signal-to-Noise Ratio (PSNR), significantly surpassing the state-of-the-art.
Efficient Vibrotactile Codec Based on Nbeats Network. Yiwen Xu;Dongfang Chen;Ying Fang;Yang Lu;Tiesong Zhao. IEEE Signal Processing Letters, vol. 31, pp. 2845-2849. Pub Date: 2024-10-09. DOI: 10.1109/LSP.2024.3477251
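The pipeline named in the abstract above (prediction, residual quantization, run-length encoding, entropy coding) can be sketched in a few lines. This is only a toy under stated assumptions: a previous-sample predictor stands in for the Nbeats network, the Huffman stage is omitted, and all function names and parameter values are hypothetical rather than taken from NNVC.

```python
import numpy as np

def encode(signal, step):
    """Closed-loop predictive coding: quantize the residual against the
    previously reconstructed sample (a stand-in for a learned predictor)."""
    symbols, prev = [], 0.0
    for x in signal:
        q = int(round((x - prev) / step))  # quantized prediction residual
        symbols.append(q)
        prev = prev + q * step             # track what the decoder will see
    return symbols

def decode(symbols, step):
    """Rebuild the signal by accumulating the dequantized residuals."""
    out, prev = [], 0.0
    for q in symbols:
        prev = prev + q * step
        out.append(prev)
    return np.array(out)

def run_length_encode(symbols):
    """Collapse consecutive repeats into (symbol, count) pairs."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return [(s, n) for s, n in runs]

def run_length_decode(runs):
    """Expand (symbol, count) pairs back into the symbol sequence."""
    return [s for s, n in runs for _ in range(n)]

step = 0.05
t = np.arange(200) / 1000.0
signal = 0.5 * np.sin(2 * np.pi * 5 * t)  # slowly varying test tone

symbols = encode(signal, step)
runs = run_length_encode(symbols)
restored = decode(run_length_decode(runs), step)
print(len(symbols), len(runs), np.max(np.abs(signal - restored)))
```

Because the encoder predicts from reconstructed samples rather than the originals, the quantization error does not accumulate and stays within half a quantization step. A better predictor leaves smaller, more repetitive residuals, which is what makes the run-length and entropy stages effective.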
Pub Date: 2024-10-09 DOI: 10.1109/LSP.2024.3477254
Zhuo Chen;Xiaoming Niu;Jian Ding;Hong Wu;Zhiyang Liu
Orthogonal time frequency space (OTFS) has emerged as a promising modulation waveform for high-mobility wireless communications owing to its robustness against Doppler effects. However, because the frame duration is limited, fractional Doppler shifts appear, which makes channel estimation in OTFS systems challenging. In this letter, we formulate channel estimation as a block-sparse signal recovery problem and propose an adaptive pattern-coupled sparse Bayesian learning (APCSBL) method. Specifically, we introduce a pattern-coupled hierarchical Gaussian prior model to characterize the dependencies among adjacent channel coefficients. On this basis, an adaptive hyperparameter strategy is presented, in which distinct coupling parameters are used to further characterize the strength of the correlation between adjacent elements. We then use the expectation maximization (EM) algorithm to update the hidden variables and the channel vector. Simulation results demonstrate that the proposed algorithm outperforms existing methods and works in various environments.
Adaptive Pattern-Coupled Sparse Bayesian Learning for Channel Estimation in OTFS Systems. IEEE Signal Processing Letters, vol. 31, pp. 2895-2899.
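For orientation, the pattern-coupled hierarchical Gaussian prior referred to in the abstract above is usually written, in the wider sparse Bayesian learning literature, in the form below. This is the conventional fixed-coupling version, given here as an assumption about the starting point; the letter's exact model and its adaptive coupling parameters are not stated in the abstract. Here $x_i$ is the $i$-th coefficient, $\alpha_i$ its precision hyperparameter, and $\beta \in [0, 1]$ a single coupling parameter.

```latex
p(\mathbf{x} \mid \boldsymbol{\alpha})
  = \prod_{i=1}^{n} \mathcal{N}\!\left( x_i \mid 0,\;
      \left( \alpha_i + \beta\,\alpha_{i-1} + \beta\,\alpha_{i+1} \right)^{-1} \right),
  \qquad \alpha_0 = \alpha_{n+1} = 0 .
```

With $\beta = 0$ this reduces to the ordinary sparse Bayesian learning prior with independent coefficients. With $\beta > 0$, a large precision at one index also shrinks its neighbours, which encourages block-sparse solutions. An adaptive scheme, as described in the abstract, would let the coupling strength vary rather than fixing one $\beta$ for all neighbours.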
Pub Date: 2024-10-09 DOI: 10.1109/LSP.2024.3477298
Haoqian Wang;Zhongyang Xing;Zhongjie Xu;Xiangai Cheng;Teng Li
In this study, we examine poor edge reconstruction in image super-resolution (SR) tasks, emphasizing the importance of enhancing the edge details identified through visual analysis. Existing SR networks typically optimize their architectures to enable complete feature extraction from feature maps, because the handling of spatial and channel information during SR is often pivotal to a network's feature extraction capacity. Despite continuous improvements, directly comparing SR and high-resolution (HR) images through differential mapping reveals that these methods perform suboptimally in edge reconstruction. In this paper, we introduce an edge-aware attention transformer (EAT), which focuses on edge reconstruction while still retrieving the original low-frequency information effectively. Our framework uses deformable convolution (DC) to adaptively extract edge features, and then applies feature enhancement techniques to intensify edge-sensitive features. Extensive experiments demonstrate that EAT achieves exceptional quantitative and visual results, surpassing most benchmarks and validating its effectiveness against state-of-the-art models. The code is available at https://github.com/ImWangHaoqian/EAT