
IEEE Signal Processing Letters: Latest Articles

Enhancing Multi-Label Deep Hashing for Image and Audio With Joint Internal Global Loss Constraints and Large Vision-Language Model
IF 3.2 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-06 DOI: 10.1109/LSP.2024.3455991
Ye Liu;Yan Pan;Jian Yin
Deep hashing algorithms can transform high-dimensional features into low-dimensional hash codes, which can reduce storage space and improve computational efficiency in traditional information retrieval (IR) and large-model-related retrieval-augmented generation (RAG) scenarios. In recent years, pre-trained convolutional or transformer networks have commonly been chosen as the backbone in deep hashing frameworks. This involves incorporating local loss constraints among training samples and then fine-tuning the model to generate hash codes. Because the constraints among training samples carry relatively limited local information, we propose a novel anchor constraint and structural constraint as internal global loss constraints with the vision transformer network, and augment external information by integrating a large vision-language model, thereby enhancing the performance of hash code generation. Additionally, to enhance the scalability of the deep hashing framework, we incorporate an adapter module to extend its application from the image domain to the audio domain. Comparative experiments and ablation analysis on various image and audio datasets confirm that the proposed method achieves state-of-the-art retrieval results.
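The storage and speed benefit of hash codes mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common practice of binarizing real-valued network outputs by sign and ranking database items by Hamming distance; the function names are hypothetical and the paper's anchor and structural constraints are not reproduced here.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> int:
    """Pack sign(features) into an integer: bit i is 1 when features[i] > 0."""
    code = 0
    for i, value in enumerate(features):
        if value > 0:
            code |= 1 << i
    return code


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two packed hash codes."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: Sequence[int]) -> List[int]:
    """Return database indices sorted from nearest to farthest code."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))
```

Each item is stored as a handful of bits and compared with one XOR plus a bit count, which is where the savings over floating-point feature vectors come from.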
Citations: 0
Spatio-Temporal Multi-Image Reflection Removal
IF 3.2 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-06 DOI: 10.1109/LSP.2024.3456006
Xinxin Zhang;Wenjing Shang;Qiangchang Wang;Yongshun Gong;Qifang Liu
In this letter, we propose a precise algorithm to eliminate reflections from two images by utilizing temporal and spatial priors. For the temporal prior, we compute the motion information between reflection layers in the two input reflection-contaminated images. Unlike many popular multi-image reflection removal methods, our proposed algorithm does not assume that the two input images are captured under similar lighting conditions and the same camera settings. Furthermore, the proposed algorithm is robust to differences between the two reflection layers, such as moving objects and different reflections. For the spatial term, a sparsity gradient regularization is adopted to enforce the spatial smoothness of the transmission and reflection layers. Importantly, the proposed algorithm does not rely on additional training data or high-performance computing devices. Experimental results on both synthetic images and real-world photographs demonstrate that the proposed algorithm achieves state-of-the-art performance.
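The spatial term can be pictured with a small sketch. It assumes the regularizer is an L1 penalty on forward-difference image gradients, which is one common form of gradient sparsity; the abstract does not give the exact formula, and the names and weights below are hypothetical.

```python
from typing import List

Image = List[List[float]]


def gradient_l1(image: Image) -> float:
    """Sum of absolute horizontal and vertical forward differences."""
    height, width = len(image), len(image[0])
    total = 0.0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                total += abs(image[y][x + 1] - image[y][x])
            if y + 1 < height:
                total += abs(image[y + 1][x] - image[y][x])
    return total


def sparsity_prior(transmission: Image, reflection: Image,
                   weight_t: float = 1.0, weight_r: float = 1.0) -> float:
    """Weighted gradient-sparsity penalty on both layers (illustrative weights)."""
    return weight_t * gradient_l1(transmission) + weight_r * gradient_l1(reflection)
```

A flat layer costs nothing under this penalty, while every edge adds to it, so minimizing it favors piecewise-smooth layers.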
Citations: 0
HyperBT: Redundancy Reduction-Based Self-Supervised Learning for Hyperspectral Image Classification
IF 3.2 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-05 DOI: 10.1109/LSP.2024.3455234
Jinhui Li;Xiaorun Li;Shuhan Chen
Self-supervised learning effectively leverages the information in unlabeled data to extract spatial-spectral features that are both representative and discriminative, partially addressing the challenge of high data annotation costs in hyperspectral image classification (HSIC). Inspired by the success of redundancy reduction-based self-supervised learning in other domains, we introduce it into HSIC. We propose a spatial-spectral feature extraction network, HyperBT, to reduce redundancy more effectively. Specifically, we add the off-diagonal terms of the cross-covariance matrix to the loss function and introduce new data augmentation methods, including band bisection and edge weakening. Experimental results demonstrate that our method achieves high classification accuracy, surpassing many state-of-the-art methods. Through ablation experiments, we validate the effectiveness of each component of the loss function.
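As a rough illustration of redundancy reduction, the sketch below implements a Barlow Twins-style objective on two batches of embeddings from augmented views. This is an assumption about the general family of losses the abstract refers to, not HyperBT's exact loss; the function names and the weight `lam` are illustrative.

```python
import math
from typing import List

Matrix = List[List[float]]


def _standardize(z: Matrix) -> Matrix:
    """Zero-mean, unit-variance each embedding dimension across the batch."""
    n, d = len(z), len(z[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in z]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        for i in range(n):
            out[i][j] = (z[i][j] - mean) / std
    return out


def cross_correlation(za: Matrix, zb: Matrix) -> Matrix:
    """d x d cross-correlation between the two views' embedding dimensions."""
    a, b = _standardize(za), _standardize(zb)
    n, d = len(a), len(a[0])
    return [[sum(a[k][i] * b[k][j] for k in range(n)) / n for j in range(d)]
            for i in range(d)]


def redundancy_reduction_loss(za: Matrix, zb: Matrix, lam: float = 0.005) -> float:
    """Pull diagonal terms toward 1 and penalize off-diagonal terms."""
    c = cross_correlation(za, zb)
    d = len(c)
    on_diag = sum((1.0 - c[i][i]) ** 2 for i in range(d))
    off_diag = sum(c[i][j] ** 2 for i in range(d) for j in range(d) if i != j)
    return on_diag + lam * off_diag
```

The diagonal term makes the two views agree dimension by dimension, and the off-diagonal term discourages different dimensions from encoding the same information.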
Citations: 0
Progressive Mask Transformer With Edge Enhancement for Image Manipulation Localization
IF 3.2 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-05 DOI: 10.1109/LSP.2024.3455230
Ye Zhu;Jian Liu;Yang Yu;Yingchun Guo;Xiaoke Hao
Recent developments in image editing techniques have given rise to serious challenges to the credibility of multimedia data. Although some deep learning methods have achieved impressive results, they often fail to detect subtle edge artefacts, and current mainstream methods focus mainly on the foreground content and ignore the background content, which also contains abundant information related to manipulation. To address this issue, this letter proposes a progressive mask transformer with an edge enhancement network for image manipulation localization. Specifically, an edge enhancement flow is introduced to detect subtle manipulated edge artefacts and guide the localization of manipulated regions. Then, the manipulated, genuine and global features are progressively refined using a progressive mask transformer module. We perform extensive experiments on the NIST16, Coverage, CASIA and IMD20 datasets to verify the effectiveness of our method, and the results demonstrate that the proposed method outperforms state-of-the-art methods by a wide margin on commonly used evaluation metrics.
Citations: 0
Estimating Channels With Hundreds of Sub-Paths for MU-MIMO Uplink: A Structured High-Rank Tensor Approach
IF 3.2 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-03 DOI: 10.1109/LSP.2024.3453655
Panqi Chen;Lei Cheng
This letter introduces a structured high-rank tensor approach for estimating sub-6G uplink channels in multi-user multiple-input and multiple-output (MU-MIMO) systems. To tackle the difficulty of channel estimation in sub-6G bands with hundreds of sub-paths, our approach fully exploits the physical structure of the channel and establishes the link between the sub-6G channel model and a high-rank four-dimensional (4D) tensor Canonical Polyadic Decomposition (CPD) in which three factor matrices are Vandermonde-constrained. Accordingly, a stronger uniqueness property is derived in this work. This model supports an efficient one-pass algorithm for estimating sub-path parameters, which ensures plug-in compatibility with the widely used baseline. Our method performs much better than state-of-the-art tensor-based techniques in simulations adhering to the 3GPP-R18 5G protocols.
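To make the data model concrete, the sketch below builds Vandermonde factor matrices and evaluates one entry of a generic CPD tensor, where each rank-one term stands for one sub-path. It is a generic illustration of a Vandermonde-constrained CPD under assumed names, not the letter's estimation algorithm.

```python
import cmath
from typing import List, Sequence

ComplexMatrix = List[List[complex]]


def vandermonde(generators: Sequence[complex], rows: int) -> ComplexMatrix:
    """Matrix whose column r is [1, z_r, z_r**2, ...] for generator z_r."""
    return [[z ** m for z in generators] for m in range(rows)]


def cpd_entry(factors: Sequence[ComplexMatrix], weights: Sequence[complex],
              index: Sequence[int]) -> complex:
    """Entry sum_r weights[r] * prod_n factors[n][index[n]][r] of a CPD tensor."""
    total = 0j
    for r, weight in enumerate(weights):
        term = complex(weight)
        for factor, i in zip(factors, index):
            term *= factor[i][r]
        total += term
    return total
```

The Vandermonde structure means consecutive rows of a factor differ only by multiplication with the generators, which is the kind of shift invariance that structured CPD methods typically exploit.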
Citations: 0
Time-Frequency Analysis of Variable-Length WiFi CSI Signals for Person Re-Identification
IF 3.2 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-03 DOI: 10.1109/LSP.2024.3453201
Chen Mao;Chong Tan;Jingqi Hu;Min Zheng
Person re-identification (ReID), as a crucial technology in the field of security, plays an important role in security detection and people counting. Current security and monitoring systems largely rely on visual information, which may infringe on personal privacy and be susceptible to interference from pedestrian appearances and clothing in certain scenarios. Meanwhile, the widespread use of routers offers new possibilities for ReID. This letter introduces a method using WiFi Channel State Information (CSI), leveraging the multipath propagation characteristics of WiFi signals as a basis for distinguishing different pedestrian features. We propose a two-stream network structure capable of processing variable-length data, which analyzes the amplitude in the time domain and the phase in the frequency domain of WiFi signals, fuses time-frequency information through continuous lateral connections, and employs advanced objective functions for representation and metric learning. Tested on a dataset collected in the real world, our method achieves 93.68% mAP and 98.13% Rank-1.
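The two inputs to the two-stream network, amplitude and phase, can be extracted from complex CSI samples as in the sketch below. It assumes one packet is given as a list of complex per-subcarrier values and uses a simple phase unwrapping step; the function names are hypothetical and the network itself is not shown.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def unwrap(phases: Sequence[float]) -> List[float]:
    """Remove 2*pi jumps between consecutive phase samples."""
    if not phases:
        return []
    out = [phases[0]]
    offset = 0.0
    for prev, cur in zip(phases, phases[1:]):
        delta = cur - prev
        if delta > math.pi:
            offset -= 2 * math.pi
        elif delta < -math.pi:
            offset += 2 * math.pi
        out.append(cur + offset)
    return out


def amplitude_and_phase(csi: Sequence[complex]) -> Tuple[List[float], List[float]]:
    """Split one packet of complex CSI values into amplitude and unwrapped phase."""
    amplitudes = [abs(h) for h in csi]
    phases = unwrap([cmath.phase(h) for h in csi])
    return amplitudes, phases
```

Applying this per packet yields an amplitude sequence over time and a phase profile over subcarriers, whatever the number of packets in the recording.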
Citations: 0
Performance Analysis of RIS-Assisted Coded Cooperation System Based on Polar Codes With Finite Code Length
IF 3.2 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-03 DOI: 10.1109/LSP.2024.3453662
Yan Pan;Shunwai Zhang
In this letter, we propose a novel reconfigurable intelligent surface (RIS)-assisted coded cooperation system based on polar codes to pursue ultra-reliable, global-coverage transmission. Firstly, we establish the RIS-assisted coded cooperation system based on polar codes. The polar codes employed at the source and relay are jointly designed using the Plotkin construction method, and joint decoding is performed at the destination. Subsequently, we derive the theoretical expressions for ergodic capacity (EC) under the Nakagami-$m$ fading channel model using the central limit theorem (CLT) and the Gamma approximation (GA). Specifically, the closed-form expression for EC with finite code length is obtained by approximating its tight upper and lower bounds. Finally, theoretical analysis and simulation results demonstrate the superiority of the proposed system over existing schemes, and also reveal that the EC of the proposed system with finite code length approaches the ideal EC as the code length of the polar codes increases.
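The Plotkin construction behind polar codes can be sketched as a recursive (u + v | v) combination over GF(2). The function below is a generic polar transform without bit-reversal or frozen-bit selection; it is an illustration under those assumptions and does not reproduce the letter's joint source and relay design.

```python
from typing import List


def polar_transform(u: List[int]) -> List[int]:
    """Apply x = u * F^(kron n) over GF(2) via the Plotkin (u + v | v) recursion."""
    n = len(u)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(u)
    half = n // 2
    upper = polar_transform(u[:half])   # plays the role of u
    lower = polar_transform(u[half:])   # plays the role of v
    # Plotkin combination: first half is u XOR v, second half is v.
    return [a ^ b for a, b in zip(upper, lower)] + lower
```

Because the kernel matrix is its own inverse over GF(2), applying the transform twice returns the original bits, which gives a quick sanity check.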
Citations: 0
Performance Analysis of Hammerstein Block-Oriented Functional Link Adaptive Filters
IF 3.2 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-03 DOI: 10.1109/LSP.2024.3453663
Pavankumar Ganjimala;Vinay Chakravarthi Gogineni;Subrahmanyam Mula
Nonlinear adaptive filters (NAFs) exhibit superior modeling capabilities compared to conventional linear adaptive filters, especially in practical applications involving nonlinear input-output relationships. The functional link adaptive filter (FLAF) is an NAF that uses nonlinear functional expansions to achieve nonlinear modelling, albeit at the expense of high computational complexity. In response, a low-complexity Hammerstein-type block-oriented functional link adaptive filter (HBO-FLAF) was recently developed, which requires less computation than the traditional FLAF. To shed more light on its behaviour and design, we provide a steady-state theoretical analysis of the HBO-FLAF in this paper. We derive the conditions for steady-state mean and mean-square convergence of the weight update equations, specifically an upper bound on the step-size parameter, an expression for the steady-state excess mean square error (EMSE), and a lower bound on the steady-state EMSE of the HBO-FLAF. Numerical simulation results closely match the derived results, thus validating the theoretical analysis.
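A Hammerstein structure places a memoryless nonlinearity in front of a linear filter. The sketch below assumes a three-term trigonometric functional expansion and plain LMS updates for both blocks; the class name, expansion, and step sizes are hypothetical, and the HBO-FLAF's exact update equations and analysis are not reproduced.

```python
import math
from typing import List, Tuple


def expand(x: float) -> List[float]:
    """Illustrative functional expansion of one input sample."""
    return [x, math.sin(math.pi * x), math.cos(math.pi * x)]


class HammersteinFilter:
    """Memoryless expansion (weights a) followed by an FIR filter (weights w)."""

    def __init__(self, taps: int, mu_w: float = 0.01, mu_a: float = 0.01) -> None:
        self.a = [1.0, 0.0, 0.0]          # starts as the identity map s = x
        self.w = [0.0] * taps
        self.mu_w, self.mu_a = mu_w, mu_a
        self.history = [[0.0, 0.0, 0.0] for _ in range(taps)]

    def step(self, x: float, d: float) -> Tuple[float, float]:
        """Filter one sample, adapt both blocks, return (output, a priori error)."""
        self.history = [expand(x)] + self.history[:-1]
        s = [sum(ai * gi for ai, gi in zip(self.a, g)) for g in self.history]
        y = sum(wk * sk for wk, sk in zip(self.w, s))
        e = d - y
        # Gradient of y with respect to a, evaluated with the current w.
        grad_a = [sum(wk * g[i] for wk, g in zip(self.w, self.history))
                  for i in range(len(self.a))]
        self.w = [wk + self.mu_w * e * sk for wk, sk in zip(self.w, s)]
        self.a = [ai + self.mu_a * e * gi for ai, gi in zip(self.a, grad_a)]
        return y, e
```

The step sizes control the trade-off between convergence speed and steady-state error, which is the kind of quantity the letter's bounds address.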
Citations: 0
Conversational Short-Phrase Speaker Diarization via Self-Adjusting Speech Segmentation and Embedding Extraction
IF 3.2 Zone 2 (Engineering & Technology) Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-03 DOI: 10.1109/LSP.2024.3453772
Haitian Lu;Gaofeng Cheng;Yonghong Yan
Conversational short-phrase speaker diarization focuses on diarizing phrases that are short in duration. Nonetheless, conventional speaker diarization systems fail to give enough importance to conversational short phrases. This letter proposes a novel speaker diarization system to address this issue. Firstly, we employ an RNN-T model for joint speech recognition and speaker change detection. The speech recognition results can be utilized directly in downstream tasks, while the speaker change points serve as guidance for the following steps. Secondly, we introduce self-adjusting speech segmentation, which dynamically adjusts segment lengths based on the temporal distribution of speaker change points. Thirdly, we introduce self-adjusting embedding extraction, which employs speaker encoders trained under different speech duration conditions by projecting them to the same embedding space. Our method achieves a major reduction in Diarization Error Rate (DER) and Conversational Diarization Error Rate (CDER) on the MagicData-RAMC and Mixer 6 datasets.
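One way to picture segmentation driven by speaker change points is sketched below. It assumes a simple rule: cut at every detected change point, keep short single-speaker regions whole, and split long regions into equal windows no longer than `max_len` seconds. The rule, names, and default value are hypothetical and may differ from the letter's procedure.

```python
import math
from typing import List, Sequence, Tuple


def segment(duration: float, change_points: Sequence[float],
            max_len: float = 1.5) -> List[Tuple[float, float]]:
    """Split [0, duration] at change points, then cap each piece at max_len."""
    inner = sorted(p for p in change_points if 0.0 < p < duration)
    bounds = [0.0] + inner + [duration]
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        # Short regions stay whole; long regions get equal-length windows.
        pieces = max(1, math.ceil((end - start) / max_len))
        step = (end - start) / pieces
        for k in range(pieces):
            segments.append((start + k * step, start + (k + 1) * step))
    return segments
```

With this rule, a short phrase between two change points becomes its own segment instead of being merged into a fixed-length window that spans two speakers.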
Citations: 0
Exploiting Generative Diffusion Prior With Latent Low-Rank Regularization for Image Inpainting
IF 3.2 Zone 2, Engineering Technology Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2024-09-03 DOI: 10.1109/LSP.2024.3453665
Zhentao Zou;Lin Chen;Xue Jiang;Abdelhak M. Zoubir
Generative diffusion models have recently shown impressive results in image restoration. However, the noise predicted by existing diffusion-based methods may be inaccurate, especially when the noise amplitude is small, which leads to sub-optimal results. In this letter, an unsupervised diffusion model with latent low-rank regularization is proposed to alleviate this problem. In particular, we first create a latent low-rank space for each degraded image using self-supervised learning, from which we derive the corresponding latent low-rank regularization. This regularization, combined with the observed prior information and a smoothness regularization, guides the reverse sampling process, yielding high-quality images with fine-grained textures and fewer artifacts. In addition, by using a pre-trained unconditional diffusion model, the proposed model reconstructs the missing pixels in a zero-shot manner and needs no reference images for additional training. Extensive experimental results demonstrate that the proposed method outperforms self-supervised tensor completion methods and representative diffusion-model-based image restoration methods.
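As a rough illustration of how a low-rank prior and observed pixels can jointly constrain an inpainting estimate, the sketch below uses classical singular value thresholding on a single matrix. It is not the diffusion-based method of the letter: there is no diffusion model, latent space, or smoothness term here. The names `svt` and `inpaint_low_rank`, and the `tau` and `iters` parameters, are assumptions made for this example.

```python
import numpy as np


def svt(x: np.ndarray, tau: float) -> np.ndarray:
    """Soft-threshold the singular values of x by tau
    (the proximal operator of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt


def inpaint_low_rank(
    y: np.ndarray, mask: np.ndarray, tau: float = 0.5, iters: int = 200
) -> np.ndarray:
    """Fill in the entries of y where mask is False.

    Alternates a low-rank shrinkage step with a data-consistency step
    that resets the observed entries to their measured values.
    """
    x = np.where(mask, y, 0.0)
    for _ in range(iters):
        x = svt(x, tau)           # pull the estimate toward low rank
        x = np.where(mask, y, x)  # keep observed pixels unchanged
    return x
```

Here `y` is a 2-D array and `mask` is a boolean array of the same shape that is `True` at observed pixels. Because the data-consistency step runs last in each iteration, the returned array always matches `y` exactly on the observed entries. How well the missing entries are recovered depends on the rank of the image, the fraction of observed pixels, and the choice of `tau`.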
Citations: 0