IEEE Signal Processing Letters: Latest Articles

Diagnosis of Parkinson's Disease Based on Hybrid Fusion Approach of Offline Handwriting Images
IF 3.2; Zone 2 (Engineering & Technology); Q2 ENGINEERING, ELECTRICAL & ELECTRONIC; Pub Date: 2024-11-12; DOI: 10.1109/LSP.2024.3496579
Shanyu Dong;Jin Liu;Jianxin Wang
Handwriting images are commonly used to diagnose Parkinson's disease due to their intuitive nature and easy accessibility. However, existing methods have not explored the potential of fusing different handwriting image sources for diagnosis. To address this issue, this study proposes a hybrid fusion approach that makes use of the visual information derived from different handwriting images and handwriting templates, significantly enhancing the performance in diagnosing Parkinson's disease. The proposed method involves several key steps. Initially, different preprocessed handwriting images undergo pixel-level fusion using Laplacian transformation. Subsequently, the fused and original images are fed separately into a pre-trained CNN to extract visual features. Finally, feature-level fusion is performed by concatenating the feature vectors extracted from the flatten layer, and the fused feature vectors are input into an SVM to obtain classification results. Our experimental results validate that the proposed method achieves excellent performance using only visual features from images, with 95.45% accuracy on the NewHandPD dataset. Furthermore, the results obtained on our dataset verify the strong generalizability of the proposed approach.
IEEE Signal Processing Letters, vol. 31, pp. 3179-3183. Journal Article.
Citations: 0
Robust Multi-Prototypes Aware Integration for Zero-Shot Cross-Domain Slot Filling
IF 3.2; Zone 2 (Engineering & Technology); Q2 ENGINEERING, ELECTRICAL & ELECTRONIC; Pub Date: 2024-11-11; DOI: 10.1109/LSP.2024.3495561
Shaoshen Chen;Peijie Huang;Zhanbiao Zhu;Yexing Zhang;Yuhong Xu
Cross-domain slot filling is a widely explored problem in spoken language understanding (SLU), which requires the model to transfer between different domains under data sparsity conditions. Dominant two-step hierarchical models first extract slot entities and then calculate the similarity score between slot description-based prototypes and the last hidden layer of the slot entity, selecting the closest prototype as the predicted slot type. However, these models use only slot descriptions as prototypes, which limits their robustness. Moreover, these approaches pay little attention to the inherent knowledge in the slot entity embedding and therefore suffer from overfitting. In this letter, we propose a Robust Multi-prototypes Aware Integration (RMAI) method for zero-shot cross-domain slot filling. In RMAI, more robust slot entity-based prototypes and inherent knowledge in the slot entity embedding are utilized to improve the classification performance and alleviate the risk of overfitting. Furthermore, a multi-prototypes aware integration approach is proposed to effectively integrate both our proposed slot entity-based prototypes and the slot description-based prototypes. Experimental results on the SNIPS dataset demonstrate the good performance of RMAI.
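The prototype-matching step described in this abstract can be sketched as follows. This is a minimal illustration, not the RMAI method itself: the cosine similarity, the mixing coefficient `weight`, the function name `predict_slot_type`, and the toy three-dimensional vectors are all assumptions made for the example, since the abstract does not say how the two prototype families are integrated.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def predict_slot_type(entity_vec, desc_protos, entity_protos, weight=0.5):
    """Pick the slot type with the highest combined prototype score.

    `weight` is a hypothetical mixing coefficient between the
    description-based and the entity-based prototype similarities.
    """
    scores = {}
    for slot in desc_protos:
        s_desc = cosine(entity_vec, desc_protos[slot])
        s_ent = cosine(entity_vec, entity_protos[slot])
        scores[slot] = weight * s_desc + (1.0 - weight) * s_ent
    return max(scores, key=scores.get), scores

# Toy prototypes (made up for illustration only).
desc_protos = {"city": [1.0, 0.0, 0.0], "artist": [0.0, 1.0, 0.0]}
entity_protos = {"city": [0.9, 0.1, 0.2], "artist": [0.1, 0.8, 0.3]}

label, scores = predict_slot_type([0.8, 0.2, 0.1], desc_protos, entity_protos)
print(label, scores)
```

Setting `weight=1.0` reduces the sketch to the description-only matching that the abstract attributes to earlier two-step models.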
IEEE Signal Processing Letters, vol. 31, pp. 3169-3173. Journal Article.
Citations: 0
SoLAD: Sampling Over Latent Adapter for Few Shot Generation
IF 3.2; Zone 2 (Engineering & Technology); Q2 ENGINEERING, ELECTRICAL & ELECTRONIC; Pub Date: 2024-11-11; DOI: 10.1109/LSP.2024.3496822
Arnab Kumar Mondal;Piyush Tiwary;Parag Singla;Prathosh A.P.
Few-shot adaptation of Generative Adversarial Networks (GANs) under distributional shift is generally achieved via regularized retraining or latent space adaptation. While the former methods offer fast inference, the latter generate diverse images. This work aims to solve these issues and achieve the best of both regimes in a principled manner via a Bayesian reformulation of the GAN objective. We highlight a hidden expectation term over GAN parameters that is often overlooked but is critical in few-shot settings. This observation helps us justify prepending a latent adapter network (LAN) before a pre-trained GAN and propose a sampling procedure over the parameters of the LAN (called SoLAD) to compute the usually-ignored hidden expectation. SoLAD enables fast generation of quality samples from multiple few-shot target domains using a GAN pre-trained on a single source domain.
IEEE Signal Processing Letters, vol. 31, pp. 3174-3178. Journal Article.
Citations: 0
Differentiable Duration Refinement Using Internal Division for Non-Autoregressive Text-to-Speech
IF 3.2; Zone 2 (Engineering & Technology); Q2 ENGINEERING, ELECTRICAL & ELECTRONIC; Pub Date: 2024-11-11; DOI: 10.1109/LSP.2024.3495578
Jaeuk Lee;Yoonsoo Shin;Joon-Hyuk Chang
Most non-autoregressive text-to-speech (TTS) models acquire target phoneme duration (target duration) from internal or external aligners. They transform the speech-phoneme alignment produced by the aligner into the target duration. Since this transformation is not differentiable, the gradient of the loss function that maximizes the TTS model's likelihood of speech (e.g., mel spectrogram or waveform) cannot be propagated to the target duration. In other words, the target duration is produced regardless of the TTS model's likelihood of speech. Hence, we introduce a differentiable duration refinement that produces a learnable target duration for maximizing the likelihood of speech. The proposed method uses an internal division to locate the phoneme boundary, which is determined to improve the performance of the TTS model. Additionally, we propose a duration distribution loss to enhance the performance of the duration predictor. Our baseline model is JETS, a representative end-to-end TTS model, and we apply the proposed methods to the baseline model. Experimental results show that the proposed method outperforms the baseline model in terms of subjective naturalness and character error rate.
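The idea of locating a phoneme boundary by internal division can be sketched as below. The abstract does not state how the division ratio is parameterized, so the sigmoid of a free parameter `theta`, the per-boundary search windows, and all function names are assumptions for illustration. The point of the sketch is only that a boundary written as a convex combination of two endpoints has a well-defined gradient with respect to the ratio parameter, unlike a hard alignment.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def refine_boundary(left, right, theta):
    """Internally divide [left, right] in the ratio alpha : (1 - alpha)."""
    alpha = sigmoid(theta)
    return (1.0 - alpha) * left + alpha * right

def boundary_grad(left, right, theta):
    """Analytic derivative of the boundary position with respect to theta."""
    alpha = sigmoid(theta)
    return alpha * (1.0 - alpha) * (right - left)

def durations(boundaries, total_frames):
    """Durations are differences between consecutive boundaries."""
    edges = [0.0] + list(boundaries) + [float(total_frames)]
    return [b - a for a, b in zip(edges, edges[1:])]

# Hypothetical utterance: three phonemes over 30 frames, where each
# boundary is allowed to move inside a window (values are made up).
windows = [(8.0, 12.0), (18.0, 22.0)]
thetas = [0.0, 1.0]

bounds = [refine_boundary(l, r, t) for (l, r), t in zip(windows, thetas)]
durs = durations(bounds, 30)
print(bounds, durs)
```

Because the durations always sum to the number of frames, moving one boundary trades frames between two neighbouring phonemes, and a speech-likelihood loss could in principle push `theta` through `boundary_grad`.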
IEEE Signal Processing Letters, vol. 31, pp. 3154-3158. Journal Article.
Citations: 0
LFSamba: Marry SAM With Mamba for Light Field Salient Object Detection
IF 3.2; Zone 2 (Engineering & Technology); Q2 ENGINEERING, ELECTRICAL & ELECTRONIC; Pub Date: 2024-11-07; DOI: 10.1109/LSP.2024.3493799
Zhengyi Liu;Longzhen Wang;Xianyong Fang;Zhengzheng Tu;Linbo Wang
A light field camera can reconstruct 3D scenes using captured multi-focus images that contain rich spatial geometric information, enhancing applications in stereoscopic photography, virtual reality, and robotic vision. In this work, a state-of-the-art salient object detection model for multi-focus light field images, called LFSamba, is introduced to emphasize four main insights: (a) Efficient feature extraction, where SAM is used to extract modality-aware discriminative features; (b) Inter-slice relation modeling, leveraging Mamba to capture long-range dependencies across multiple focal slices, thus extracting implicit depth cues; (c) Inter-modal relation modeling, utilizing Mamba to integrate all-focus and multi-focus images, enabling mutual enhancement; (d) Weakly supervised learning capability, developing a scribble annotation dataset from an existing pixel-level mask dataset, establishing the first scribble-supervised baseline for light field salient object detection.
IEEE Signal Processing Letters, vol. 31, pp. 3144-3148. Journal Article.
Citations: 0
Binomial Harmonic Approximation Double-Phase Estimator Tracking for BOC Modulated Signals
IF 3.2; Zone 2 (Engineering & Technology); Q2 ENGINEERING, ELECTRICAL & ELECTRONIC; Pub Date: 2024-11-07; DOI: 10.1109/LSP.2024.3493793
Xiangjie Ding;Zhi Zhao;Ying Yang
For binary offset carrier (BOC) signal tracking, the two-dimensional (2D) tracking method that independently tracks the code and subcarrier has garnered significant attention. The double estimator (DE) and the double phase estimator (DPE) are prominent approaches. However, the performance of the DE suffers under limited front-end bandwidths and sampling rates. The DPE, which treats the subcarrier as a sine wave, neglects side lobes, leading to performance degradation. This letter introduces the Binomial Harmonic Approximation DPE (BH-DPE), which uses two phase-locked loops to track the first and third harmonics of the subcarrier. By applying a weighted combination of correlation values, the BH-DPE effectively reduces coherent output signal-to-noise ratio (SNR) loss and enhances ranging accuracy through combined delay estimations from both harmonics. Theoretical analysis and simulations show that the BH-DPE outperforms both the DE and the DPE in terms of SNR loss and ranging accuracy under constrained front-end bandwidths and sampling rates, and approaches the DE while exceeding the DPE under wide front-end bandwidths.
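Why tracking the third harmonic in addition to the first can reduce SNR loss may be seen from the Fourier series of an ideal square wave, sign(sin x) = (4/pi) * sum over odd k of sin(kx)/k. The sketch below computes the fraction of a unit-amplitude square wave's power carried by a chosen set of harmonics. It assumes an ideal, unfiltered sine-phased square subcarrier and is not the weighted correlation combination used in the letter; the function names are hypothetical.

```python
import math

def harmonic_power_fraction(orders):
    """Fraction of a unit-amplitude square wave's power in the given odd harmonics.

    Harmonic k has amplitude 4 / (pi * k), hence power 8 / (pi^2 * k^2),
    while the square wave itself has unit power.
    """
    return sum(8.0 / (math.pi ** 2 * k ** 2) for k in orders)

def loss_db(fraction):
    """Power loss in dB relative to capturing the full square wave."""
    return -10.0 * math.log10(fraction)

first_only = harmonic_power_fraction([1])          # sine-wave approximation
first_and_third = harmonic_power_fraction([1, 3])  # two-harmonic approximation

print(round(first_only, 4), round(loss_db(first_only), 2))
print(round(first_and_third, 4), round(loss_db(first_and_third), 2))
```

Under these idealized assumptions the single-sine approximation keeps about 81% of the subcarrier power, and adding the third harmonic raises this to about 90%, which roughly halves the loss expressed in dB.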
IEEE Signal Processing Letters, vol. 31, pp. 3139-3143. Journal Article.
Citations: 0
CADeTT: Context-Adaptive Deep-Trinary-Tree Lossless Compression of Event Camera Frames
IF 3.2; Zone 2 (Engineering & Technology); Q2 ENGINEERING, ELECTRICAL & ELECTRONIC; Pub Date: 2024-11-07; DOI: 10.1109/LSP.2024.3493801
Ionut Schiopu;Radu Ciprian Bilcu
The letter proposes an efficient context-adaptive lossless compression method for encoding event frame sequences. A first contribution proposes the use of a deep-ternary-tree of the current pixel position context as the context-tree model selector. The arithmetic codec encodes each trinary symbol using the probability distribution of the associated context-tree-leaf model. Another contribution proposes a novel context design based on several frames, where the context order controls the codec's complexity. Another contribution proposes a model search procedure to replace the context-tree prune-and-encode strategy by searching for the closest “mature” context model between lower-order context-tree models. The experimental evaluation shows that the proposed method provides an improved coding performance of 34.34% and a smaller runtime of up to $5.18\times$ compared with state-of-the-art lossless image codec FLIF and, respectively, 6.95% and $14.42\times$ compared with our prior work.
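The general principle of context-adaptive coding of ternary symbols can be sketched as follows. This is a flat, count-based context model, not the deep-ternary-tree selector or the model search procedure of the letter. The symbol mapping (0, 1, 2), the two-neighbour causal context, the Laplace-style initial counts, and the function name are assumptions for illustration. The sketch reports the ideal code length an arithmetic coder would approach when driven by these adaptive probabilities.

```python
import math
from collections import defaultdict

def code_length_bits(frame, order=2):
    """Ideal adaptive code length in bits for a frame of symbols in {0, 1, 2}.

    The context is built from up to `order` already-coded neighbours
    (left, then up); a larger order means more context models to maintain.
    """
    counts = defaultdict(lambda: [1, 1, 1])  # one count vector per context
    bits = 0.0
    for r, row in enumerate(frame):
        for c, sym in enumerate(row):
            left = row[c - 1] if c > 0 else 0
            up = frame[r - 1][c] if r > 0 else 0
            ctx = (left, up)[:order]
            model = counts[ctx]
            bits += -math.log2(model[sym] / sum(model))
            model[sym] += 1  # update the model only after coding the symbol
    return bits

# Hypothetical frames: an empty 4x4 frame and one with a few events.
empty = [[0] * 4 for _ in range(4)]
sparse = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 2, 2, 0], [0, 0, 0, 0]]

print(code_length_bits(empty), code_length_bits(sparse))
```

A fixed-length code would spend 16 * log2(3) bits on either frame, so the gap to the printed values shows what adaptation alone buys on such sparse data.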
IEEE Signal Processing Letters, vol. 31, pp. 3149-3153. Journal Article.
Citations: 0
Multiple Subspace-Based Target Detection in Deterministic Interference
IF 3.2; Zone 2 (Engineering & Technology); Q2 ENGINEERING, ELECTRICAL & ELECTRONIC; Pub Date: 2024-11-04; DOI: 10.1109/LSP.2024.3491012
Mengru Sun;Weijian Liu;Jun Liu;Chengpeng Hao;Kefei Li
In this letter, the problem of detecting a multiple subspace-based target in the presence of deterministic interference is considered. To solve the problem, we utilize the Kullback-Leibler information criterion and model order selection rules to design detection schemes. The alternative hypothesis related to the most likely signal subspace is selected from multiple alternative hypotheses, and is tested versus the null hypothesis for target detection. Numerical examples verify the effectiveness of the proposed detection schemes, which can achieve the target detection and subspace-based target classification simultaneously.
IEEE Signal Processing Letters, vol. 31, pp. 3134-3138. Journal Article.
Citations: 0
Precise Analysis of Covariance Identifiability for Activity Detection in Grant-Free Random Access
IF 3.2; Zone 2 (Engineering & Technology); Q2 ENGINEERING, ELECTRICAL & ELECTRONIC; Pub Date: 2024-11-04; DOI: 10.1109/LSP.2024.3491018
Shengsong Luo;Junjie Ma;Chongbin Xu;Xin Wang
We consider the identifiability issue of maximum-likelihood based activity detection in massive MIMO-based grant-free random access. An intriguing observation by Chen et al. (2022) indicates that the identifiability undergoes a phase transition for commonly-used random user signatures as $L^{2}$, $N$ and $K$ tend to infinity with fixed ratios, where $L$, $N$ and $K$ denote the user signature length, the total number of users, and the number of active users, respectively. In this letter, we provide a precise analytical characterization of the phase transition based on a spectral universality conjecture. Numerical results demonstrate excellent agreement between our theoretical predictions and the empirical phase transitions.
IEEE Signal Processing Letters, vol. 31, pp. 3184-3188. Journal Article.
Citations: 0
Gradient-Level Differential Privacy Against Attribute Inference Attack for Speech Emotion Recognition
IF 3.2; Zone 2 (Engineering & Technology); Q2 ENGINEERING, ELECTRICAL & ELECTRONIC; Pub Date: 2024-11-01; DOI: 10.1109/LSP.2024.3490379
Haijiao Chen;Huan Zhao;Zixing Zhang
The Federated Learning (FL) paradigm for distributed privacy preservation is valued for its ability to collaboratively train Speech Emotion Recognition (SER) models while keeping data localized. However, recent studies reveal privacy leakage in the model sharing process. Existing differential privacy schemes face increasing inference attack risks as clients expose more model updates. To address these challenges, we propose a Gradient-level Hierarchical Differential Privacy (GHDP) strategy to mitigate attribute inference attacks. GHDP employs normalization to distinguish gradient importance, clipping significant gradients and filtering out sensitive information that may lead to privacy leaks. Additionally, increased random perturbations are applied to early model layers during backpropagation, achieving hierarchical differential privacy through layered noise addition. This theoretically grounded approach offers enhanced protection for critical information. Our experiments show that GHDP maintains stable SER performance while providing robust privacy protection, unaffected by the number of model updates.
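The combination of gradient clipping with layer-dependent noise described in this abstract can be sketched as below. This is an illustration under stated assumptions, not the GHDP algorithm: the linear noise schedule in `layer_sigmas`, the parameter names, and the toy gradients are hypothetical; the normalization-based importance scoring and sensitive-information filtering mentioned in the abstract are not reproduced; and the noise levels are not calibrated to any formal privacy budget.

```python
import math
import random

def clip_by_norm(grad, max_norm):
    """Scale a gradient vector so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def layer_sigmas(num_layers, base_sigma, extra):
    """Hypothetical schedule: the earliest layer gets base_sigma + extra,
    decreasing linearly to base_sigma at the last layer."""
    if num_layers == 1:
        return [base_sigma]
    return [base_sigma + extra * (num_layers - 1 - i) / (num_layers - 1)
            for i in range(num_layers)]

def privatize(layer_grads, max_norm, base_sigma, extra, rng):
    """Clip each layer's gradient, then add Gaussian noise scaled per layer."""
    sigmas = layer_sigmas(len(layer_grads), base_sigma, extra)
    noisy = []
    for grad, sigma in zip(layer_grads, sigmas):
        clipped = clip_by_norm(grad, max_norm)
        noisy.append([g + rng.gauss(0.0, sigma * max_norm) for g in clipped])
    return noisy, sigmas

rng = random.Random(0)                 # seeded for repeatability
grads = [[3.0, 4.0], [0.3, 0.4]]       # toy gradients: early layer, late layer
noisy, sigmas = privatize(grads, 1.0, 0.1, 0.4, rng)
print(sigmas, noisy)
```

In a federated setting, a client would apply something like `privatize` to its update before sharing it, so the server only ever sees clipped and perturbed gradients.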
IEEE Signal Processing Letters, vol. 31, pp. 3124-3128. Publication date: 2024-11-01. Publication type: Journal Article.
Citations: 0