
Latest Articles from IEEE Signal Processing Letters

Heterogeneous Dual-Branch Emotional Consistency Network for Facial Expression Recognition
IF 3.2 · CAS Tier 2 (Engineering & Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-01-17 · DOI: 10.1109/LSP.2024.3505798
Shasha Mao;Yuanyuan Zhang;Dandan Yan;Puhua Chen
Due to labeling subjectivity, label noise has become a critical issue in facial expression recognition (FER). From the perspective of human visual perception, the emotion characteristics exhibited by a face should remain consistent with its true expression rather than with a noisy label, yet most methods ignore this emotion consistency during FER, especially across different networks. Based on this, we propose a new FER method based on heterogeneous dual-branch emotional consistency constraints, which prevents the model from memorizing noisy samples through features associated with noisy labels. In the proposed method, emotion consistency under spatial transformation and across heterogeneous networks is considered simultaneously to guide the model to perceive the overall visual features of expressions. Meanwhile, the confidence of the given label is evaluated from the emotional attention maps of the original and transformed images, which enhances the classification reliability of the two branches and alleviates the negative effect of noisy labels during learning. Additionally, a weighted ensemble strategy is used to unify the two branches. Experimental results show that the proposed method outperforms state-of-the-art methods under 10%, 20%, and 30% label noise.
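The consistency constraint and weighted ensemble described above can be illustrated in miniature. A minimal numpy sketch, where the function names (`consistency_loss`, `weighted_ensemble`) and the weight `w=0.6` are illustrative assumptions rather than the paper's implementation:

```python
import numpy as np

def softmax(z):
    """Row-wise softmax over class logits."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(logits_a, logits_b):
    """Mean squared gap between the two branches' class probabilities;
    zero when both branches predict identical expression distributions."""
    return float(np.mean((softmax(logits_a) - softmax(logits_b)) ** 2))

def weighted_ensemble(logits_a, logits_b, w=0.6):
    """Convex combination of the two branches' predicted probabilities."""
    return w * softmax(logits_a) + (1.0 - w) * softmax(logits_b)
```

In a training loop this penalty would be added to the classification loss so that the two heterogeneous branches (and the original/transformed views) are pushed toward the same emotion prediction.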
Citations: 0
Video Inpainting Localization With Contrastive Learning
IF 3.2 · CAS Tier 2 (Engineering & Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-01-08 · DOI: 10.1109/LSP.2025.3527196
Zijie Lou;Gang Cao;Man Lin
Video inpainting techniques typically serve to restore destroyed or missing regions in digital videos. However, such techniques may also be used illegally to remove important objects and create forged videos. This letter proposes a simple yet effective forensic scheme for Video Inpainting LOcalization with ContrAstive Learning (ViLocal). A 3D Uniformer encoder is applied to the video noise residual to learn effective spatiotemporal features. To enhance discriminative power, supervised contrastive learning is adopted to capture local regional inconsistency by separating pristine and inpainted pixels. The pixel-wise inpainting localization map is produced by a lightweight convolutional decoder trained in two stages. To prepare enough training samples, we build a video object segmentation dataset (VOS2k5) of 2500 videos with pixel-level annotations per frame. Extensive experimental results validate the superiority of ViLocal over state-of-the-art methods.
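The supervised contrastive separation of pristine and inpainted pixels can be sketched with a standard supervised contrastive (SupCon-style) loss over pixel embeddings. This is a generic formulation with an assumed temperature `tau`, not the paper's exact loss:

```python
import numpy as np

def supcon_loss(feats, labels, tau=0.5):
    """Supervised contrastive loss: for each anchor pixel embedding,
    pull same-label pixels (pristine vs. inpainted) together and push
    different-label pixels apart. feats: (N, D), labels: (N,)."""
    z = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = z @ z.T / tau
    np.fill_diagonal(sim, -np.inf)               # exclude self-pairs
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    losses = []
    for i in range(len(labels)):
        pos = (labels == labels[i]) & (np.arange(len(labels)) != i)
        if pos.any():                             # average over positives
            losses.append(-log_prob[i, pos].mean())
    return float(np.mean(losses))
```

Embeddings that cluster by pristine/inpainted label yield a lower loss than the same embeddings with shuffled labels, which is exactly the regional-inconsistency signal the detector exploits.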
Citations: 0
Adaptive Superpixel-Guided Non-Homogeneous Image Dehazing
IF 3.2 · CAS Tier 2 (Engineering & Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-01-08 · DOI: 10.1109/LSP.2025.3527197
Hao Zhang;Ping Lu;Te Qi;Yan Xu;Tieyong Zeng
Image dehazing is a fundamental image processing task with a major impact on higher-level imaging tasks. Many existing haze removal methods are designed for homogeneous haze, but real-world haze is usually non-homogeneous. Superpixels, which segment an image into a set of closely spaced regions, can be employed in real-world scenarios to deal with non-homogeneous haze. In this paper, an adaptive non-homogeneous image dehazing approach is designed that uses a superpixel-guided algorithm to segment different hazy regions. Since both ambient-light and transmission-map estimation strongly affect the results, our work focuses on a variational dehazing model that accounts for non-uniform ambient light and non-uniform transmission maps to handle varying levels of haze. A series of numerical results illustrates the superiority and efficacy of our method.
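Per-region recovery with a transmission map follows the classical atmospheric scattering model I = J·t + A·(1 − t). A toy inversion for one (super)pixel region; the clamp `t_min=0.1` is a common heuristic, not a value from the paper:

```python
import numpy as np

def add_haze(J, t, A):
    """Atmospheric scattering model: hazy I from scene radiance J,
    transmission t, and ambient light A."""
    return J * t + A * (1.0 - t)

def dehaze_region(I, t, A, t_min=0.1):
    """Invert the scattering model inside one region; the clamp on t
    avoids amplifying noise where transmission is near zero."""
    t = np.maximum(t, t_min)
    return (I - A) / t + A
```

In the non-homogeneous setting each superpixel region gets its own (t, A) estimate, so the inversion above is applied region by region instead of with one global pair.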
Citations: 0
Cross-View Fusion for Multi-View Clustering
IF 3.2 · CAS Tier 2 (Engineering & Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-01-08 · DOI: 10.1109/LSP.2025.3527231
Zhijie Huang;Binqiang Huang;Qinghai Zheng;Yuanlong Yu
Multi-view clustering has attracted significant attention in recent years because it can leverage the consistent and complementary information of multiple views to improve clustering performance. However, effectively fusing this information and balancing its consistent and complementary parts are common challenges in multi-view clustering. Most existing multi-view fusion works focus on weighted-sum fusion and concatenation fusion, which cannot fully fuse the underlying information and do not consider balancing the consistent and complementary information of multiple views. To this end, we propose Cross-view Fusion for Multi-view Clustering (CFMVC). Specifically, CFMVC combines a deep neural network and a graph convolutional network for cross-view information fusion, fully fusing the feature and structural information of multiple views. To balance the consistent and complementary information, CFMVC strengthens the correlation among representations of the same sample to maximize consistency while reinforcing the independence among different samples to maximize complementarity. Experimental results on several multi-view datasets demonstrate the effectiveness of CFMVC on the multi-view clustering task.
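The graph-convolutional half of the fusion can be illustrated by one standard symmetric-normalized GCN propagation step, which mixes each sample's features with those of its graph neighbors. This is a textbook layer, not CFMVC's actual architecture:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph convolution: ReLU(D^{-1/2} (A + I) D^{-1/2} X W),
    where A is the adjacency matrix, X node features, W learnable weights."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # degree normalization
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)         # ReLU activation
```

Stacking such layers lets structural information (the graph) flow into the feature representation of each view before the cross-view fusion step.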
Citations: 0
PFCNet: Enhancing Rail Surface Defect Detection With Pixel-Aware Frequency Conversion Networks
IF 3.2 · CAS Tier 2 (Engineering & Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-01-06 · DOI: 10.1109/LSP.2025.3525855
Yue Wu;Fangfang Qiang;Wujie Zhou;Weiqing Yan
Applying computer vision techniques to rail surface defect detection (RSDD) is crucial for preventing catastrophic accidents. However, challenges such as complex backgrounds and irregular defect shapes persist. Previous methods have focused on extracting salient object information from a pixel perspective, neglecting valuable high- and low-frequency image information that can better capture global structure. In this study, we design a pixel-aware frequency conversion network (PFCNet) to explore RSDD from a frequency-domain perspective. We apply different attention mechanisms and frequency enhancement to high-level and shallow features to explore local details and global structures comprehensively. In addition, we design a dual-control reorganization module to refine features across levels. Extensive experiments on an industrial RGB-D dataset (NEU RSDDS-AUG) show that PFCNet achieves superior performance.
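A common way to obtain the high- and low-frequency components mentioned above is a circular mask in the 2-D FFT domain; the low-pass part carries global structure and the residual carries fine detail. A sketch, where the mask `radius` is an assumed hyperparameter:

```python
import numpy as np

def frequency_split(img, radius):
    """Split a 2-D image into low- and high-frequency parts using a
    circular low-pass mask centered on the DC component of the FFT."""
    F = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.fft.ifft2(np.fft.ifftshift(F * mask)).real
    high = img - low          # residual = high-frequency detail
    return low, high
```

By construction `low + high` reconstructs the input exactly, so the two branches of a frequency-aware network see complementary, lossless views of the image.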
Citations: 0
Piecewise Student's t-distribution Mixture Model-Based Estimation for NAND Flash Memory Channels
IF 3.2 · CAS Tier 2 (Engineering & Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-01-06 · DOI: 10.1109/LSP.2024.3521326
Cheng Wang;Zhen Mei;Jun Li;Kui Cai;Lingjun Kong
Accurate modeling and estimation of the threshold voltages of flash memory can facilitate the efficient design of channel codes and detectors. However, most flash memory channel models are based on Gaussian distributions, which fail to capture key properties of the threshold voltages, such as their heavy tails. To improve model accuracy, we first propose a piecewise Student's t-distribution mixture model (PSTMM), whose degrees of freedom control the left and right tails of the voltage distributions. We further propose a PSTMM-based expectation-maximization (PSTMM-EM) algorithm that estimates the model parameters of flash memories by alternately computing the expected values of the missing data and maximizing the likelihood function with respect to the model parameters. Simulation results demonstrate that the proposed algorithm exhibits superior stability and effectively extends the flash memory lifespan by 1700 program/erase (PE) cycles compared with existing parameter estimation algorithms.
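The alternating expectation/maximization updates can be illustrated for a plain (non-piecewise) 1-D Student's t mixture with fixed degrees of freedom, using the standard latent-scale weights u of the t-distribution's normal-gamma representation. This is a simplified stand-in for PSTMM-EM, not the paper's algorithm:

```python
import math
import numpy as np

def t_pdf(x, mu, sigma, nu):
    """Student's t density with location mu, scale sigma, nu deg. of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.gamma(nu / 2) * math.sqrt(nu * math.pi) * sigma)
    return c * (1.0 + ((x - mu) / sigma) ** 2 / nu) ** (-(nu + 1) / 2)

def t_mixture_em(x, mus, sigmas, nus, weights, iters=50):
    """EM for a 1-D Student's t mixture (fixed dof). E-step computes
    responsibilities r and latent scale weights u; M-step updates
    locations, scales, and mixture weights."""
    mus, sigmas, weights = (np.array(v, dtype=float) for v in (mus, sigmas, weights))
    for _ in range(iters):
        dens = np.stack([w * t_pdf(x, m, s, v)
                         for w, m, s, v in zip(weights, mus, sigmas, nus)])
        r = dens / dens.sum(axis=0, keepdims=True)   # responsibilities
        for k in range(len(mus)):
            u = (nus[k] + 1) / (nus[k] + ((x - mus[k]) / sigmas[k]) ** 2)
            mus[k] = (r[k] * u * x).sum() / (r[k] * u).sum()
            sigmas[k] = math.sqrt((r[k] * u * (x - mus[k]) ** 2).sum() / r[k].sum())
        weights = r.sum(axis=1) / x.size
    return mus, sigmas, weights
```

For flash channels each mixture component would model the threshold-voltage distribution of one programmed level, with the degrees of freedom governing how heavy its tails are.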
Citations: 0
Bridging the Modality Gap in Multimodal Eye Disease Screening: Learning Modality Shared-Specific Features via Multi-Level Regularization
IF 3.2 · CAS Tier 2 (Engineering & Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-01-06 · DOI: 10.1109/LSP.2025.3526094
Jiayue Zhao;Shiman Li;Yi Hao;Chenxi Zhang
Color fundus photography (CFP) and optical coherence tomography (OCT) are two common modalities in eye disease screening, providing crucial complementary information for diagnosis. However, existing multimodal learning methods cannot fully leverage the information from each modality because of the large dimensional and semantic gap between 2D CFP and 3D OCT images, leading to suboptimal classification performance. To bridge the modality gap and fully exploit each modality, we propose a novel feature disentanglement method that decomposes features into modality-shared and modality-specific components. We design a multi-level regularization strategy, including intra-modality, inter-modality, and intra-inter-modality regularization, to facilitate effective learning of the modality shared-specific features. Our method achieves state-of-the-art performance on two eye disease diagnosis tasks using two publicly available datasets, and promises to serve as a useful tool for multimodal eye disease diagnosis.
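Typical shared-specific disentanglement regularizers align the shared features of the same sample across modalities while pushing each modality's shared and specific parts toward orthogonality. A generic numpy sketch; the function names and the exact penalty form are illustrative assumptions, not the paper's losses:

```python
import numpy as np

def cos_sim(a, b):
    """Row-wise cosine similarity between batches of feature vectors."""
    return (a * b).sum(-1) / (np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))

def shared_specific_reg(shared_cfp, shared_oct, spec_cfp, spec_oct):
    """Alignment term pulls the two modalities' shared features together;
    orthogonality term decorrelates shared vs. specific within a modality."""
    align = (1.0 - cos_sim(shared_cfp, shared_oct)).mean()
    ortho = (cos_sim(shared_cfp, spec_cfp) ** 2).mean() \
          + (cos_sim(shared_oct, spec_oct) ** 2).mean()
    return float(align + ortho)
```

The regularizer is zero exactly when shared features agree across modalities and specific features are orthogonal to them, which is the disentangled state the method aims for.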
Citations: 0
Noise Covariance Matrix Estimation in Block-Correlated Noise Field for Direction Finding
IF 3.2 · CAS Tier 2 (Engineering & Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-01-06 · DOI: 10.1109/LSP.2025.3525898
Majdoddin Esfandiari;Sergiy A. Vorobyov
We propose a noise covariance matrix estimation approach for direction finding in an unknown noise field, applicable to the practically important cases of nonuniform and block-diagonal sensor noise. It is based on an alternating procedure that can be adjusted for a specific noise type. Numerical simulations establish the generality and superiority of the proposed approach over existing state-of-the-art methods, especially in challenging scenarios.
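For the block-diagonal noise case, one natural building block of such an alternating procedure is projecting a covariance estimate onto the assumed block structure, keeping only correlations within each sensor group. A sketch of that single step, not the paper's full algorithm:

```python
import numpy as np

def block_diagonal_project(Q, blocks):
    """Zero out covariance entries outside the given sensor-index blocks,
    enforcing a block-diagonal (block-correlated) noise model."""
    P = np.zeros_like(Q)
    for idx in blocks:
        sub = np.ix_(idx, idx)   # index grid for this sensor group
        P[sub] = Q[sub]
    return P
```

In an alternating scheme, a raw residual covariance estimate would be projected like this on each iteration before re-estimating the signal parameters.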
Citations: 0
Consensus Iterated Posterior Linearization Filter for Distributed State Estimation
IF 3.2 · CAS Tier 2 (Engineering & Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-01-06 · DOI: 10.1109/LSP.2025.3526092
Ángel F. García-Fernández;Giorgio Battistelli
This paper presents the consensus iterated posterior linearization filter (IPLF) for distributed state estimation. The consensus IPLF algorithm is based on a measurement model described by its conditional mean and covariance given the state, and performs iterated statistical linear regressions of the measurements with respect to the current posterior approximation to improve estimation performance. Three variants of the algorithm are presented according to the type of consensus used: consensus on information, consensus on measurements, and hybrid consensus on measurements and information. Simulation results show the benefits of the proposed algorithm in distributed state estimation.
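Statistical linear regression (SLR) fits an affine model A·x + b plus a residual covariance Ω to a nonlinear measurement function over the current posterior approximation. A Monte Carlo sketch of that fit; the IPLF would typically use sigma points rather than random sampling, so this is an illustrative stand-in:

```python
import numpy as np

def statistical_linear_regression(h, mean, cov, n=20000, seed=0):
    """Fit h(x) ≈ A x + b over N(mean, cov) by moment matching.
    Returns the gain A, offset b, and linearization-error covariance Omega."""
    rng = np.random.default_rng(seed)
    xs = rng.multivariate_normal(mean, cov, n)
    zs = np.array([h(x) for x in xs])
    z_mean = zs.mean(axis=0)
    Psi = (xs - mean).T @ (zs - z_mean) / n          # cross-covariance
    A = Psi.T @ np.linalg.inv(cov)
    b = z_mean - A @ mean
    Omega = (zs - z_mean).T @ (zs - z_mean) / n - A @ cov @ A.T
    return A, b, Omega
```

Iterating this fit against each new posterior approximation is what "iterated posterior linearization" refers to: the linearization point tracks the posterior instead of the prior.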
Citations: 0
Cramér-Rao Bounds and Resolution Benefits of Sparse Arrays in Measurement-Dependent SNR Regimes
IF 3.2 · CAS Tier 2 (Engineering & Technology) · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2025-01-03 · DOI: 10.1109/LSP.2024.3525400
Sina Shahsavari;Piya Pal
This paper derives a new non-asymptotic characterization of the Cramér-Rao bound (CRB) of any sparse array as a function of the angular separation between two far-field narrowband sources, in regimes characterized by a low signal-to-noise ratio (SNR). The primary contribution is the derivation of matching upper and lower bounds on the CRB in a certain measurement-dependent SNR (MD-SNR) regime, where one can zoom into progressively lower SNR as the number of sensors increases. This tight characterization establishes that sparse arrays such as nested and coprime arrays provably exhibit a lower CRB than uniform linear arrays (ULAs) in the specified SNR regime.
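The nested arrays mentioned above get their advantage from a long contiguous difference coarray synthesized with few physical sensors. A sketch of the standard two-level nested geometry and its pairwise lags (positions in units of the half-wavelength inter-element spacing):

```python
def nested_array(n1, n2):
    """Two-level nested array: a dense inner ULA of n1 sensors plus a
    sparse outer ULA of n2 sensors at multiples of (n1 + 1)."""
    inner = list(range(1, n1 + 1))
    outer = [(n1 + 1) * k for k in range(1, n2 + 1)]
    return inner + outer

def difference_coarray(pos):
    """All pairwise lags p - q the array can synthesize."""
    return sorted({p - q for p in pos for q in pos})
```

For n1 = n2 = 3, six physical sensors yield every lag from 0 to 11, i.e. the resolution of a much longer ULA, which is the structural reason sparse arrays can beat ULAs in the CRB comparisons above.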
Citations: 0