
Latest publications from IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society

Boosting Geometric Invariants for Discriminative Forensics of Large-Scale Generated Visual Content
IF 13.7 Pub Date : 2025-11-28 DOI: 10.1109/TIP.2025.3633580
Shuren Qi;Chao Wang;Zhiqiu Huang;Yushu Zhang;Xiangyu Chen;Yi Zhang;Tieyong Zeng;Fenglei Fan
Generative artificial intelligence has shown great success in visual content synthesis, to the point that humans struggle to distinguish between real and synthesized images. Forensic research seeks to reveal artifacts in such generated images, ensuring information security or improving generation capability. In this regard, robustness and interpretability are important for making forensic tasks trustworthy. However, typical forensic models and their underlying data representations rely on empirical learning algorithms, which cannot effectively meet robustness and interpretability requirements beyond their training experience. As an effective solution, we extend classical geometric invariants to the forensic analysis of large-scale generated images. Invariants are handcrafted representations built on robust and interpretable geometric principles. However, their discriminability falls far short of the scale of today’s forensic tasks. We boost the discriminability by extending the classical invariants to the hierarchical architecture of convolutional neural networks. The resulting overcompleteness allows for an automatic selection of task-discriminative features, while retaining the previous advantages of robustness and interpretability. From generative adversarial networks to diffusion models, forensic detection with our boosted invariants demonstrates state-of-the-art discriminability against large-scale content diversity. It also exhibits high training-sample efficiency, intrinsic invariance to geometric variations, and better interpretability of the forensic process.
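For readers unfamiliar with classical geometric invariants, Hu's seven moment invariants are a standard handcrafted example of the kind of translation-, scale-, and rotation-invariant descriptor the paper starts from. The sketch below shows only this classical baseline, not the boosted CNN-hierarchy version proposed in the paper:

```python
import cv2
import numpy as np

def hu_invariants(gray: np.ndarray) -> np.ndarray:
    """Hu's seven moment invariants of a single-channel image: handcrafted
    descriptors that are invariant to translation, scale, and rotation."""
    moments = cv2.moments(gray.astype(np.float32))
    hu = cv2.HuMoments(moments).flatten()
    # Log-scale so the seven invariants live on comparable magnitudes.
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-12)
```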
Citations: 0
DeepDC: Deep Distance Correlation as a Perceptual Image Quality Evaluator
IF 13.7 Pub Date : 2025-11-27 DOI: 10.1109/TIP.2025.3635025
Hanwei Zhu;Baoliang Chen;Lingyu Zhu;Shiqi Wang;Weisi Lin
Deep neural networks pre-trained on ImageNet have demonstrated remarkable transferability for developing effective full-reference image quality assessment (FR-IQA) models. However, existing approaches typically demand pixel-level alignment between reference and distorted images—a requirement that poses significant challenges in practical scenarios involving natural photography and texture similarity evaluation. To address this limitation, we propose a novel FR-IQA model leveraging deep statistical similarity derived from pre-trained features without relying on spatial co-location of these features or requiring fine-tuning with mean opinion scores. Specifically, we employ distance correlation, a potent yet relatively underexplored statistical measure, to quantify similarity between reference and distorted images within a deep feature space. The distance correlation is computed via the ratio of the distance covariance to the product of their respective distance standard deviations, for which we derive a closed-form solution using the inner product of deep double-centered distance matrices. Extensive experimental evaluations across diverse IQA benchmarks demonstrate the superiority and robustness of the proposed model. Furthermore, we demonstrate the utility of our model for optimizing texture synthesis and neural style transfer tasks, achieving state-of-the-art performance in both quantitative measures and qualitative assessments. The implementation is publicly available at https://github.com/h4nwei/DeepDC
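For reference, the distance correlation the abstract describes (the ratio of the distance covariance to the product of the distance standard deviations, obtained from double-centered distance matrices) can be computed as below. This is the generic statistic only; how DeepDC builds the deep feature sets it is applied to is not reproduced here:

```python
import numpy as np

def distance_correlation(x: np.ndarray, y: np.ndarray) -> float:
    """Distance correlation between two sample sets x of shape (n, p) and
    y of shape (n, q), following the standard definition."""
    def centered(z: np.ndarray) -> np.ndarray:
        # Pairwise Euclidean distances, then double centering.
        d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
        return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

    a, b = centered(x), centered(y)
    dcov2 = (a * b).mean()      # squared distance covariance (inner product form)
    dvar2_x = (a * a).mean()    # squared distance variance of x
    dvar2_y = (b * b).mean()
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom <= 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))
```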
Citations: 0
Quality-Aware Spatio-Temporal Transformer Network for RGBT Tracking
IF 13.7 Pub Date : 2025-11-27 DOI: 10.1109/TIP.2025.3635483
Zhaodong Ding;Chenglong Li;Tao Wang;Futian Wang
Transformer-based RGBT tracking has attracted much attention due to the strong modeling capacity of self attention and cross attention mechanisms. These attention mechanisms utilize the correlations among tokens to construct powerful feature representations, but are easily affected by low-quality tokens. To address this issue, we propose a novel Quality-aware Spatio-temporal Transformer Network (QSTNet), which calculates the quality weights of tokens in search regions based on the correlation with multimodal template tokens to suppress the negative effects of low-quality tokens in spatio-temporal feature representations, for robust RGBT tracking. In particular, we argue that the correlation between search tokens of one modality and multimodal template tokens could reflect the quality of these search tokens, and thus design the Quality-aware Token Weighting Module (QTWM) based on the correlation matrix of search and template tokens to suppress the negative effects of low-quality tokens. Specifically, we calculate the difference matrix derived from the attention matrices of the search tokens from both modalities and the multimodal template tokens, and then assign the quality weight for each search token based on the difference matrix, which reflects the relative correlation of search tokens from different modalities to multimodal template tokens. In addition, we propose the Prompt-based Spatio-temporal Encoder Module (PSEM) to utilize spatio-temporal multimodal information while alleviating the impact of low-quality spatio-temporal features. Extensive experiments on four RGBT benchmark datasets demonstrate that the proposed QSTNet exhibits superior performance compared to other state-of-the-art tracking methods. Our code and supplementary video are now available: https://zhaodongah.github.io/QSTNet
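The QTWM formulation itself is not spelled out in the abstract, so the following is only an illustrative sketch of the stated idea: search tokens of each modality attend to the multimodal template, the two attention maps are differenced, and the difference is turned into per-token quality weights. All function and variable names here are placeholders, not the authors' code:

```python
import torch
import torch.nn.functional as F

def quality_weights(search_rgb: torch.Tensor, search_tir: torch.Tensor,
                    template: torch.Tensor):
    """Illustrative quality weighting for RGBT search tokens.

    search_rgb, search_tir: (N, C) search tokens of the two modalities.
    template: (M, C) multimodal template tokens.
    """
    scale = search_rgb.shape[-1] ** 0.5
    attn_rgb = F.softmax(search_rgb @ template.t() / scale, dim=-1)  # (N, M)
    attn_tir = F.softmax(search_tir @ template.t() / scale, dim=-1)  # (N, M)
    diff = attn_rgb - attn_tir            # difference matrix between modalities
    # Tokens whose attention to the template is comparatively stronger in one
    # modality receive a higher weight for that modality (illustrative choice).
    w_rgb = torch.sigmoid(diff.sum(dim=-1))
    w_tir = torch.sigmoid(-diff.sum(dim=-1))
    return w_rgb, w_tir
```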
Citations: 0
A Hyperspectral Change Detection Method for Small Vehicles
IF 13.7 Pub Date : 2025-11-26 DOI: 10.1109/TIP.2025.3635479
Shuyi Xu;He Sun;Xu Sun;Li Ni;Lianru Gao
Small vehicle (SV) detection is crucial for urban security and traffic management. However, detecting such targets from a single image presents significant challenges due to the difficulty in discerning their dynamic movements. In this paper, we propose a deep joint image-level and feature-level processing network, IFNet, designed for detecting changes in SV using bi-temporal hyperspectral images. At the image level, a new Gumbel Softmax trick (GS)-based band selection strategy is introduced to address the problem of inconsistent spectral resolutions of bi-temporal images. At the feature level, to tackle the challenge of capturing edge and shape details of SV, we propose a feature-based edge enhancement module that extracts target edges using high-level difference features; the refined change map is then generated under the guidance of the edge map. Moreover, current deep learning-based hyperspectral change detection (HCD) methods are limited by HCD datasets. Therefore, we propose a benchmark dataset, the Hyperspectral Vehicle Change Detection (HVCD) dataset, which consists of 201 pairs of aerial hyperspectral images, each with a size of $256\times 256$, and exhibits inconsistent spectral resolutions across the bi-temporal data. Extensive experiments conducted on the HVCD dataset demonstrate that our IFNet obtains state-of-the-art performance with an acceptable computational cost.
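The Gumbel-Softmax trick mentioned for band selection is a standard reparameterization that keeps a discrete choice differentiable. A minimal, illustrative band-selection module along those lines (not IFNet's released implementation) might look like this:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GumbelBandSelector(nn.Module):
    """Illustrative Gumbel-Softmax band selection for a hyperspectral cube.

    Learns logits over the input spectral bands and draws approximately
    one-hot selections with the straight-through Gumbel-Softmax trick, so the
    choice of k bands remains differentiable during training.
    """

    def __init__(self, num_bands: int, k: int, tau: float = 1.0):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(k, num_bands))
        self.tau = tau

    def forward(self, hsi: torch.Tensor) -> torch.Tensor:
        # hsi: (B, num_bands, H, W) -> selected bands: (B, k, H, W)
        sel = F.gumbel_softmax(self.logits, tau=self.tau, hard=True)  # (k, num_bands)
        return torch.einsum("kc,bchw->bkhw", sel, hsi)
```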
Citations: 0
X-Fake: Juggling Utility Evaluation and Explanation of Simulated SAR Images
IF 13.7 Pub Date : 2025-11-25 DOI: 10.1109/TIP.2025.3634988
Zhongling Huang;Yihan Zhuang;Zipei Zhong;Feng Xu;Gong Cheng;Junwei Han
Synthetic aperture radar (SAR) image simulation has attracted much attention due to its great potential to supplement the scarce training data for deep learning algorithms. Consequently, evaluating the quality of the simulated SAR image is crucial for practical applications. The current literature primarily uses image quality assessment (IQA) techniques for evaluation that rely on human observers’ perceptions. However, because of the unique imaging mechanism of SAR, these techniques may produce evaluation results that are not entirely valid. The distribution inconsistency between real and simulated data is the main obstacle that influences the utility of simulated SAR images. To this end, we propose a novel trustworthy utility evaluation framework with a counterfactual explanation for simulated SAR images for the first time, denoted as X-Fake. It unifies a probabilistic evaluator and a causal explainer to achieve a trustworthy utility assessment. We construct the evaluator using a probabilistic Bayesian deep model to learn the posterior distribution, conditioned on real data. Quantitatively, the predicted uncertainty of simulated data can reflect the distribution discrepancy. We build the causal explainer with an introspective variational auto-encoder (IntroVAE) to generate high-resolution counterfactuals. The latent code of IntroVAE is finally optimized with evaluation indicators and prior information to generate the counterfactual explanation, thus revealing the inauthentic details of simulated data explicitly. The proposed framework is validated on four simulated SAR image datasets obtained from electromagnetic models and generative artificial intelligence approaches. The results demonstrate the proposed X-Fake framework outperforms other IQA methods in terms of utility. Furthermore, the results illustrate that the generated counterfactual explanations are trustworthy, and can further improve the data utility in applications.
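The counterfactual step can be pictured as optimizing a latent code against a frozen evaluator. The toy sketch below only illustrates that loop; `decoder` and `evaluator` are stand-ins for the paper's IntroVAE decoder and Bayesian evaluator rather than their actual interfaces:

```python
import torch

def counterfactual_latent(z0: torch.Tensor, decoder, evaluator,
                          steps: int = 200, lr: float = 0.05, reg: float = 1.0):
    """Toy counterfactual optimization: nudge a latent code so the decoded
    image scores as more 'real' under a frozen evaluator while staying close
    to the original simulated image."""
    z = z0.clone().requires_grad_(True)
    x0 = decoder(z0).detach()
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        x = decoder(z)
        # evaluator(x) is assumed to return a per-sample "fakeness" score.
        loss = evaluator(x).mean() + reg * (x - x0).abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return decoder(z).detach()
```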
Citations: 0
Text-Guided Semantic Alignment Network With Spatial-Frequency Interaction for Infrared-Visible Image Fusion Under Extreme Illumination
IF 13.7 Pub Date : 2025-11-25 DOI: 10.1109/TIP.2025.3635048
Guanghui Yue;Wentao Li;Cheng Zhao;Zhiliang Wu;Tianwei Zhou;Qiuping Jiang;Runmin Cong
Although text-guided infrared-visible image fusion helps improve content understanding under extreme illumination, existing methods usually ignore semantic differences between textual and visual features, resulting in limited improvement. To address this challenge, we propose a Text-Guided Semantic Alignment Network, termed TSANet, for extreme-illumination infrared-visible image fusion. The network follows an encoder-decoder structure, with two image encoders, two text encoders, and one decoder. It uses a Semantic Alignment and Fusion (SAF) block to bridge the two image encoders in each layer. Specifically, the SAF block consists of two parallel Semantic Alignment (SA) modules, corresponding to the infrared and visible modalities, respectively, and a Spatial-Frequency Interaction (SFI) module. The SA module aligns the visual feature from the image encoder with its corresponding textual feature from the text encoder, to guide the network to focus on key semantic regions of infrared and visible images. The SFI module aggregates the spatial and frequency information extracted from the modality-aligned features of the two SA modules for complementary representation learning. The network progressively complements the two image modalities by connecting the SAF blocks from top to bottom, and finally provides a visually pleasing fusion effect by feeding the output of the last block into the decoder. Recognizing that existing datasets lack illumination diversity, we contribute a new dataset specifically designed for extreme-illumination image fusion. Extensive experiments show the effectiveness and superiority of TSANet over seven state-of-the-art methods. The source code and dataset are available at https://github.com/WentaoLi-CV/TSANet
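As a rough illustration of text-to-image semantic alignment (not the published SA module), visual tokens can attend to textual tokens and use the result to re-weight text-relevant regions. The module below is a hypothetical sketch with made-up layer choices:

```python
import torch
import torch.nn as nn

class SemanticAlignment(nn.Module):
    """Hedged sketch of a text-to-image alignment step: visual tokens attend
    to textual tokens, and the aligned result gates the visual features
    toward text-relevant regions."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, visual: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        # visual: (B, N, C) image tokens; text: (B, T, C) textual tokens.
        aligned, _ = self.attn(query=visual, key=text, value=text)
        return visual * self.gate(aligned) + aligned
```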
Citations: 0
Linear Complexity Multi-View Unsupervised Feature Selection via Anchor-Based Feature Relationship Construction
IF 13.7 Pub Date : 2025-11-25 DOI: 10.1109/TIP.2025.3635015
Qi Liu;Suyuan Liu;Jianhua Dai;Xueling Zhu;Xinwang Liu
In recent years, multi-view unsupervised feature selection has gained significant interest for its ability to efficiently handle multi-view datasets while offering better interpretability. Existing multi-view unsupervised feature selection methods construct graphs based on the relationship between samples. In fact, in feature selection, it is more important to focus on the relationships between features. However, constructing a complete graph to capture the relationship between features would incur a space and time complexity of $O(d^{2})$ or even higher. Therefore, we introduce an anchor-based strategy and build a feature bipartite graph to reduce complexity. In addition, since existing methods cannot directly extract feature importance from a feature bipartite graph, we design an effective and low-complexity method to directly obtain feature scores from a feature bipartite graph. Compared with the feature importance extraction method based on the complete graph, our proposed method reduces the time complexity from $O(d^{3})$ to $O(d)$. To the best of our knowledge, our proposed method is the first multi-view unsupervised feature selection algorithm that achieves $O(nd)$ space and time complexity without data segmentation. Specifically, this method adaptively learns feature-level anchor graph structures through self-expressive multi-view subspace learning, which can effectively capture the structural information between features and anchors. Meanwhile, the proposed method projects low-dimensional anchors to common dimensions and aligns them with consensus anchors to capture the consistency and complementary information between different views. The superiority of the proposed algorithm is demonstrated by comparing it with seven state-of-the-art algorithms on five public image and two biological information multi-view datasets. The code of the proposed method is publicly available at https://github.com/getupLiu/AFRC
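To make the anchor idea concrete, the toy sketch below links each of the d features to only m anchor features instead of building a d x d feature graph, so the cost stays linear in d for fixed m. The anchor choice and scoring rule here are illustrative placeholders, not the paper's self-expressive optimization:

```python
import numpy as np

def feature_scores_via_anchors(X: np.ndarray, m: int = 32, seed: int = 0) -> np.ndarray:
    """Toy anchor-based feature scoring for one view.

    X: (n, d) data matrix; returns a length-d score vector. Each feature is
    connected only to m << d anchor features, giving an (d, m) bipartite
    graph instead of a (d, d) complete feature graph.
    """
    rng = np.random.default_rng(seed)
    anchors = X[:, rng.choice(X.shape[1], size=m, replace=False)]       # (n, m)
    # Cosine affinity between every feature column and every anchor column.
    Xn = X / (np.linalg.norm(X, axis=0, keepdims=True) + 1e-12)
    An = anchors / (np.linalg.norm(anchors, axis=0, keepdims=True) + 1e-12)
    B = np.abs(Xn.T @ An)                                               # (d, m) bipartite graph
    return B.max(axis=1)                                                # simple per-feature score
```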
Citations: 0
Multi-Prior Fusion Transfer Plugin for Adapting In-Air Models to Underwater Image Enhancement and Detection
IF 13.7 Pub Date : 2025-11-24 DOI: 10.1109/TIP.2025.3634001
Jingchun Zhou;Dehuan Zhang;Zongxin He;Qilin Gai;Qiuping Jiang
Underwater data is inherently scarce and exhibits complex distributions, making it challenging to train high-performance models from scratch. In contrast, in-air models are structurally mature, resource-rich, and offer strong potential for transfer. However, significant discrepancies in visual characteristics and feature distributions between underwater and in-air environments often lead to severe performance degradation when applying in-air models directly. To address this issue, we propose IA2U, a lightweight plugin designed for efficient underwater adaptation without modifying the original model architecture. IA2U can be flexibly integrated into arbitrary in-air networks, offering high generalizability and low deployment costs. Specifically, IA2U incorporates three types of prior knowledge—water type, degradation pattern, and sample semantics—which are embedded into intermediate layers through feature injection and channel-wise modulation to guide the network’s response to underwater-specific features. Furthermore, a multi-scale feature alignment module is introduced to dynamically balance information across different resolution paths, enhancing consistency and contextual representation. Extensive experiments demonstrate that IA2U significantly improves both image enhancement and object detection performance. Specifically, on the UIEB dataset, IA2U boosts Shallow-UWNet by 5.2 dB in PSNR and reduces LPIPS by 52%; on the RUOD dataset, it increases AP by 1.8% when applied to the PAA detector. IA2U provides an effective and scalable solution for building robust underwater perception systems with minimal adaptation costs. Our code is available at https://github.com/zhoujingchun03/IA2U
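The "feature injection and channel-wise modulation" described above is in the spirit of FiLM-style conditioning. A minimal sketch of such a plugin (hypothetical, not the released IA2U module) is:

```python
import torch
import torch.nn as nn

class PriorModulation(nn.Module):
    """FiLM-style sketch of prior-conditioned channel modulation: an embedding
    of the prior (e.g. water type, degradation pattern, or semantics) predicts
    a per-channel scale and shift applied to an intermediate in-air feature map."""

    def __init__(self, prior_dim: int, channels: int):
        super().__init__()
        self.to_scale_shift = nn.Linear(prior_dim, 2 * channels)

    def forward(self, feat: torch.Tensor, prior: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) frozen in-air features; prior: (B, prior_dim).
        scale, shift = self.to_scale_shift(prior).chunk(2, dim=-1)
        return feat * (1 + scale[..., None, None]) + shift[..., None, None]
```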
Citations: 0
SUIT: Spatial-Spectral Union-Intersection Interaction Network for Hyperspectral Object Tracking
IF 13.7 Pub Date : 2025-11-24 DOI: 10.1109/TIP.2025.3633177
Fengchao Xiong;Zhenxing Wu;Jun Zhou;Sen Jia;Yuntao Qian
Hyperspectral videos (HSVs), with their inherent spatial-spectral-temporal structure, offer distinct advantages in challenging tracking scenarios such as cluttered backgrounds and small objects. However, existing methods primarily focus on spatial interactions between the template and search regions, often overlooking spectral interactions, leading to suboptimal performance. To address this issue, this paper investigates spectral interactions from both the architectural and training perspectives. At the architectural level, we first establish band-wise long-range spatial relationships between the template and search regions using Transformers. We then model spectral interactions using the inclusion-exclusion principle from set theory, treating them as the union of spatial interactions across all bands. This enables the effective integration of both shared and band-specific spatial cues. At the training level, we introduce a spectral loss to enforce material distribution alignment between the template and predicted regions, enhancing robustness to shape deformation and appearance variations. Extensive experiments demonstrate that our tracker achieves state-of-the-art tracking performance. The source code, trained models and results will be publicly available via https://github.com/bearshng/suit to support reproducibility
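One simple way to picture a "material distribution alignment" spectral loss is to average each region to a per-band distribution and penalise the divergence between template and prediction. The loss below is only a hedged sketch of that idea, not necessarily the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def spectral_alignment_loss(pred_region: torch.Tensor, template: torch.Tensor) -> torch.Tensor:
    """Sketch of a spectral alignment loss for hyperspectral tracking.

    pred_region, template: (B, bands, H, W) hyperspectral crops. Each crop is
    pooled over space to a per-band distribution, and the loss is
    KL(template distribution || predicted distribution).
    """
    p = F.softmax(pred_region.mean(dim=(-2, -1)), dim=-1)   # (B, bands)
    q = F.softmax(template.mean(dim=(-2, -1)), dim=-1)
    return F.kl_div(p.log(), q, reduction="batchmean")
```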
Citations: 0
Multidimensional Imaging Data Completion via Weighted Three-Directional Minimax Concave Penalty Regularization
IF 13.7 Pub Date : 2025-11-24 DOI: 10.1109/TIP.2025.3633566
Haifei Zeng;Wen Li;Xiaofei Peng;Mingqing Xiao
In this paper, we present a novel non-convex tensor completion model specifically tailored for multidimensional data. Our approach introduces a three-directional non-convex tensor rank surrogate regularized by the Minimax Concave Penalty (MCP) function. Crucially, the method processes data by simultaneously exploiting low-rank structures across its three modal directions, with the MCP function effectively mitigating the over-penalization of large singular values—a common drawback in convex nuclear norm minimization. To address the inherent challenges of this non-convex optimization, we develop an innovative approximate convex model that accurately captures the original formulation’s essence. We then develop a robust convex Alternating Direction Method of Multipliers (ADMM)-based algorithm, supported by a rigorous convergence guarantee, ensuring both theoretical soundness and practical reliability. Extensive experiments on a variety of real-world datasets demonstrate the superior performance and robustness of the proposed method compared to state-of-the-art approaches.
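For reference, the standard MCP function applied to a scalar $t$ (for example a singular value), with threshold $\lambda$ and concavity parameter $\gamma > 1$, is

```latex
\rho_{\lambda,\gamma}(t) =
\begin{cases}
  \lambda\lvert t\rvert - \dfrac{t^{2}}{2\gamma}, & \lvert t\rvert \le \gamma\lambda, \\[4pt]
  \dfrac{\gamma\lambda^{2}}{2}, & \lvert t\rvert > \gamma\lambda.
\end{cases}
```

Unlike the nuclear norm's linear penalty $\lambda\lvert t\rvert$, the MCP flattens out for $\lvert t\rvert > \gamma\lambda$, which is what mitigates the over-penalization of large singular values mentioned above; the paper's specific weighted three-directional formulation is not reproduced here.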
Citations: 0