Latest Publications in IEEE Transactions on Pattern Analysis and Machine Intelligence

Enhance Before Fusion: Multi-View Graph Clustering With Graph Trend Filter
IF 23.6 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-19 · DOI: 10.1109/tpami.2026.3655829
Penglei Wang, Jitao Lu, Danyang Wu, Rong Wang, Feiping Nie
{"title":"Enhance Before Fusion: Multi-View Graph Clustering With Graph Trend Filter","authors":"Penglei Wang, Jitao Lu, Danyang Wu, Rong Wang, Feiping Nie","doi":"10.1109/tpami.2026.3655829","DOIUrl":"https://doi.org/10.1109/tpami.2026.3655829","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"272 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146000903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Toward Accurate Image Generation via Dynamic Generative Image Transformer
IF 23.6 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-19 · DOI: 10.1109/tpami.2026.3653620
Zhendong Mao, Mengqi Huang, Yijing Lin, Quan Wang, Lei Zhang, Yongdong Zhang
{"title":"Toward Accurate Image Generation via Dynamic Generative Image Transformer","authors":"Zhendong Mao, Mengqi Huang, Yijing Lin, Quan Wang, Lei Zhang, Yongdong Zhang","doi":"10.1109/tpami.2026.3653620","DOIUrl":"https://doi.org/10.1109/tpami.2026.3653620","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"383 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146000907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A General Image Fusion Approach Exploiting Gradient Transfer Learning and Fusion Rule Unfolding
IF 23.6 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-19 · DOI: 10.1109/tpami.2026.3655694
Wu Wang, Liang-Jian Deng, Qi Cao, Gemine Vivone
{"title":"A General Image Fusion Approach Exploiting Gradient Transfer Learning and Fusion Rule Unfolding","authors":"Wu Wang, Liang-Jian Deng, Qi Cao, Gemine Vivone","doi":"10.1109/tpami.2026.3655694","DOIUrl":"https://doi.org/10.1109/tpami.2026.3655694","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"5 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146000910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Scalable Semi-supervised Learning with Discriminative Label Propagation and Correction
IF 23.6 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-19 · DOI: 10.1109/tpami.2026.3655456
Bingbing Jiang, Jie Wen, Zidong Wang, Weiguo Sheng, Zhiwen Yu, Huanhuan Chen, Weiping Ding
{"title":"Scalable Semi-supervised Learning with Discriminative Label Propagation and Correction","authors":"Bingbing Jiang, Jie Wen, Zidong Wang, Weiguo Sheng, Zhiwen Yu, Huanhuan Chen, Weiping Ding","doi":"10.1109/tpami.2026.3655456","DOIUrl":"https://doi.org/10.1109/tpami.2026.3655456","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"38 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146000905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Defying Distractions in Multimodal Tasks: A Novel Benchmark for Large Vision-Language Models
IF 23.6 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-19 · DOI: 10.1109/tpami.2026.3655641
Jinhui Yang, Ming Jiang, Qi Zhao
{"title":"Defying Distractions in Multimodal Tasks: A Novel Benchmark for Large Vision-Language Models","authors":"Jinhui Yang, Ming Jiang, Qi Zhao","doi":"10.1109/tpami.2026.3655641","DOIUrl":"https://doi.org/10.1109/tpami.2026.3655641","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"91 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146000906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Like Human Rethinking: Contour Transformer AutoRegression for Referring Remote Sensing Interpretation
IF 23.6 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-16 · DOI: 10.1109/tpami.2026.3654392
Jinming Chai, Licheng Jiao, Xiaoqiang Lu, Lingling Li, Fang Liu, Long Sun, Xu Liu, Wenping Ma, Weibin Li
Referring remote sensing interpretation holds significant application value in scenarios such as ecological protection, resource exploration, and emergency management. However, referring remote sensing expression comprehension and segmentation (RRSECS) faces critical challenges, including a micro-target localization drift problem caused by insufficient extraction of boundary features in existing paradigms. Moreover, when transferred to the remote sensing domain, polygon-based methods suffer from contour-boundary misalignment and multi-task co-optimization conflicts. In this paper, we propose SeeFormer, a novel contour autoregressive paradigm specifically designed for RRSECS that accurately locates and segments micro-scale, irregular targets in remote sensing imagery. We first introduce a brain-inspired feature refocus learning (BIFRL) module that progressively attends to effective object features via a coarse-to-fine scheme, significantly boosting small-object localization and segmentation. Next, we present a language-contour enhancer (LCE) that injects shape-aware contour priors, and a corner-based contour sampler (CBCS) that improves mask-polygon reconstruction fidelity. Finally, we develop an autoregressive dual-decoder paradigm (ARDDP) that preserves sequence consistency while alleviating multi-task optimization conflicts. Extensive experiments on the RefDIOR, RRSIS-D, and OPT-RSVG datasets under varying scenarios, scales, and task paradigms demonstrate substantial performance gains: compared to the baseline PolyFormer, SeeFormer improves oIoU and mIoU by 27.58% and 39.37% for referring image segmentation and by 18.94% and 28.90% for visual grounding on the RefDIOR dataset. The code will be publicly accessible at https://github.com/IPIU-XDU/RSFM.
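The autoregressive contour idea lends itself to a compact illustration. Below is a minimal sketch of generic autoregressive polygon-vertex decoding, not the authors' SeeFormer architecture: a GRU cell stands in for the transformer decoder, and all module names, sizes, and the stop criterion are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ContourARDecoder(nn.Module):
    """Toy autoregressive polygon decoder: emits 2-D vertices one at a time,
    each step conditioned on a fused image/text context vector and the
    vertices already generated. Illustrative only."""

    def __init__(self, d_model=256, max_vertices=64):
        super().__init__()
        self.vertex_embed = nn.Linear(2, d_model)   # embed an (x, y) in [0, 1]^2
        self.cell = nn.GRUCell(d_model, d_model)    # stand-in for a transformer decoder
        self.to_vertex = nn.Linear(d_model, 2)      # next-vertex regression head
        self.to_stop = nn.Linear(d_model, 1)        # end-of-contour logit
        self.max_vertices = max_vertices

    @torch.no_grad()
    def generate(self, context):                    # context: (B, d_model)
        h = context.clone()                         # init state from fused features
        v = torch.full((context.size(0), 2), 0.5)   # start token: image center
        vertices = []
        for _ in range(self.max_vertices):
            h = self.cell(self.vertex_embed(v), h)
            v = torch.sigmoid(self.to_vertex(h))    # keep vertices inside [0, 1]^2
            vertices.append(v)
            if (torch.sigmoid(self.to_stop(h)) > 0.5).all():
                break                               # every sample emitted "stop"
        return torch.stack(vertices, dim=1)         # (B, T, 2) polygon

decoder = ContourARDecoder()
print(decoder.generate(torch.randn(4, 256)).shape)  # e.g. torch.Size([4, T, 2])
```

A dual-decoder variant as described in the abstract would run two such heads (e.g., grounding and segmentation) over a shared sequence state; that coupling is omitted here.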
Citations: 0
Non-Gradient Hash Factor Learning for High-Dimensional and Incomplete Data Representation Learning
IF 23.6 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-16 · DOI: 10.1109/tpami.2026.3653780
Di Wu, Shihui Li, Yi He, Xin Luo, Xinbo Gao
High-dimensional and incomplete (HDI) data are ubiquitous in Big Data-related industrial applications such as drug innovation and recommender systems. Hash learning is the most efficient representation learning approach for extracting hidden information from HDI data, owing to its fast reasoning and low storage cost. However, existing hash learning approaches commonly employ gradient-based optimization to handle the discrete objective caused by the binary nature of hash factors, where quantization loss (from quantizing real values to binary codes) is inevitable, degrading accuracy when representing HDI data. Motivated by these issues, this paper proposes a non-gradient hash factor (NGHF) model built on three ideas: a) a discrete differential evolution (DDE) algorithm that simulates continuous optimization by disabling bits of binary codes according to the projected Hamming dissimilarity, yielding an effective discrete optimizer; b) applying the proposed DDE algorithm to directly optimize the discrete learning objective of NGHF defined on HDI data, thereby enabling efficient and precise training without any quantization loss; and c) a theoretical proof of NGHF's convergence. As such, NGHF possesses representation learning ability comparable to that of a real-valued model, allowing it to achieve precise binary representation of HDI data. Extensive experiments on nine real-world datasets demonstrate that NGHF significantly outperforms eight state-of-the-art hash learning models. Moreover, its accuracy is comparable to that of a real-valued model for HDI data representation learning. These results point toward hash learning models that combine high accuracy with fast reasoning on HDI data, which is critical for industrial applications. Our source code is available at https://github.com/wudi1989/NGHF.
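The bit-disabling mutation described above can be made concrete. The following is a toy discrete differential-evolution step on binary hash codes; the fitness function is a placeholder (Hamming distance to a fixed target code) rather than the paper's projected-Hamming objective on HDI data, and the trial-construction details are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def dde_step(population, fitness, flip_rate=0.1):
    """One toy discrete differential-evolution step over binary hash codes.

    population : (P, L) array with entries in {0, 1}
    fitness    : callable mapping an (L,) code to a scalar (lower is better)

    Mutation zeroes ("disables") a random subset of the bits where two donor
    codes disagree, loosely mimicking the bit-disabling idea in the abstract;
    NGHF's actual trial construction and fitness differ.
    """
    P, L = population.shape
    new_pop = population.copy()
    for i in range(P):
        a, b = rng.choice(P, size=2, replace=False)   # two random donors
        disagree = population[a] != population[b]
        trial = population[i].copy()
        mask = disagree & (rng.random(L) < flip_rate)
        trial[mask] = 0                               # disable conflicting bits
        if fitness(trial) < fitness(population[i]):   # greedy selection
            new_pop[i] = trial
    return new_pop

# Usage with the placeholder fitness: Hamming distance to a fixed target code.
target = rng.integers(0, 2, size=32)
pop = rng.integers(0, 2, size=(16, 32))
for _ in range(100):
    pop = dde_step(pop, fitness=lambda c: int(np.sum(c != target)))
print(int(np.sum(pop[0] != target)))                  # distance after evolution
```

Because selection is greedy and operates directly on binary codes, no real-valued relaxation or rounding step appears anywhere, which is the point of a quantization-free optimizer.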
Citations: 0
Isolating Interference Factors for Robust Cloth-Changing Person Re-Identification
IF 23.6 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-16 · DOI: 10.1109/tpami.2026.3655110
De Cheng, Yubo Li, Chaowei Fang, Shizhou Zhang, Nannan Wang, Xinbo Gao
Cloth-Changing Person Re-Identification (CC-ReID) aims to recognize individuals across camera views despite clothing variations, a crucial task for surveillance and security systems. Existing methods typically frame it as a cross-modal alignment problem but often overlook explicit modeling of interference factors such as clothing, viewpoints, and pedestrian actions. This oversight can distort their impact and compromise the extraction of robust identity features. To address these challenges, we propose a novel framework that systematically disentangles interference factors from identity features while preserving the robustness and discriminative power of identity representations. Our approach consists of two key components. First, a dual-stream identity feature learning framework leverages a raw image stream and a cloth-isolated stream to extract identity representations independent of clothing textures. An adaptive cloth-irrelevant contrastive objective is introduced to mitigate identity feature variations caused by clothing differences. Second, we propose a Text-Driven Conditional Generative Adversarial Interference Disentanglement Network (T-CGAIDN) to further suppress interference factors beyond clothing textures, such as finer clothing patterns, viewpoint, background, and lighting conditions. This network incorporates a multi-granularity interference recognition branch to learn interference-related features, a conditional adversarial module for bidirectional transformation between identity and interference feature spaces, and an interference decoupling objective to eliminate interference dependencies in identity learning. Extensive experiments on public benchmarks demonstrate that our method significantly outperforms state-of-the-art approaches, highlighting its effectiveness on CC-ReID. Our code is available at https://github.com/yblTech/IIFR-CCReID.
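For intuition, a cloth-irrelevant contrastive objective can be written as a supervised contrastive loss in which samples sharing a person ID, possibly under different clothing, are positives for each other. This generic form is a stand-in for the paper's adaptive variant; the function and variable names are ours.

```python
import torch
import torch.nn.functional as F

def cloth_irrelevant_contrastive(features, identity_labels, temperature=0.1):
    """Generic supervised contrastive loss over L2-normalized features.

    features        : (N, D) embeddings, e.g. from both streams concatenated
    identity_labels : (N,) person IDs; samples sharing an ID (possibly under
                      different clothing) act as positives for each other
    """
    z = F.normalize(features, dim=1)
    sim = z @ z.t() / temperature                      # (N, N) cosine logits
    n = sim.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=sim.device)
    pos = (identity_labels.unsqueeze(0) == identity_labels.unsqueeze(1)) & ~eye
    logits = sim.masked_fill(eye, float('-inf'))       # exclude self-pairs
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_count = pos.sum(dim=1).clamp(min=1)
    per_anchor = -log_prob.masked_fill(~pos, 0.0).sum(dim=1) / pos_count
    return per_anchor[pos.any(dim=1)].mean()           # skip anchors without positives

feats = torch.randn(8, 128)
ids = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])           # same ID = same person, new outfit
print(cloth_irrelevant_contrastive(feats, ids).item())
```

Feeding pairs drawn from the raw stream and the cloth-isolated stream through this loss pulls a person's embeddings together regardless of clothing texture while pushing other identities apart.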
Citations: 0
Single-Photon Imaging in Complex Scenarios via Physics-Informed Deep Neural Networks
IF 23.6 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-15 · DOI: 10.1109/tpami.2026.3654264
Siao Cai, Zhicheng Yu, Shaobing Gao, Zeyu Chen, Yiguang Liu
Single-photon imaging uses single-photon-sensitive, picosecond-resolution sensors to capture 3D structure and supports diverse applications, but success has mostly been limited to simple scenes. In complex scenarios, traditional methods degrade and deep learning methods lack flexibility and generalization. Here, we propose a physics-informed deep neural network (PIDNN) framework that addresses both shortcomings, adapting to complex and variable sensing environments by embedding the imaging physics into a deep neural network trained without supervision. Within this framework, by tailoring the number of U-Net skip connections, we impose multi-scale spatiotemporal priors that improve photon-utilization efficiency, laying the foundation for handling the inherently low signal-to-background ratio (SBR) of the subsequent complex scenarios. Additionally, we introduce volume rendering into the PIDNN framework and design a dual-branch structure, further extending its applicability to multiple-depth returns and fog occlusion. We validated the method in various complex environments through numerical simulations and real-world experiments. Photon-efficient imaging with multiple returns shows robust performance under low SBR and large fields of view. The method attains lower root-mean-squared error than traditional methods and exhibits stronger generalization than supervised approaches. Further multiple-depth and fog-interference experiments confirm that its reconstruction quality surpasses existing techniques, demonstrating its flexibility and scalability.
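Embedding the imaging physics into an unsupervised loss can be sketched as follows: a simplified forward model renders the photon histogram expected from the predicted depth and reflectivity, then scores the observed counts under a Poisson likelihood. This is an illustrative stand-in that assumes a Gaussian system response; it is not the paper's PIDNN loss, and all names and defaults are ours.

```python
import torch

def physics_consistency_loss(depth, albedo, histogram,
                             bin_size=1.0, sigma=2.0, background=0.01):
    """Unsupervised physics-informed loss for single-photon histograms.

    depth, albedo : (B, H, W) network outputs (depth in units of time bins)
    histogram     : (B, T, H, W) observed photon counts per time bin

    Forward model (simplified): center a Gaussian pulse -- an assumed stand-in
    for the true system temporal response -- at the predicted time of flight,
    scale it by the reflectivity, add a constant background rate, and score
    the observed counts under a Poisson negative log-likelihood.
    """
    B, T, H, W = histogram.shape
    t = torch.arange(T, dtype=depth.dtype, device=depth.device).view(1, T, 1, 1)
    tof = (depth / bin_size).unsqueeze(1)              # arrival bin, (B, 1, H, W)
    pulse = torch.exp(-0.5 * ((t - tof) / sigma) ** 2) # differentiable shifted pulse
    pulse = pulse / pulse.sum(dim=1, keepdim=True).clamp(min=1e-8)
    rate = albedo.unsqueeze(1) * pulse + background    # expected counts per bin
    # Poisson NLL with the constant log-factorial term dropped
    return (rate - histogram * torch.log(rate)).mean()

# Usage on synthetic data: the background keeps rates positive, so log is safe.
B, T, H, W = 2, 64, 8, 8
depth, albedo = torch.rand(B, H, W) * T, torch.rand(B, H, W)
hist = torch.poisson(torch.full((B, T, H, W), 0.05))
print(physics_consistency_loss(depth, albedo, hist).item())
```

Because the loss compares rendered and observed histograms directly, the network can be trained on raw captures without ground-truth depth, which is what makes such a framework adaptable to new sensing conditions.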
Citations: 0
Improving Subgraph Extraction for Graph Invariant Learning via Graph Sinkhorn Attention
IF 23.6 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-15 · DOI: 10.1109/tpami.2026.3654544
Junchi Yan, Fangyu Ding, Jiawei Sun, Zhaoping Hu, Yunyi Zhou, Lei Zhu
{"title":"Improving Subgraph Extraction for Graph Invariant Learning via Graph Sinkhorn Attention","authors":"Junchi Yan, Fangyu Ding, Jiawei Sun, Zhaoping Hu, Yunyi Zhou, Lei Zhu","doi":"10.1109/tpami.2026.3654544","DOIUrl":"https://doi.org/10.1109/tpami.2026.3654544","url":null,"abstract":"","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"68 1","pages":""},"PeriodicalIF":23.6,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145972436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0