
Latest Publications: IEEE Transactions on Pattern Analysis and Machine Intelligence

Efficient Training of Large Vision Models via Advanced Automated Progressive Learning
IF 23.6 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-12 | DOI: 10.1109/tpami.2026.3673336
Changlin Li, Jiawei Zhang, Sihao Lin, Zongxin Yang, Junwei Liang, Xiaodan Liang, Xiaojun Chang
Citations: 0
Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-Coupled Analysis
IF 23.6 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-12 | DOI: 10.1109/tpami.2026.3673238
Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo
Citations: 0
A Complete Solution to Generalized Relative Pose Estimation from Affine Correspondences
IF 23.6 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-12 | DOI: 10.1109/tpami.2026.3673525
Banglei Guan, Ji Zhao, Laurent Kneip
In recent years, affine correspondences (ACs) have emerged as a widely adopted alternative to point correspondences (PCs) in geometric problems in computer vision. An AC is composed of a PC across two different views plus an affine transformation between the small patches around this PC. Prior studies have shown that a single AC generally yields three independent constraints for estimating relative pose. This work addresses relative pose estimation in multi-perspective camera systems, a relevant problem given their prevalence in modern technologies such as autonomous vehicles and augmented reality. More specifically, we introduce the first comprehensive suite of minimal solvers for 6DoF relative pose estimation across multiple cameras using only two ACs, which is notably valuable for robust model fitting scenarios. We analyze all possible configurations of two ACs in two views, and present minimal solvers covering all identified minimal cases. We make use of the hidden variable technique to eliminate the translation parameters, and represent rotation using either Cayley parameters or quaternions. We furthermore introduce novel constraints on the generalized relative pose problem that are beneficial in deriving more compact solvers with fewer solutions. Comprehensive experiments on synthetic and real-world data show that the proposed affine correspondence-based solvers are highly effective and computationally efficient.
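For intuition, the epipolar part of the constraints each correspondence contributes can be sketched with a toy example (this is an illustrative sketch, not the paper's solver: the camera motion and point are hypothetical, and only the classical point constraint x2ᵀ E x1 = 0 is shown; the two additional constraints from the affine part of an AC are derived in the paper):

```python
def cross_matrix(t):
    """Skew-symmetric matrix [t]_x so that cross_matrix(t) @ v = t x v."""
    return [[0, -t[2], t[1]],
            [t[2], 0, -t[0]],
            [-t[1], t[0], 0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

# Toy motion: translation along x, no rotation (identity R).
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = [1.0, 0.0, 0.0]
E = matmul(cross_matrix(t), R)   # essential matrix E = [t]_x R

# A 3D point projected into both views (normalized homogeneous coordinates).
X = [0.5, 0.2, 2.0]
x1 = [X[0] / X[2], X[1] / X[2], 1.0]            # view 1
Xc2 = [X[0] - t[0], X[1] - t[1], X[2] - t[2]]   # camera 2 sits at +t
x2 = [Xc2[0] / Xc2[2], Xc2[1] / Xc2[2], 1.0]    # view 2

residual = sum(x2[i] * matvec(E, x1)[i] for i in range(3))
assert abs(residual) < 1e-9   # epipolar constraint x2^T E x1 = 0 holds
```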
Citations: 0
Bridging Generative and Discriminative Noisy-Label Learning via Direction-Agnostic EM Formulation
IF 23.6 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-11 | DOI: 10.1109/tpami.2026.3673244
Fengbei Liu, Chong Wang, Yuanhong Chen, Yuyuan Liu, Gustavo Carneiro
Although noisy-label learning is often approached with discriminative methods for simplicity and speed, generative modeling offers a principled alternative by capturing the joint mechanism that produces features, clean labels, and corrupted observations. However, prior work typically (i) introduces extra latent variables and heavy image generators that bias training toward reconstruction, (ii) fixes a single data-generating direction (Y → X or X → Y), limiting adaptability, and (iii) assumes a uniform prior over clean labels, ignoring instance-level uncertainty. Here, we propose a single-stage, EM-style framework for generative noisy-label learning that is direction-agnostic and avoids explicit image synthesis. First, we derive a single Expectation Maximization (EM) objective whose E-step specializes to either causal orientation without changing the overall optimization objective. Second, we replace the intractable p(X | Y) with a dataset-normalized discriminative proxy computed using a discriminative classifier on the finite training set, retaining the structural benefits of generative modeling at much lower cost. Third, we introduce Partial-Label Supervision (PLS), an instance-specific prior over clean labels that balances coverage and uncertainty, improving data-dependent regularization. Across standard vision and natural language processing (NLP) noisy label benchmarks, our method achieves state-of-the-art accuracy, lower transition-matrix estimation error, and substantially less training computation than current generative and discriminative baselines. Code: https://github.com/lfb-1/GNL.
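The core E-step idea can be illustrated with a minimal sketch (an assumption-laden toy, not the paper's framework): given a classifier's estimate of p(y | x) and a label-noise transition matrix T[y][ỹ] = p(ỹ | y), Bayes' rule yields a posterior over the clean label given the observed noisy one.

```python
def clean_label_posterior(p_y_given_x, T, noisy_label):
    """Posterior p(y | x, noisy_label) ∝ p(y | x) * p(noisy_label | y)."""
    unnorm = [p_y_given_x[y] * T[y][noisy_label]
              for y in range(len(p_y_given_x))]
    Z = sum(unnorm)
    return [u / Z for u in unnorm]

# Two classes; the classifier favors class 0, noise flips 0 -> 1 w.p. 0.3.
p = [0.8, 0.2]
T = [[0.7, 0.3],
     [0.1, 0.9]]
post = clean_label_posterior(p, T, noisy_label=1)
# The observed noisy label 1 shifts mass toward class 1,
# but the classifier's confidence keeps class 0 in front.
```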
Citations: 0
Comparative Assessment of Accuracy in Video-based Monocular Human Pose Estimation Frameworks
IF 23.6 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-10 | DOI: 10.1109/tpami.2026.3672463
Fabian Kahl, Philipp Wegner, Maximilian Kapsecker, Leon Nissen, Jennifer Faber, Stephan M Jonas, Lara Marie Reimer
In human pose estimation, a comprehensive evaluation of state-of-the-art frameworks is necessary to advance both research and practical applications. This paper presents a thorough review of state-of-the-art 2D and 3D human pose estimation frameworks, analyzing 118 papers and four GitHub repositories, with a focus on frameworks released since 2019. The following frameworks were chosen based on predefined inclusion criteria: AlphaPose, Detectron2, MediaPipe, MeTRAbs, MHFormer, MMPose, MoveNet, OpenPifPaf, OpenPifPaf-vita, OpenPose, PoseFormerV2, rtmlib, StridedTransformer-Pose3D, ultralytics (YOLOv8), ViTPose, and YOLOv7. This paper evaluates these 16 frameworks on an existing, unpublished dataset consisting of exercise videos recorded with a monocular RGB camera and synchronized gold-standard motion capture data. The dataset includes videos of nine individuals performing eight exercises, recorded from two camera views with different planar angles. The analysis evaluates joint angle performance of the frameworks using weighted mean absolute error and weighted intraclass correlation coefficient as quantitative metrics. MeTRAbs emerged as the best overall framework, while AlphaPose, rtmlib, and YOLOv7 were the top 2D performers.
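A weighted mean absolute error over joint angles can be sketched as follows (a hedged illustration only; the angles and per-joint weights here are hypothetical, standing in for whatever weighting scheme the paper applies):

```python
def weighted_mae(estimates, references, weights):
    """Weighted mean absolute error over per-joint angle estimates."""
    num = sum(w * abs(e - r) for e, r, w in zip(estimates, references, weights))
    return num / sum(weights)

est = [92.0, 45.0, 30.0]   # predicted joint angles (degrees)
ref = [90.0, 47.0, 30.0]   # motion-capture ground truth (degrees)
w   = [100, 50, 50]        # illustrative weights, e.g. valid frames per joint
print(weighted_mae(est, ref, w))   # -> 1.5
```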
Citations: 0
Privacy Preserving Decentralized Learning with Positive-Incentive Noise
IF 23.6 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-10 | DOI: 10.1109/tpami.2026.3672569
Luqing Wang, Shaofu Yang, Yifan Wan, Wenying Xu, Min-Ling Zhang
Ensuring the privacy of local datasets has emerged as an important concern in decentralized learning. However, the inherent privacy-utility tradeoff remains a fundamental challenge for privacy preserving decentralized algorithms. To address this issue, we introduce the Positive-Incentive Noise Generator (PING), a novel mechanism designed to eliminate the negative impact of privacy noise on convergence while defending against powerful colluding inference attacks. PING leverages network topologies and lightweight encryption-decryption operations to generate correlated noise. Building upon PING, we propose PP-DPIN, a privacy preserving stochastic algorithm tailored for decentralized learning. By integrating differential privacy and differential information entropy, we provide a comprehensive privacy quantification for PP-DPIN, with at least half of the nodes achieving arbitrarily strong privacy guarantees. Furthermore, the convergence rate of PP-DPIN is established under stochastic convex and nonconvex settings, which characterizes the impact of privacy noise and demonstrates linear speedup relative to the network size. Experiments on computer vision tasks validate PP-DPIN's superior performance and robustness against attacks compared to state-of-the-art methods.
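One simple way topology-correlated noise can avoid hurting convergence is pairwise cancellation (this is an illustrative sketch of the general idea, not the paper's PING construction): each edge of the communication graph draws a noise value that one endpoint adds and the other subtracts, so every node's perturbation looks random while the noise vanishes from any global aggregate.

```python
import random

def correlated_noise(n_nodes, edges, scale=1.0, seed=0):
    """Per-node noise that cancels exactly when summed over all nodes."""
    rng = random.Random(seed)
    noise = [0.0] * n_nodes
    for (i, j) in edges:
        s = rng.gauss(0.0, scale)   # shared secret for edge (i, j)
        noise[i] += s               # node i perturbs with +s
        noise[j] -= s               # node j perturbs with -s
    return noise

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]   # a ring of 4 nodes
noise = correlated_noise(4, edges)
assert abs(sum(noise)) < 1e-12   # cancels in the network-wide sum
```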
Citations: 0
Locally Linear Continual Learning for Time Series based on VC-Theoretical Generalization Bounds
IF 23.6 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-10 | DOI: 10.1109/tpami.2026.3672726
Yan V G Ferreira, Igor B Lima, Pedro H G Mapa S, Felipe V Campos, Antonio P Braga
Most machine learning methods assume fixed probability distributions, limiting their applicability in nonstationary real-world scenarios. While continual learning methods address this issue, current approaches often rely on black-box models or require extensive user intervention for interpretability. We propose SyMPLER (Systems Modeling through Piecewise Linear Evolving Regression), an explainable model for time series forecasting in nonstationary environments based on dynamic piecewise-linear approximations. Unlike other locally linear models, SyMPLER uses generalization bounds from Statistical Learning Theory to automatically determine when to add new local models based on prediction errors, eliminating the need for explicit clustering of the data. Experiments show that SyMPLER can achieve comparable performance to both black-box and existing explainable models while maintaining a human-interpretable structure that reveals insights about the system's behavior. In this sense, our approach reconciles accuracy and interpretability, offering a transparent and adaptive solution for forecasting nonstationary time series.
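The error-triggered growth of local models can be caricatured in a few lines (a toy sketch under strong assumptions: a fixed error threshold stands in for the paper's VC-theoretic bound, and the local "models" are constant anchors rather than fitted regressions):

```python
class PiecewiseLinear:
    """Evolving set of local models; a new one is spawned when the current
    best local model's prediction error exceeds the threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.models = []          # list of (x0, y0, slope) local models

    def predict(self, x):
        if not self.models:
            return 0.0
        # use the local model whose anchor is nearest to x
        x0, y0, slope = min(self.models, key=lambda m: abs(m[0] - x))
        return y0 + slope * (x - x0)

    def update(self, x, y):
        if not self.models or abs(self.predict(x) - y) > self.threshold:
            self.models.append((x, y, 0.0))   # spawn a new local model

pl = PiecewiseLinear(threshold=0.5)
pl.update(0.0, 0.0)    # first sample -> first local model
pl.update(0.1, 0.1)    # small error -> reuse the existing model
pl.update(5.0, 10.0)   # regime shift -> second local model appears
```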
Citations: 0
StarIR: Convolutional Image Restoration With Spatial-Frequency Fusion
IF 23.6 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-10 | DOI: 10.1109/tpami.2026.3672465
Yuning Cui, Syed Waqas Zamir, Ming-Hsuan Yang, Alois Knoll, Fahad Shahbaz Khan, Salman Khan
Vision Transformer (ViT) has shown impressive performance in image restoration due to its ability to capture a large receptive field. However, its complexity grows quadratically with input resolution, limiting its applicability for high-resolution images. In contrast, Convolutional Neural Networks (CNNs) are computationally efficient but are constrained by their inherently local receptive fields, which limit their ability to capture long-range pixel relationships. To address these challenges, we propose StarIR, which possesses the efficiency of CNNs while also capturing a large receptive field, similar to Transformers. StarIR incorporates two key innovations: 1) a dual-domain representation learning framework, with one branch processing spatial details and the other focusing on mesoscale interactions in the frequency domain; and 2) a high-dimensional feature fusion mechanism, the Star operation, which fuses information from both domains through element-wise multiplication, thereby enhancing representational capacity without increasing network width and depth. Our Star operation is followed by a channel attention unit to facilitate global feature modeling and enhance channel-wise interactions. Building on our straightforward yet powerful design principles, StarIR achieves state-of-the-art performance across 21 datasets covering six single-degradation image restoration tasks. Furthermore, our model performs favorably against leading algorithms in two all-in-one settings and demonstrates robustness on two composite-degradation datasets. In addition, StarIR extends well to several domain-specific applications, including ultra-high-definition (UHD) imaging, remote sensing, medical imaging, and underwater image enhancement.
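The fusion step can be sketched minimally (an illustration of element-wise multiplicative fusion in general, not StarIR's exact operator: plain nested lists stand in for the convolutional feature maps of the two branches):

```python
def star_fuse(spatial, frequency):
    """Fuse two same-shaped 2D feature maps by element-wise multiplication,
    so each output entry mixes information from both domains."""
    return [[s * f for s, f in zip(row_s, row_f)]
            for row_s, row_f in zip(spatial, frequency)]

spatial   = [[1.0, 2.0], [3.0, 4.0]]   # toy spatial-branch features
frequency = [[0.5, 1.0], [2.0, 0.0]]   # toy frequency-branch features
fused = star_fuse(spatial, frequency)
# fused == [[0.5, 2.0], [6.0, 0.0]]
```

Because the product is taken element-wise, the fused map has the same shape as its inputs, so this kind of fusion adds no width or depth to the network.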
Citations: 0
Learning With Partial and Noisy Correspondence in Graph Matching
IF 23.6 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-09 | DOI: 10.1109/tpami.2026.3670236
Yijie Lin, Mouxing Yang, Peng Hu, Jiancheng Lv, Hao Chen, Xi Peng
Citations: 0
Brightness-aware Synthetic-to-Real Learning for Nighttime Hazy Image Enhancement
IF 23.6 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-03-09 | DOI: 10.1109/tpami.2026.3671754
Jie Gui, Xiaofeng Cong, Yu-Xin Zhang, Junming Hou, Dacheng Tao
Citations: 0