Calibrating Biased Distribution in VFM-derived Latent Space via Cross-Domain Geometric Consistency.
Pub Date : 2026-02-09 DOI: 10.1109/TPAMI.2026.3662389
Yanbiao Ma, Wei Dai, Zhiwu Lu, Bowei Liu, Jiayi Chen, Wenke Huang, Junchi Yan, Guancheng Wan
Despite the fast progress of deep learning, one long-standing challenge is the gap between the observed training samples and the underlying true distribution. This gap has multiple causes, e.g., sampling bias and noise. In the era of foundation models, we show that when off-the-shelf (vision) foundation models (e.g., CLIP, DINOv2) are leveraged for feature extraction, the geometric shapes of the resulting feature distributions exhibit remarkable transferability across domains and datasets. To verify the practical usefulness of this property, we embody our geometric knowledge-guided distribution calibration framework in two popular and challenging settings: federated learning and long-tailed recognition. In the federated setting, we devise a technique for acquiring the global geometric shape under privacy constraints, and then leverage this knowledge to generate new samples for clients, with the aim of bridging the gap between local and global observations. In long-tailed learning, our framework utilizes the geometric knowledge transferred from sample-rich categories to recover the true distribution of sample-scarce tail classes. Comprehensive experiments show that the proposed geometric knowledge-guided distribution calibration effectively overcomes information deficits caused by data heterogeneity and sample imbalance, boosting performance across benchmarks. Code is published at: https://github.com/WeiDai-David/2025CVPR_GGEUR.
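To make the transferred "geometric shape" concrete, below is a minimal sketch of covariance-eigenvector-based feature augmentation: the spread of a sample-rich class is transplanted onto a sample-scarce one in a fixed feature space. The function names, the Gaussian coefficient scheme, and the toy data are our own illustrative assumptions, not the authors' released GGEUR code.

```python
import numpy as np

def class_geometry(features):
    """Eigen-decompose a class's feature covariance: its 'geometric shape'."""
    cov = np.cov(features, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending order
    return eigvals[::-1], eigvecs[:, ::-1]        # descending order

def calibrate_tail_class(tail_feats, head_feats, n_new=100, rng=None):
    """Generate synthetic tail-class features whose spread follows the
    head class's principal directions (the transferred geometry)."""
    rng = np.random.default_rng(rng)
    eigvals, eigvecs = class_geometry(head_feats)
    eigvals = np.clip(eigvals, 0.0, None)
    anchors = tail_feats[rng.integers(0, len(tail_feats), size=n_new)]
    coeffs = rng.standard_normal((n_new, len(eigvals))) * np.sqrt(eigvals)
    return anchors + coeffs @ eigvecs.T

# toy usage: 5 tail samples in a 16-d feature space, 500 head samples
rng = np.random.default_rng(0)
head = rng.standard_normal((500, 16)) @ np.diag(np.linspace(2.0, 0.1, 16))
tail = rng.standard_normal((5, 16)) * 0.1
augmented = calibrate_tail_class(tail, head, n_new=64, rng=1)
print(augmented.shape)  # (64, 16)
```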
{"title":"Calibrating Biased Distribution in VFM-derived Latent Space via Cross-Domain Geometric Consistency.","authors":"Yanbiao Ma, Wei Dai, Zhiwu Lu, Bowei Liu, Jiayi Chen, Wenke Huang, Junchi Yan, Guancheng Wan","doi":"10.1109/TPAMI.2026.3662389","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3662389","url":null,"abstract":"<p><p>Despite the fast progress of deep learning, one standing challenge is the gap of the observed training samples and the underlying true distribution. There are multiple reasons for the causing of this gap e.g., sampling bias, noise etc. In the era of foundation models, we show that when leveraging the off-the-shelf (vision) foundation models (e.g., CLIP, DINOv2) for feature extraction, the geometric shapes of the resulting feature distributions exhibit remarkable transferability across domains and datasets. To verify its practical usefulness, we embody our geometric knowledge-guided distribution calibration framework in two popular and challenging settings: federated learning and long-tailed recognition. In the federated setting, we devise a technique of acquiring the global geometric shape under privacy constraints, then leverage this knowledge to generate new samples for clients, in the aim of bridging the gap between local and global observations. In long-tailed learning, it utilizes the geometric knowledge transferred from sample-rich categories to recover the true distribution for sample-scarce tail classes. Comprehensive experiments show that our proposed geometric knowledge-guided distribution calibration effectively overcomes information deficits caused by data heterogeneity and sample imbalance, with boosted performance across benchmarks. Code published at: https://github.com/WeiDai-David/2025CVPR GGEUR.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146151591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ASIL: Augmented Structural Information Learning for Deep Graph Clustering in Hyperbolic Space.
Pub Date : 2026-02-06 DOI: 10.1109/TPAMI.2026.3661424
Li Sun, Zhenhao Huang, Yujie Wang, Hongbo Lv, Chunyang Liu, Hao Peng, Philip S Yu
Graph clustering is a longstanding topic in machine learning. In recent years, deep learning methods have achieved encouraging results, but they still require a predefined number of clusters $K$ and typically struggle with imbalanced graphs, especially in identifying minority clusters. These limitations motivate us to study a challenging yet practical problem: deep graph clustering without $K$ that accounts for the imbalance found in real-world graphs. We approach this problem from the fresh perspective of information theory (i.e., structural information). In the literature, structural information has rarely been touched in deep clustering, and its classic definition falls short owing to its discrete formulation, which neglects node attributes and exhibits prohibitive complexity. In this paper, we first establish a differentiable structural information, generalizing the discrete formalism to the continuous realm, and accordingly design a hyperbolic deep model (LSEnet) to learn the neural partitioning tree in the Lorentz model of hyperbolic space. Theoretically, we demonstrate its capability to cluster without requiring $K$ and to identify minority clusters in imbalanced graphs. Second, we refine the hyperbolic representations of the partitioning tree, enhancing graph semantics, for better clustering. Contrastive learning for tree structures is non-trivial and incurs quadratic complexity. Instead, we further advance our theory by showing that structural entropy in fact bounds the tree contrastive loss. Finally, with an efficient reformulation, we approach graph clustering through novel augmented structural information learning (ASIL), which offers a simple yet effective augmented structural entropy objective that seamlessly integrates hyperbolic partitioning tree construction and contrastive learning. With a provable improvement in graph conductance, ASIL achieves effective debiased graph clustering in linear complexity with respect to the graph size. Extensive experiments show that ASIL outperforms 20 strong baselines, e.g., by an average of +12.42% in NMI on the Citeseer dataset.
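For readers unfamiliar with the Lorentz model mentioned above, the following is a small sketch of its basic primitives (Minkowski inner product, geodesic distance, and lifting a Euclidean vector onto the hyperboloid). It is background machinery only, not the authors' LSEnet or the ASIL objective.

```python
import torch

def lorentz_inner(x, y):
    """Minkowski inner product <x,y>_L = -x0*y0 + sum_i xi*yi."""
    prod = x * y
    return -prod[..., 0] + prod[..., 1:].sum(dim=-1)

def lorentz_distance(x, y, curvature=1.0):
    """Geodesic distance on the hyperboloid <x,x>_L = -1/curvature."""
    k = curvature
    inner = torch.clamp(-k * lorentz_inner(x, y), min=1.0 + 1e-7)
    return torch.acosh(inner) / k ** 0.5

def project_to_hyperboloid(v, curvature=1.0):
    """Lift a Euclidean vector v to the Lorentz model by solving for x0."""
    x0 = torch.sqrt(1.0 / curvature + (v * v).sum(dim=-1, keepdim=True))
    return torch.cat([x0, v], dim=-1)

# toy usage: distances between random points on the hyperboloid
v = torch.randn(4, 8)
x = project_to_hyperboloid(v)
print(lorentz_distance(x[:1], x))  # distance from the first point to all four
```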
{"title":"ASIL: Augmented Structural Information Learning for Deep Graph Clustering in Hyperbolic Space.","authors":"Li Sun, Zhenhao Huang, Yujie Wang, Hongbo Lv, Chunyang Liu, Hao Peng, Philip S Yu","doi":"10.1109/TPAMI.2026.3661424","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3661424","url":null,"abstract":"<p><p>Graph clustering is a longstanding topic in machine learning. In recent years, deep learning methods have achieved encouraging results, but they still require predefined cluster numbers $K$, and typically struggle with imbalanced graphs, especially in identifying minority clusters. The limitations motivate us to study a challenging yet practical problem: deep graph clustering without $K$ considering the imbalance in reality. We approach this problem from a fresh perspective of information theory (i.e., structural information). In the literature, structural information has rarely been touched in deep clustering, and the classic definition falls short in its discrete formulation, neglecting node attributes and exhibiting prohibitive complexity. In this paper, we first establish a differentiable structural information, generalizing the discrete formalism to continuous realm, so that we design a hyperbolic deep model (LSEnet) to learn the neural partitioning tree in the Lorentz model of hyperbolic space. Theoretically, we demonstrate its capability in clustering without requiring $K$ and identifying minority clusters in imbalanced graphs. Second, we refine hyperbolic representations of the partitioning tree, enhancing graph semantics, for better clustering. Contrastive learning for tree structures is non-trivial and costs quadratic complexity. Instead, we further advance our theory by discovering an interesting fact that structural entropy indeed bounds the tree contrastive loss. Finally, with an efficient reformulation, we approach graph clustering through a novel augmented structural information learning (ASIL), which offers a simple yet effective objective of augmented structural entropy to seamlessly integrates hyperbolic partitioning tree construction and contrastive learning. With a provable improvement in graph conductance, ASIL achieves effective debiased graph clustering in linear complexity with respect to the graph size. Extensive experiments show the ASIL outperforms 20 strong baselines by an average of $+12.42%$ in NMI on Citeseer dataset.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146133807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FC$^{2}$: Fast Co-Clustering With Small-Scale Similarity Graph and Bipartite Graph Learning.
Pub Date : 2026-02-06 DOI: 10.1109/TPAMI.2026.3661650
Xiaowei Zhao, Linrui Xie, Xiaojun Chang, Feiping Nie, Qiang Zhang
Bipartite graph-based co-clustering is efficient in modeling cluster manifold structures. However, existing methods decouple bipartite graph construction from the learning of pseudo-labels for samples and anchors, often leading to suboptimal clustering performance. Moreover, neglecting local manifold relationships among anchors yields inferior anchor pseudo-labels, which further degrades the quality of sample pseudo-labels. To overcome these limitations, we propose a novel model termed Fast Co-Clustering (FC$^{2}$), which jointly captures both local and global correlations between samples and anchors. Specifically, to model the coupling between the one-hot pseudo-labels of samples and anchors, we construct a bipartite graph with adaptively updated weights during the clustering process. To prevent severely imbalanced cluster assignments, we prove the equivalence between maximizing pseudo-label covariance and balancing cluster proportions, and incorporate a balanced regularization term to enhance the rationality of the resulting clusters. Furthermore, the local smoothness of anchor pseudo-labels is preserved via a low-rank decomposition of a compact anchor similarity graph. These two components jointly ensure that spatially adjacent anchors tend to share similar cluster identities, and that samples and anchors in close proximity are also assigned to similar clusters. We develop an efficient iterative optimization algorithm to update all model variables. Extensive experiments on benchmark and synthetic datasets validate the superior performance and efficiency of the proposed method compared with state-of-the-art approaches. Code is available at https://github.com/Vince-Doit/FC2.
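A heavily simplified sketch of the sample-anchor coupling idea: pseudo-labels propagate back and forth through a bipartite graph. The alternating hard-assignment update below is our own toy and omits FC$^{2}$'s adaptive graph weights, balanced regularization, and low-rank anchor-graph term.

```python
import numpy as np

def coclustering_step(B, anchor_labels, n_clusters):
    """One coupled update: samples vote through the bipartite graph B
    (n_samples x n_anchors), then anchors absorb the sample votes back."""
    A = np.eye(n_clusters)[anchor_labels]        # one-hot (n_anchors, k)
    sample_labels = np.argmax(B @ A, axis=1)     # weighted anchor vote
    S = np.eye(n_clusters)[sample_labels]        # one-hot (n_samples, k)
    anchor_labels = np.argmax(B.T @ S, axis=1)   # anchors follow samples
    return sample_labels, anchor_labels

# toy usage: 6 samples, 3 anchors, 2 clusters; B holds sample-anchor affinities
B = np.array([[.9, .1, 0], [.8, .2, 0], [.7, .3, 0],
              [0, .2, .8], [0, .1, .9], [.1, 0, .9]])
anchor_labels = np.array([0, 0, 1])
for _ in range(3):
    sample_labels, anchor_labels = coclustering_step(B, anchor_labels, 2)
print(sample_labels, anchor_labels)
```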
{"title":"FC$^{2}$: Fast Co-Clustering With Small-Scale Similarity Graph and Bipartite Graph Learning.","authors":"Xiaowei Zhao, Linrui Xie, Xiaojun Chang, Feiping Nie, Qiang Zhang","doi":"10.1109/TPAMI.2026.3661650","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3661650","url":null,"abstract":"<p><p>Bipartite graph-based co-clustering is efficient in modeling cluster manifold structures. However, existing methods decouple bipartite graph construction from the learning of pseudo-labels for samples and anchors, often leading to suboptimal clustering performance. Moreover, neglecting local manifold relationships among anchors yields inferior anchor pseudo-labels, which further degrades the quality of sample pseudo-labels. To overcome these limitations, we propose a novel model termed Fast Co-Clustering (FC$^{2}$), which jointly captures both local and global correlations between samples and anchors. Specifically, to model the coupling between the one-hot pseudo-labels of samples and anchors, we construct a bipartite graph with adaptively updated weights during the clustering process. To prevent severely imbalanced cluster assignments, we prove the equivalence between maximizing pseudo-label covariance and balancing cluster proportions, and incorporate a balanced regularization term to enhance the rationality of the resulting clusters. Furthermore, the local smoothness of anchor pseudo-labels is preserved via a low-rank decomposition of a compact anchor similarity graph. These two components jointly ensure that spatially adjacent anchors tend to share similar cluster identities, and that samples and anchors in close proximity are also assigned to similar clusters. We develop an efficient iterative optimization algorithm to update all model variables. Extensive experiments on benchmark and synthetic datasets validate the superior performance and efficiency of the proposed method compared with state-of-the-art approaches. Code is available at https://github.com/Vince-Doit/FC2.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146133824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust Matrix Completion With Deterministic Sampling Via Convex Optimization.
Pub Date : 2026-01-29 DOI: 10.1109/TPAMI.2026.3659200
Yinjian Wang, Wei Li, James E Fowler, Gemine Vivone
The problem of robust matrix completion, that is, the recovery of a low-rank matrix and a sparse matrix from a sampling of their superposition, has been addressed extensively in prior literature. Yet much of this work has focused exclusively on the case in which the matrix sampling is done at random, as this scenario is amenable to theoretical analysis. In contrast, sampling with an arbitrary deterministic pattern is often more accommodating to hardware implementation; consequently, the problem of robust matrix completion under deterministic sampling is considered here. To this end, a restricted approximate isometry property is proposed and used, along with a modified golfing scheme and a slightly strengthened incoherence condition, to prove that the latent low-rank and sparse matrices are uniquely recoverable via convex optimization with asymptotically high probability, providing the first exact-recovery theory for robust matrix completion with arbitrary deterministic sampling. A corresponding convex-optimization algorithm, driven by a traditional nuclear norm, is developed and subsequently generalized by substituting a convolutional nuclear norm in order to cover a broader range of application scenarios. Empirical experiments on synthetic data verify the proposed theory, while a battery of results on real-world images demonstrates the practical efficacy of the generalized algorithm for robust matrix recovery.
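A generic convex program of this kind (nuclear norm plus an elementwise $\ell_1$ penalty, constrained on a deterministic sampling pattern) can be prototyped in a few lines. The sketch below uses cvxpy and is only a baseline formulation, not the paper's algorithm or its convolutional-nuclear-norm generalization; the checkerboard pattern and penalty weight are illustrative choices.

```python
import cvxpy as cp
import numpy as np

def robust_complete(M_obs, mask, lam=0.1):
    """Convex RMC: min ||L||_* + lam*||S||_1  s.t.  L + S = M on the
    observed set. mask is an arbitrary deterministic 0/1 sampling pattern."""
    W = mask.astype(float)
    L = cp.Variable(M_obs.shape)
    S = cp.Variable(M_obs.shape)
    objective = cp.Minimize(cp.normNuc(L) + lam * cp.sum(cp.abs(S)))
    constraints = [cp.multiply(W, L + S) == cp.multiply(W, M_obs)]
    cp.Problem(objective, constraints).solve()
    return L.value, S.value

# toy usage: rank-1 truth, one gross outlier, a fixed (non-random) pattern
rng = np.random.default_rng(0)
truth = np.outer(rng.standard_normal(15), rng.standard_normal(15))
corrupted = truth.copy()
corrupted[2, 4] += 10.0                               # sparse outlier
mask = np.indices(truth.shape).sum(axis=0) % 2 == 0   # checkerboard pattern
L_hat, S_hat = robust_complete(corrupted * mask, mask)
print(np.abs(S_hat).max())                            # the outlier dominates S
```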
{"title":"Robust Matrix Completion With Deterministic Sampling Via Convex Optimization.","authors":"Yinjian Wang, Wei Li, James E Fowler, Gemine Vivone","doi":"10.1109/TPAMI.2026.3659200","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3659200","url":null,"abstract":"<p><p>The problem of robust matrix completion-the recovery of a low-rank matrix and a sparse matrix from a sampling of their superposition-has been addressed extensively in prior literature. Yet, much of this work has focused exclusively on the case in which the matrix sampling is done at random, as this scenario is amenable to theoretical analysis. In contrast, sampling with an arbitrary deterministic pattern is often more accommodating to hardware implementation; consequently, the problem of robust matrix completion under deterministic sampling is considered. To this end, a restricted approximate isometry property is proposed and used, along with a modified golfing scheme and a slightly strengthened incoherence condition, to prove that the latent low-rank and sparse matrices are uniquely recoverable via convex optimization with asymptotically high probability, providing the first exact-recovery theory for robust matrix completion with arbitrary deterministic sampling. A corresponding convex-optimization algorithm, driven by a traditional nuclear norm, is developed and then subsequently generalized by substituting a convolutional nuclear norm in order to cover a broader range of application scenarios. Empirical experiments on synthetic data verify the proposed theory while a battery of results on real-world images demonstrate the practical efficacy of the generalized algorithm for robust matrix recovery.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146088420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tackling Ill-Posedness of Reversible Image Conversion With Well-Posed Invertible Network.
Pub Date : 2026-01-29 DOI: 10.1109/TPAMI.2026.3659125
Yuanfei Huang, Hua Huang
Reversible image conversion (RIC) suffers from ill-posedness because its forward conversion process constitutes an underdetermined system. Despite employing invertible neural networks (INNs), existing RIC methods remain intrinsically ill-posed, as they inevitably introduce uncertainty by incorporating randomly sampled variables. To tackle this dilemma, we focus on developing a reliable approximate left inverse for the underdetermined system by constructing an overdetermined system with a non-zero Gram determinant, thus ensuring a well-posed solution. Based on this principle, we propose a well-posed invertible $1\times 1$ convolution (WIC), which eliminates the reliance on random variable sampling and enables the development of well-posed invertible networks. Furthermore, we design two innovative networks, WIN-Naïve and WIN, with the latter incorporating advanced skip-connections to enhance long-term memory. Our methods are evaluated across diverse RIC tasks, including reversible image hiding, image rescaling, and image decolorization, consistently achieving state-of-the-art performance. Extensive experiments validate the effectiveness of our approach, demonstrating its ability to overcome the bottlenecks of existing RIC solutions and setting a new benchmark in the field. Code is available at https://github.com/BNU-ERC-ITEA/WIN.
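The well-posedness principle invoked here (a non-zero Gram determinant guarantees a reliable left inverse for an overdetermined system) can be illustrated independently of the network architecture. The sketch below is a plain linear-algebra toy under that assumption, not the paper's $1\times 1$ convolution.

```python
import numpy as np

def left_inverse(W):
    """Left inverse of a tall matrix W (m x n, m >= n): exists iff the
    Gram matrix W^T W is nonsingular, i.e. det(W^T W) != 0."""
    gram = W.T @ W
    if abs(np.linalg.det(gram)) < 1e-12:
        raise ValueError("Gram determinant ~ 0: system is ill-posed")
    return np.linalg.inv(gram) @ W.T        # (W^T W)^{-1} W^T

# toy usage: a 'forward conversion' y = W x with more outputs than inputs
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 3))   # overdetermined: 8 equations, 3 unknowns
x = rng.standard_normal(3)
y = W @ x
x_rec = left_inverse(W) @ y
print(np.allclose(x, x_rec))      # True: exact recovery, no sampled randomness
```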
{"title":"Tackling Ill-Posedness of Reversible Image Conversion With Well-Posed Invertible Network.","authors":"Yuanfei Huang, Hua Huang","doi":"10.1109/TPAMI.2026.3659125","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3659125","url":null,"abstract":"<p><p>Reversible image conversion (RIC) suffers from ill-posedness issues due to its forward conversion process being considered an underdetermined system. Despite employing invertible neural networks (INN), existing RIC methods intrinsically remain ill-posed as inevitably introducing uncertainty by incorporating randomly sampled variables. To tackle the ill-posedness dilemma, we focus on developing a reliable approximate left inverse for the underdetermined system by constructing an overdetermined system with a non-zero Gram determinant, thus ensuring a well-posed solution. Based on this principle, we propose a well-posed invertible $1times 1$ convolution (WIC), which eliminates the reliance on random variable sampling and enables the development of well-posed invertible networks. Furthermore, we design two innovative networks, WIN-Naïve and WIN, with the latter incorporating advanced skip-connections to enhance long-term memory. Our methods are evaluated across diverse RIC tasks, including reversible image hiding, image rescaling, and image decolorization, consistently achieving state-of-the-art performance. Extensive experiments validate the effectiveness of our approach, demonstrating its ability to overcome the bottlenecks of existing RIC solutions and setting a new benchmark in the field. Codes are available in https://github.com/BNU-ERC-ITEA/WIN.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146088540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deeply Learned Robust Matrix Completion for Large-scale Low-rank Data Recovery.
Pub Date : 2026-01-29 DOI: 10.1109/TPAMI.2026.3659041
HanQin Cai, Chandra Kundu, Jialin Liu, Wotao Yin
Robust matrix completion (RMC) is a widely used machine learning tool that simultaneously tackles two critical issues in low-rank data analysis: missing data entries and extreme outliers. This paper proposes a novel scalable and learnable non-convex approach, coined Learned Robust Matrix Completion (LRMC), for large-scale RMC problems. LRMC enjoys low computational complexity with linear convergence. Motivated by the proposed theorem, the free parameters of LRMC can be effectively learned via deep unfolding to achieve optimal performance. Furthermore, this paper proposes a flexible feedforward-recurrent-mixed neural network framework that extends deep unfolding from a fixed number of iterations to infinitely many. The superior empirical performance of LRMC is verified with extensive experiments against the state-of-the-art on synthetic datasets and real applications, including video background subtraction, ultrasound imaging, face modeling, and cloud removal from satellite imagery.
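As an illustration of deep unfolding for RMC, here is a sketch of one unfolded robust-PCA-style iteration with a learnable step size and threshold. The update rules and the rank-$r$ projection are our own generic choices; LRMC's actual iterations and its feedforward-recurrent-mixed framework differ.

```python
import torch
import torch.nn as nn

class UnfoldedRMCLayer(nn.Module):
    """One unfolded RMC iteration with learnable free parameters:
    soft-thresholding for the outliers S, rank-r projection for L."""
    def __init__(self, rank):
        super().__init__()
        self.rank = rank
        self.step = nn.Parameter(torch.tensor(1.0))    # learnable step size
        self.thresh = nn.Parameter(torch.tensor(0.1))  # learnable threshold

    def forward(self, L, S, Y, mask):
        # sparse update: soft-threshold the observed residual Y - L
        R = mask * (Y - L)
        S = torch.sign(R) * torch.clamp(R.abs() - self.thresh, min=0.0)
        # low-rank update: gradient step on the data term, then rank-r projection
        L = L - self.step * mask * (L + S - Y)
        U, sig, Vh = torch.linalg.svd(L, full_matrices=False)
        L = (U[:, : self.rank] * sig[: self.rank]) @ Vh[: self.rank]
        return L, S

# unfold K iterations into a trainable network (step/thresh learned end-to-end)
layers = nn.ModuleList([UnfoldedRMCLayer(rank=2) for _ in range(5)])
Y = torch.randn(30, 30)
mask = (torch.rand(30, 30) < 0.6).float()
L, S = torch.zeros_like(Y), torch.zeros_like(Y)
for layer in layers:
    L, S = layer(L, S, Y, mask)
```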
{"title":"Deeply Learned Robust Matrix Completion for Large-scale Low-rank Data Recovery.","authors":"HanQin Cai, Chandra Kundu, Jialin Liu, Wotao Yin","doi":"10.1109/TPAMI.2026.3659041","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3659041","url":null,"abstract":"<p><p>Robust matrix completion (RMC) is a widely used machine learning tool that simultaneously tackles two critical issues in low-rank data analysis: missing data entries and extreme outliers. This paper proposes a novel scalable and learnable non-convex approach, coined Learned Robust Matrix Completion (LRMC), for large-scale RMC problems. LRMC enjoys low computational complexity with linear convergence. Motivated by the proposed theorem, the free parameters of LRMC can be effectively learned via deep unfolding to achieve optimum performance. Furthermore, this paper proposes a flexible feedforward-recurrent-mixed neural network framework that extends deep unfolding from fixed-number iterations to infinite iterations. The superior empirical performance of LRMC is verified with extensive experiments against state-of-the-art on synthetic datasets and real applications, including video background subtraction, ultrasound imaging, face modeling, and cloud removal from satellite imagery.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146088451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SEGA: A Transferable Signed Ensemble Gaussian Black-Box Attack Against No-Reference Image Quality Assessment Models.
Pub Date : 2026-01-29 DOI: 10.1109/TPAMI.2026.3659164
Yujia Liu, Dingquan Li, Zhixuan Li, Tiejun Huang
No-Reference Image Quality Assessment (NR-IQA) models play an important role in various real-world applications. Recently, adversarial attacks against NR-IQA models have attracted increasing attention, as they provide valuable insights for revealing model vulnerabilities and guiding robust system design. Some effective attacks have been proposed against NR-IQA models in white-box settings, where the attacker has full access to the target model. However, these attacks often suffer from poor transferability to unknown target models in more realistic black-box scenarios, where the target model is inaccessible. This work makes the first attempt to address the challenge of low transferability in attacking NR-IQA models by proposing a transferable Signed Ensemble Gaussian black-box Attack (SEGA). The main idea is to approximate the gradient of the target model by applying Gaussian smoothing to source models and ensembling their smoothed gradients. To ensure the imperceptibility of adversarial perturbations, SEGA further removes inappropriate perturbations using a specially designed perturbation filter mask. Experimental results demonstrate the superior transferability of SEGA, validating its effectiveness in enabling successful transfer-based black-box attacks against NR-IQA models. Code for this paper is available at https://github.com/YogaLYJ/SEGA_IQA.
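The core recipe summarized above (Gaussian-smooth each source model's gradient, ensemble across models, take the sign) can be sketched as follows. The smoothing scale, sample count, stand-in models, and the all-ones stub for the perturbation filter mask are illustrative assumptions, not SEGA's actual settings or its specially designed mask.

```python
import torch
import torch.nn as nn

def sega_direction(x, source_models, sigma=0.05, n_samples=16):
    """Transferable attack direction: average Gaussian-smoothed gradients
    of the predicted quality score over source models, then take the sign."""
    grad = torch.zeros_like(x)
    for model in source_models:
        for _ in range(n_samples):
            noisy = (x + sigma * torch.randn_like(x)).detach().requires_grad_(True)
            model(noisy).sum().backward()   # d(score)/d(input)
            grad += noisy.grad
    return grad.sign()

# toy usage: two stand-in 'NR-IQA models' scoring 8x8 gray images
models = [nn.Sequential(nn.Flatten(), nn.Linear(64, 1)) for _ in range(2)]
x = torch.rand(1, 1, 8, 8)
eps, mask = 4 / 255, torch.ones_like(x)     # mask: perturbation filter stub
x_adv = (x + eps * mask * sega_direction(x, models)).clamp(0, 1)
```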
{"title":"SEGA: A Transferable Signed Ensemble Gaussian Black-Box Attack Against No-Reference Image Quality Assessment Models.","authors":"Yujia Liu, Dingquan Li, Zhixuan Li, Tiejun Huang","doi":"10.1109/TPAMI.2026.3659164","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3659164","url":null,"abstract":"<p><p>No-Reference Image Quality Assessment (NR-IQA) models play an important role in various real-world applications. Recently, adversarial attacks against NR-IQA models have attracted increasing attention, as they provide valuable insights for revealing model vulnerabilities and guiding robust system design. Some effective attacks have been proposed against NR-IQA models in white-box settings, where the attacker has full access to the target model. However, these attacks often suffer from poor transferability to unknown target models in more realistic black-box scenarios, where the target model is inaccessible. This work makes the first attempt to address the challenge of low transferability in attacking NR-IQA models by proposing a transferable Signed Ensemble Gaussian black-box Attack (SEGA). The main idea is to approximate the gradient of the target model by applying Gaussian smoothing to source models and ensembling their smoothed gradients. To ensure the imperceptibility of adversarial perturbations, SEGA further removes inappropriate perturbations using a specially designed perturbation filter mask. Experimental results demonstrate the superior transferability of SEGA, validating its effectiveness in enabling successful transfer-based black-box attacks against NR-IQA models. Code for this paper is available at https://github.com/YogaLYJ/SEGA_IQA.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146088522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Privacy-Preserving Model Transcription With Differentially Private Synthetic Distillation.
Pub Date : 2026-01-29 DOI: 10.1109/TPAMI.2026.3659110
Bochao Liu, Shiming Ge, Pengju Wang, Shikun Li, Tongliang Liu
While many deep learning models trained on private datasets have been deployed in various practical tasks, they pose a privacy-leakage risk, as attackers could recover informative data or label knowledge from the models. In this work, we present privacy-preserving model transcription, a data-free model-to-model conversion solution that facilitates model deployment with a privacy guarantee. To this end, we propose a cooperative-competitive learning approach, termed differentially private synthetic distillation, that learns to convert a pretrained model (teacher) into its privacy-preserving counterpart (student) via a trainable generator, without access to the private data. The learning involves three players in a unified framework and performs alternate optimization: i) the generator is learned to generate synthetic data; ii) the teacher and student accept the synthetic data, and differentially private labels are computed via flexible noisy perturbation of data or labels; and iii) the student is updated with the noisy labels, and the generator is updated by taking the student as a discriminator for adversarial training. We theoretically prove that our approach guarantees differential privacy and convergence. The transcribed student offers good performance and privacy protection, while the resulting generator can produce private synthetic data for downstream tasks. Extensive experiments clearly demonstrate that our approach outperforms 26 state-of-the-art methods.
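A minimal sketch of step ii) of this loop, under the assumption that labels are privatized with a plain Gaussian mechanism: the student distills only from noise-perturbed teacher outputs. Privacy accounting and the adversarial generator update of step iii) are omitted; the toy models and sigma are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def dp_noisy_labels(teacher, synthetic_x, sigma):
    """Gaussian-mechanism label release: the student only ever sees
    teacher outputs perturbed with N(0, sigma^2) noise."""
    with torch.no_grad():
        logits = teacher(synthetic_x)
        return logits + sigma * torch.randn_like(logits)

def student_step(student, teacher, generator, optimizer, z, sigma=1.0):
    """One cooperative step: the generator makes data, the teacher labels
    it privately, and the student distills from the noisy labels."""
    synthetic_x = generator(z)
    target = F.softmax(dp_noisy_labels(teacher, synthetic_x, sigma), dim=1)
    loss = F.cross_entropy(student(synthetic_x.detach()), target)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

# toy wiring: 10-class teacher/student and a linear 'generator'
teacher, student = nn.Linear(16, 10), nn.Linear(16, 10)
generator = nn.Linear(8, 16)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
print(student_step(student, teacher, generator, opt, torch.randn(32, 8)))
```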
{"title":"Privacy-Preserving Model Transcription With Differentially Private Synthetic Distillation.","authors":"Bochao Liu, Shiming Ge, Pengju Wang, Shikun Li, Tongliang Liu","doi":"10.1109/TPAMI.2026.3659110","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3659110","url":null,"abstract":"<p><p>While many deep learning models trained on private datasets have been deployed in various practical tasks, they may pose a privacy leakage risk as attackers could recover informative data or label knowledge from models. In this work, we present privacy-preserving model transcription, a data-free model-to-model conversion solution to facilitate model deployment with a privacy guarantee. To this end, we propose a cooperative-competitive learning approach termed differentially private synthetic distillation that learns to convert a pretrained model (teacher) into its privacy-preserving counterpart (student) via a trainable generator without access to private data. The learning collaborates with three players in a unified framework and performs alternate optimization: i) the generator is learned to generate synthetic data, ii) the teacher and student accept the synthetic data and compute differential private labels by flexible data or label noisy perturbation, and iii) the student is updated with noisy labels and the generator is updated by taking the student as a discriminator for adversarial training. We theoretically prove that our approach can guarantee differential privacy and convergence. The transcribed student has good performance and privacy protection, while the resulting generator can generate private synthetic data for downstream tasks. Extensive experiments clearly demonstrate that our approach outperforms 26 state-of-the-arts.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146088433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adversarial Imitation Learning with General Function Approximation: Theoretical Analysis and Practical Algorithms.
Pub Date : 2026-01-26 DOI: 10.1109/TPAMI.2026.3657578
Tian Xu, Zhilong Zhang, Zexuan Chen, Ruishuo Chen, Yihao Sun, Yang Yu
Adversarial imitation learning (AIL), a prominent approach in imitation learning, has achieved significant practical success powered by neural network approximation. However, existing theoretical analyses of AIL are primarily confined to simplified settings, such as tabular and linear function approximation, and involve complex algorithmic designs that impede practical implementation. This creates a substantial gap between theory and practice. This paper bridges this gap by exploring the theoretical underpinnings of online AIL with general function approximation. We introduce a novel framework called optimization-based AIL (OPT-AIL), which performs online optimization for reward learning coupled with optimism-regularized optimization for policy learning. Within this framework, we develop two concrete methods: model-free OPT-AIL and model-based OPT-AIL. Our theoretical analysis demonstrates that both variants achieve polynomial expert sample complexity and interaction complexity for learning near-expert policies. To the best of our knowledge, they represent the first provably efficient AIL methods under general function approximation. From a practical standpoint, OPT-AIL requires only the approximate optimization of two objectives, thereby facilitating practical implementation. Empirical studies demonstrate that OPT-AIL outperforms previous state-of-the-art deep AIL methods across several challenging tasks.
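Schematically, the "two objectives" might look like the sketch below: an online reward update that separates expert from policy data, and an optimism bonus added before the value backup. Both functions are our own illustrative stand-ins, not the paper's concrete model-free or model-based methods.

```python
import torch
import torch.nn as nn

reward = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1))
opt_r = torch.optim.Adam(reward.parameters(), lr=1e-3)

def reward_step(expert_sa, policy_sa):
    """Online reward learning: push expert reward up and policy reward
    down (one of the two optimization objectives, schematically)."""
    loss = reward(policy_sa).mean() - reward(expert_sa).mean()
    opt_r.zero_grad(); loss.backward(); opt_r.step()
    return loss.item()

def optimistic_value_target(q_values, bonus):
    """Optimism-regularized policy evaluation: add an exploration bonus
    before the greedy backup."""
    return (q_values + bonus).max(dim=-1).values

# toy usage with random state-action features
expert_sa, policy_sa = torch.randn(64, 4), torch.randn(64, 4)
print(reward_step(expert_sa, policy_sa))
print(optimistic_value_target(torch.randn(5, 3), bonus=0.1 * torch.ones(5, 3)))
```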
{"title":"Adversarial Imitation Learning with General Function Approximation: Theoretical Analysis and Practical Algorithms.","authors":"Tian Xu, Zhilong Zhang, Zexuan Chen, Ruishuo Chen, Yihao Sun, Yang Yu","doi":"10.1109/TPAMI.2026.3657578","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3657578","url":null,"abstract":"<p><p>Adversarial imitation learning (AIL), a prominent approach in imitation learning, has achieved significant practical success powered by neural network approximation. However, existing theoretical analyses of AIL are primarily confined to simplified settings-such as tabular and linear function approximation-and involve complex algorithmic designs that impede practical implementation. This creates a substantial gap between theory and practice. This paper bridges this gap by exploring the theoretical underpinnings of online AIL with general function approximation. We introduce a novel framework called optimization-based AIL (OPT-AIL), which performs online optimization for reward learning coupled with optimism-regularized optimization for policy learning. Within this framework, we develop two concrete methods: model-free OPT-AIL and model-based OPT-AIL. Our theoretical analysis demonstrates that both variants achieve polynomial expert sample complexity and interaction complexity for learning near-expert policies. To the best of our knowledge, they represent the first provably efficient AIL methods under general function approximation. From a practical standpoint, OPT-AIL requires only the approximate optimization of two objectives, thereby facilitating practical implementation. Empirical studies demonstrate that OPT-AIL outperforms previous state-of-the-art deep AIL methods across several challenging tasks.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146055738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment.
Pub Date : 2026-01-26 DOI: 10.1109/TPAMI.2026.3657354
Lingling Xu, Haoran Xie, S Joe Qin, Xiaohui Tao, Fu Lee Wang
With the continuous growth in the number of parameters of Transformer-based pretrained language models (PLMs), particularly the emergence of large language models (LLMs) with billions of parameters, many natural language processing (NLP) tasks have seen remarkable success. However, the enormous size and computational demands of these models pose significant challenges for adapting them to specific downstream tasks, especially in environments with limited computational resources. Parameter-Efficient Fine-Tuning (PEFT) offers an effective solution by reducing the number of fine-tuned parameters and the memory usage while achieving performance comparable to full fine-tuning. The demand for fine-tuning PLMs, especially LLMs, has led to a surge in the development of PEFT methods, as depicted in Fig. 1. In this paper, we present a comprehensive and systematic review of PEFT methods for PLMs. We summarize these methods, discuss their applications, and outline future directions. Furthermore, extensive experiments are conducted with several representative PEFT methods to better understand their effectiveness in parameter efficiency and memory efficiency. By offering insights into the latest advancements and practical applications, this survey serves as an invaluable resource for researchers and practitioners seeking to navigate the challenges and opportunities presented by PEFT in the context of PLMs.
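As one concrete example of the methods such a survey covers, below is a minimal LoRA-style layer, the canonical reparameterization-based PEFT approach: the pretrained weight is frozen and only a low-rank update is trained. The hyperparameters r and alpha are typical illustrative values, not recommendations from the survey.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained weight W plus a trainable low-rank update B @ A,
    scaled by alpha/r (LoRA-style reparameterization)."""
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # only A and B are fine-tuned
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# toy usage: wrap one projection of a 'PLM'; trainable params shrink sharply
layer = LoRALinear(nn.Linear(768, 768))
n_train = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(n_train, "trainable vs", 768 * 768 + 768, "in the base layer")
```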
{"title":"Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment.","authors":"Lingling Xu, Haoran Xie, S Joe Qin, Xiaohui Tao, Fu Lee Wang","doi":"10.1109/TPAMI.2026.3657354","DOIUrl":"https://doi.org/10.1109/TPAMI.2026.3657354","url":null,"abstract":"<p><p>With the continuous growth in the number of parameters of the Transformer-based pretrained language models (PLMs), particularly the emergence of large language models (LLMs) with billions of parameters, many natural language processing (NLP) tasks have demonstrated remarkable success. However, the enormous size and computational demands of these models pose significant challenges for adapting them to specific downstream tasks, especially in environments with limited computational resources. Parameter-Efficient Fine-Tuning (PEFT) offers an effective solution by reducing the number of fine-tuning parameters and memory usage while achieving comparable performance to full fine-tuning. The demands for fine-tuning PLMs, especially LLMs, have led to a surge in the development of PEFT methods, as depicted in Fig. 1. In this paper, we present a comprehensive and systematic review of PEFT methods for PLMs. We summarize these PEFT methods, discuss their applications, and outline future directions. Furthermore, extensive experiments are conducted using several representative PEFT methods to better understand their effectiveness in parameter efficiency and memory efficiency. By offering insights into the latest advancements and practical applications, this survey serves as an invaluable resource for researchers and practitioners seeking to navigate the challenges and opportunities presented by PEFT in the context of PLMs.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":18.6,"publicationDate":"2026-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146055668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}