The performance of deep neural networks (DNNs) in accomplishing tasks heavily relies on feature selection and sparse representation of high-dimensional data. Previous work has treated feature selection and sparse representation as separate mechanisms for improving DNN performance, focusing on identifying and leveraging informative features to enhance task-specific outcomes. However, few studies have established a connection between feature selection and sparse representation. To address this gap, this article proposes an optimization framework termed informative sparse transport (IST), which integrates feature selection and sparse coding into a unified multiobjective optimization framework. Using optimal transport as a bridge, the IST framework harmonizes the relationship between feature selection and sparse representation, offering an informational advantage. In the IST framework, feature selection aims to identify an optimal subset of features to maximize mutual information or minimize redundancy, while sparse representation seeks to approximate data with the fewest possible features. Although these objectives differ, they are fundamentally complementary, as both emphasize extracting task-relevant information while eliminating redundancy. By unifying feature selection and sparse representation, the IST framework effectively mitigates challenges posed by high-dimensional data, delivering a robust solution for enhanced feature extraction and representation. We validate the IST framework on generative and classification tasks, demonstrating that it improves model performance through the complementary synergy of feature selection and sparse representation.
Title: A Deep Neural Network Optimization Framework Based on Optimal Transport Bridge Feature Selection and Sparse Representation. Authors: Guipeng Lan, Shuai Xiao, Jiabao Wen, Jiachen Yang, Wen Lu, Baihua Li, Qinggang Meng, Xinbo Gao. Journal: IEEE Transactions on Neural Networks and Learning Systems. Pub Date: 2026-04-17. DOI: 10.1109/tnnls.2026.3678220
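The abstract above names optimal transport as the bridge between feature selection and sparse representation but does not give the formulation. As a generic illustration only, the sketch below computes an entropic-regularized transport plan between two small discrete distributions with Sinkhorn iterations. It is not the IST objective; the function name `sinkhorn`, the regularization strength `eps`, and the toy histograms and cost matrix are all assumptions made for this example.

```python
import math


def sinkhorn(a, b, cost, eps=0.5, n_iters=200):
    """Entropic-regularized transport plan between histograms a and b."""
    n, m = len(a), len(b)
    # Gibbs kernel: cheap moves get large weights, expensive moves small ones.
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so the plan's row sums match a.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so the plan's column sums match b.
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical toy problem: two source bins, two target bins.
source = [0.5, 0.5]
target = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]

plan = sinkhorn(source, target, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(len(source))) for j in range(len(target))]
```

After convergence, the row and column sums of `plan` reproduce `source` and `target`, which is the coupling constraint any transport plan must satisfy. Smaller `eps` values push the plan toward the unregularized solution at the price of slower convergence.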
Pub Date: 2026-04-16 DOI: 10.1109/tnnls.2026.3682921
Zhiru Yang,Mengmeng Zhang,Junjie Wang,Yunhao Gao,Wenzhi Liao,Wei Li
Hyperspectral target detection (HTD) involves identifying target pixels from complex backgrounds using known or inferred spectral signatures. With advances in hyperspectral imaging technology, HTD has found widespread applications in both military and civilian domains. However, it still faces challenges such as sample imbalance and spectral variability. To address these challenges, we propose a coherent pipeline that couples data, representation, and modeling. First, we develop AdvGMM, which fits a Gaussian mixture model (GMM) to high-confidence target spectra and applies adversarial reweighting against hard backgrounds to synthesize diverse, structurally constrained pseudotargets, thereby alleviating sample scarcity. Building on this, a frequency-domain adaptive fusion and Mamba-based enhanced encoder network (FAME-Net) is proposed to address the spectral variation and improve the discriminability of targets and backgrounds. FAME-Net comprises two key modules: a frequency-domain feature adaptive fusion (FDFAF) module that adaptively amplifies information-rich bands and integrates complementary frequency components while preserving the overall reflectance trend; and an efficient Mamba block that captures long-range spectral dependencies, avoids class confusion caused by similar local features, and converts the frequency-enhanced spectra into scalable, robust features. Extensive experiments on six benchmark datasets demonstrate that the proposed method outperforms state-of-the-art approaches under limited supervision, achieving superior detection robustness. The code will be available at https://github.com/Zhiru-Yang/AdvGMM-FAME-Net.
Title: A Dual-Network Framework With Adversarial GMM Augmentation and Frequency-Mamba Fusion for Hyperspectral Target Detection. Journal: IEEE Transactions on Neural Networks and Learning Systems.
Graph neural networks (GNNs) have achieved strong results on homophilic graphs with complete node attributes, yet their performance significantly deteriorates when faced with the combined challenges of heterophily and feature missingness. Heterophily introduces semantic inconsistency in neighborhoods, while feature missingness obscures node identity, which together constitute a complex problem we define as the heterophily-missing coupling (HMC). Under HMC, information exchanged between nodes becomes less reliable, and the usual assumptions that support message propagation no longer hold. To address this, we propose a novel adaptive prototype-guided personalized propagation (APP) framework. Specifically, it first leverages semantic rectification via prototypes (SRPs) to align neighborhood information with prototype semantics, reducing noise from inconsistent neighbors. Subsequently, personalized virtual propagation (PVP) builds upon this by clustering to construct prototype-aligned virtual edges, enabling effective feature imputation through minimizing Dirichlet energy across both real and virtual graphs. Finally, adaptive representation synergy (ARS) consolidates the propagated and imputed features by employing prototype-guided confidence weighting and enhancing representation quality via a contrastive training objective. Extensive experiments on multiple benchmark datasets demonstrate that APP consistently improves node classification performance on heterophilic graphs with missing features, achieving up to 11.22% improvement over state-of-the-art baselines while significantly reducing imputation error. The implementation is publicly available at https://github.com/limengran98/APP.
Title: Adaptive Prototype-Guided Personalized Propagation for Heterophilic Graphs With Missing Data. Authors: Mengran Li, Wenbin Xing, Zelin Zang, Bo Li, Chengyang Zhang, Yong Zhang, Junzhou Chen, Ronghui Zhang, Yongfu Li, Chuan Hu, Xiaolei Ma, Zibin Zheng. Journal: IEEE Transactions on Neural Networks and Learning Systems. Pub Date: 2026-04-14. DOI: 10.1109/tnnls.2026.3676197
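The APP abstract describes feature imputation by minimizing Dirichlet energy over a graph. A minimal sketch of that general idea follows, assuming scalar node features and a plain undirected graph: missing values are repeatedly replaced by the mean of their neighbors while known values stay clamped, which converges to the harmonic (energy-minimizing) fill. The prototype-aligned virtual edges of the paper are not modeled here, and the names `propagate_features` and `dirichlet_energy` are hypothetical.

```python
def dirichlet_energy(adj, x):
    """Sum of squared feature differences over undirected edges."""
    # Each edge appears twice in the adjacency lists, hence the factor 0.5.
    return 0.5 * sum((x[i] - x[j]) ** 2 for i in adj for j in adj[i])


def propagate_features(adj, x, known, n_iters=200):
    """Fill missing scalar features by repeated neighbor averaging."""
    x = list(x)
    for _ in range(n_iters):
        new_x = list(x)
        for i, nbrs in adj.items():
            # Known features are never overwritten; missing ones take the
            # mean of their neighbors from the previous sweep.
            if not known[i] and nbrs:
                new_x[i] = sum(x[j] for j in nbrs) / len(nbrs)
        x = new_x
    return x


# Hypothetical path graph 0 - 1 - 2 - 3 with features known only at the ends.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
observed = [0.0, 0.0, 0.0, 3.0]  # entries 1 and 2 are placeholders
known = [True, False, False, True]

filled = propagate_features(adj, observed, known)
```

On this toy graph the missing entries converge to a linear interpolation between the two known endpoints, and the Dirichlet energy of `filled` is lower than that of the zero-filled input. Virtual edges, if available, could be added to `adj` in the same format.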
Object state changes (OSCs) play a critical role in video understanding, as they focus on localizing the stages of state transitions within temporal sequences. However, existing methods face two key challenges in open-world scenarios. First, there is a significant background-causal scene imbalance due to dataset bias. This leads to reliance on irrelevant features and degrades prediction capability. Second, existing methods generalize poorly to unseen objects. They typically focus on a single state change of a specific object, which limits their ability to understand the state changes of unseen objects in the generalized way that humans do. To address these challenges, we first introduce a structural causal model (SCM) to formally structure the OSC task, which explicitly defines the confounding effect of dataset bias and the lack of generalization. Guided by this SCM, we propose CCI-Net, a causal counterfactual inference-based video OSC neural network. CCI-Net employs a causal inference network for backdoor adjustment to effectively eliminate confounders. In addition, it integrates counterfactual inference to enhance understanding in open-world scenarios. Specifically, CCI-Net comprises two key components: the backdoor scene classifier (BSC) and the counterfactual module (CM). The BSC controls potential confounders and mitigates spurious correlations. The CM enhances generalization to unseen objects and their state changes by constructing counterfactual scenes during training. Furthermore, we design two loss functions for causal and counterfactual scenes to optimize the learning process. Experimental results on three benchmark datasets demonstrate that, compared with existing methods, CCI-Net significantly improves both precision and generalization in open-world scenarios.
Title: Causal Counterfactual Inference Network for Video Object State Changes in Open-World Scenarios. Authors: Zhichao Wang, Shucheng Huang, Mingxing Li, Yifan Jiao. Journal: IEEE Transactions on Neural Networks and Learning Systems. Pub Date: 2026-04-14. DOI: 10.1109/tnnls.2026.3678945
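The CCI-Net abstract relies on backdoor adjustment to remove the influence of a confounder. The textbook form of that adjustment is P(Y | do(X = x)) = Σ_z P(Y | X = x, Z = z) P(Z = z), and the sketch below evaluates it on a small probability table. How CCI-Net realizes this inside a network is not described in the abstract; the scene labels, action labels, probabilities, and the function name `backdoor_adjust` are invented for illustration.

```python
def backdoor_adjust(p_y_given_xz, p_z, x):
    """P(Y=1 | do(X=x)) for a discrete confounder Z satisfying the backdoor criterion."""
    # Average the conditional over the marginal of Z, not over P(Z | X).
    return sum(p_y_given_xz[(x, z)] * pz for z, pz in p_z.items())


# Hypothetical numbers: Z is the background scene, X the observed action,
# Y = 1 means "the object's state has changed".
p_z = {"kitchen": 0.8, "garage": 0.2}
p_y_given_xz = {
    ("cut", "kitchen"): 0.9,
    ("cut", "garage"): 0.5,
    ("no_cut", "kitchen"): 0.3,
    ("no_cut", "garage"): 0.1,
}

effect_cut = backdoor_adjust(p_y_given_xz, p_z, "cut")
effect_no_cut = backdoor_adjust(p_y_given_xz, p_z, "no_cut")
```

Because the weights come from the scene marginal `p_z`, a scene that happens to co-occur with an action in the training data cannot inflate the estimated effect of that action.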
Graph neural networks (GNNs) have excelled in handling graph-structured data, attracting significant research interest. However, two primary challenges have emerged: interference between topology and attributes distorting node representations, and the low-pass filtering nature of most GNNs leading to the oversight of valuable high-frequency information in graph signals. These issues are particularly pronounced in heterophilic graphs. To address these challenges, we propose attribute-topology cross-frequency aligned (ATCFA) GNNs. ATCFA combines low- and high-pass filters to capture both smooth and detailed representations from topological and attribute perspectives. It also enforces frequency-specific constraints to reduce noise and redundancy in each frequency band. The model can dynamically adjust the filtering ratios for both homophilic and heterophilic graphs. Crucially, ATCFA establishes dynamic associations between corresponding frequency components of topology and attribute, achieving systematic alignment and interactive fusion that explicitly mitigates interference and promotes complementary information utilization across domains. Extensive experiments on standard datasets show that ATCFA delivers higher classification accuracy than state-of-the-art methods, proving its capability to handle both homophilic and heterophilic graphs in node classification.
Title: Attribute-Topology Cross-Frequency Aligned Graph Neural Networks for Homophilic and Heterophilic Graphs in Node Classification. Authors: Yachao Yang, Yanfeng Sun, Jipeng Guo, Jinlu Wang, Shaofan Wang, Junbin Gao, Fujiao Ju, Baocai Yin. Journal: IEEE Transactions on Neural Networks and Learning Systems. Pub Date: 2026-04-13. DOI: 10.1109/tnnls.2026.3678135
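The ATCFA abstract combines low- and high-pass graph filters with an adjustable filtering ratio. One common way to build such a pair, shown here as a sketch rather than the paper's filters, uses the symmetric normalized adjacency Â = D^{-1/2} A D^{-1/2}: the low-pass output is (x + Âx)/2 and the high-pass output is (x − Âx)/2. The function names and the mixing weight `alpha` are assumptions for this example.

```python
import math


def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of a dense adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def split_frequencies(adj, x):
    """Return (low-pass, high-pass) components of a scalar graph signal x."""
    n = len(x)
    a_hat = normalized_adjacency(adj)
    ax = [sum(a_hat[i][j] * x[j] for j in range(n)) for i in range(n)]
    low = [0.5 * (xi + axi) for xi, axi in zip(x, ax)]   # smooths over neighbors
    high = [0.5 * (xi - axi) for xi, axi in zip(x, ax)]  # keeps neighbor differences
    return low, high


def mix(low, high, alpha):
    """Blend the two bands; alpha near 1 favors the smooth (homophilic) view."""
    return [alpha * l + (1.0 - alpha) * h for l, h in zip(low, high)]


# Hypothetical 4-cycle graph: 0 - 1 - 2 - 3 - 0.
cycle = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
smooth_signal = [1.0, 1.0, 1.0, 1.0]
alternating_signal = [1.0, -1.0, 1.0, -1.0]

low_s, high_s = split_frequencies(cycle, smooth_signal)
low_a, high_a = split_frequencies(cycle, alternating_signal)
```

On the cycle, the constant signal passes entirely through the low-pass branch and the alternating signal entirely through the high-pass branch, which is why a heterophilic graph (neighbors that disagree) benefits from keeping the high-frequency band. The two bands always sum back to the input.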
Image dehazing is crucial for enhancing image visibility and mitigating weather degradation. However, most existing approaches rely on paired hazy and clean images, which are challenging to obtain in real-world scenarios. To this end, we propose an oriented Bayesian-regularized consistent optimal transport (OBCOT) framework, which formulates the unpaired image dehazing task as an optimal transport (OT) problem. Specifically, we introduce a structure-preserving transport cost, incorporating the structural similarity (SSIM) constraint to minimize the duality gap between the primal and dual formulations, while preserving the structural details of reconstructed images. Furthermore, we derive the Bayesian frequency-domain regularization (BFR) to balance spectral consistency with clean references against repulsion from hazy patterns. In addition, we employ a pretrained one-step stable diffusion model as the restoration network, which is fine-tuned using low-rank adaptation (LoRA) adapters and zero convolutional layers, while integrating domain-specific text prompts for both degraded and clean images to guide the generation process. Extensive experiments demonstrate that our method surpasses existing well-performing unpaired learning approaches, achieving notable improvements in both fidelity and photo-realism.
Title: When Optimal Transport Meets Photo-Realistic Image Dehazing With Unpaired Training. Authors: Yuanbo Wen, Tao Gao, Shan Liang, Dena Zhang, Ziqi Li, Jing Qin, Ting Chen. Journal: IEEE Transactions on Neural Networks and Learning Systems. Pub Date: 2026-04-09. DOI: 10.1109/tnnls.2026.3673760
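The OBCOT abstract mentions a transport cost that incorporates structural similarity (SSIM) but does not state its form. One simple reading, offered here purely as an assumption, is a cost of 1 − SSIM between two images. The sketch uses the standard SSIM formula with a single global window over flattened pixel lists; practical SSIM implementations use local sliding windows, and the names `global_ssim` and `structure_cost` are hypothetical.

```python
def global_ssim(x, y, data_range=1.0):
    """SSIM computed from global statistics of two equally sized pixel lists."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((v - mu_x) ** 2 for v in x) / n
    var_y = sum((v - mu_y) ** 2 for v in y) / n
    cov = sum((p - mu_x) * (q - mu_y) for p, q in zip(x, y)) / n
    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def structure_cost(x, y):
    """Assumed structure-preserving cost: zero for identical images."""
    return 1.0 - global_ssim(x, y)


# Hypothetical 4-pixel "images": the hazy one is a contrast-compressed,
# brightened copy of the clean one (transmission 0.5, airlight 0.8).
clean = [0.1, 0.4, 0.7, 0.9]
hazy = [0.5 * p + 0.4 for p in clean]

cost_same = structure_cost(clean, clean)
cost_hazy = structure_cost(clean, hazy)
```

The cost vanishes when the two inputs are identical and is positive for the hazy copy, so minimizing it pulls a restored image toward the structure of its source.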
Pub Date: 2026-04-09 DOI: 10.1109/TNNLS.2026.3679684
Suping Xu, Chuyi Dai, Ye Liu, Lin Shang, Xibei Yang, Witold Pedrycz
Feature selection is crucial for fuzzy decision systems (FDSs), as it identifies informative features and eliminates rule redundancy, thereby enhancing predictive performance and interpretability. Most existing methods either fail to directly align evaluation criteria with learning performance or rely solely on nondirectional Euclidean distances to capture relationships among decision classes, which limits their ability to clarify decision boundaries. However, the spatial distribution of instances has a potential impact on the clarity of such boundaries. Motivated by this, we propose spatially-aware separability-driven feature selection (S²FS), a novel framework for FDSs guided by a spatially-aware separability criterion. This criterion jointly considers within-class compactness and between-class separation by integrating scalar distances with spatial directional information, providing a more comprehensive characterization of class structures. S²FS employs a forward greedy strategy to iteratively select the most discriminative features. Extensive experiments on 11 real-world datasets demonstrate that S²FS consistently outperforms ten state-of-the-art feature selection algorithms in both classification accuracy and clustering performance, while feature visualizations further confirm the interpretability of the selected features.
Title: S²FS: Spatially-Aware Separability-Driven Feature Selection in Fuzzy Decision Systems. Journal: IEEE Transactions on Neural Networks and Learning Systems.
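The forward greedy strategy mentioned in the abstract above can be sketched with any subset-scoring criterion. The example below substitutes a plain ratio of between-class centroid separation to within-class scatter; it does not include the spatial directional information that the paper's criterion integrates. The names `separability` and `forward_greedy` and the toy dataset are assumptions.

```python
def separability(X, y, feats):
    """Between-class centroid distance divided by within-class scatter on feats."""
    classes = sorted(set(y))
    centroids = {}
    within = 0.0
    for c in classes:
        rows = [[x[f] for f in feats] for x, label in zip(X, y) if label == c]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        centroids[c] = centroid
        within += sum(sum((v - m) ** 2 for v, m in zip(r, centroid)) for r in rows)
    within /= len(X)
    between = 0.0
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            between += sum((p - q) ** 2 for p, q in zip(centroids[a], centroids[b]))
    # Small constant avoids division by zero for perfectly compact classes.
    return between / (within + 1e-12)


def forward_greedy(X, y, k):
    """Add, one at a time, the feature that most improves the subset score."""
    selected, remaining = [], list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: separability(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical data: feature 0 is noise, feature 1 separates the classes
# strongly, feature 2 separates them weakly.
X = [[0.1, 0.0, 0.5], [0.9, 0.1, 0.4], [0.2, 1.0, 0.6], [0.8, 0.9, 0.7]]
y = [0, 0, 1, 1]

chosen = forward_greedy(X, y, 2)
```

Each round rescores the candidate subsets rather than individual features, so a feature is added only if it helps in combination with those already chosen. On the toy data the strongly discriminative feature is picked first and the noise feature is left out.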
Pub Date: 2026-04-09 DOI: 10.1109/tnnls.2026.3669538
Bo-Jian Zhang,Guang-Hai Liu,Zuoyong Li,Shu-Xiang Song
Generating compact and robust feature representations using principal component analysis (PCA) is crucial for image retrieval tasks. However, most existing methods require PCA parameters to be learned from auxiliary datasets, which inevitably increases computational cost and limits generalization. To address this issue, we propose a novel dimensionality reduction learning method, namely multistage PCA whitening (MSPW), for image retrieval. The three main highlights are as follows: 1) we propose a feature self-learning (FSL) method to learn PCA whitening (PW) parameters. This method reconstructs the features of the retrieval dataset via singular value decomposition (SVD) and noise perturbation, which eliminates dependence on auxiliary datasets and alleviates performance degradation in high-dimensional features; 2) unlike existing single-stage dimensionality reduction methods, we introduce an online query self-learning (QSL) method that dynamically learns PCA parameters by incorporating query features, significantly improving retrieval performance with short-vector features; and 3) we propose a feature fusion (FF) method that uses dimensional weighting to balance the contributions of heterogeneous features, thereby enhancing the robustness of features across different dimensions. Experimental results on six benchmark datasets demonstrate that our MSPW method significantly outperforms existing state-of-the-art dimensionality reduction methods. Notably, with 4-D features, MSPW achieves more than 10% relative improvement over the previous best methods in terms of mean average precision (mAP) on two large-scale datasets.
Title: Multistage PCA Whitening: A Robust Method to Dimensionality Reduction in Image Retrieval. Journal: IEEE Transactions on Neural Networks and Learning Systems.
Pub Date: 2026-04-08 DOI: 10.1109/tnnls.2026.3679789
Shengwen Li, Zhouzheng Xu, Renyao Chen, Jiarui Zhu, Yaqin Ye, Shunping Zhou, Hong Yao
Geographic entity representation learning (GERL) is an emerging method that represents natural and administrative divisions, road networks, and points of interest (POIs) in a low-dimensional continuous vector space. By learning representation vectors that capture the semantics and interactions of geographic entities, GERL underpins a variety of intelligent applications. Previous GERL methods mainly focus on entities that are seen during training, and they struggle to generate accurate representation vectors for the growing number of unseen geographic entities that were not involved in model training. To address this issue, this article proposes spatial meta-learning-based representation learning (SMRL), which integrates spatial subgraphs and meta-learning to improve the representation vectors of unseen geographic entities. Specifically, SMRL first designs a spatial-aware subgraph sampling module, based on the attributes and relationships of geographic entities, to divide entities into spatial subgraphs. It then develops a local-level representation module to learn entity features at the subgraph level. Finally, SMRL proposes a meta-learning-driven representation strategy to learn the representations of unseen geographic entities. Extensive experiments show that the proposed SMRL method outperforms baselines in both accuracy and computational efficiency. This study provides new explorations for the representation of unseen geographic entities and offers methodological references for various geographic applications.
Shengwen Li, Zhouzheng Xu, Renyao Chen, Jiarui Zhu, Yaqin Ye, Shunping Zhou, Hong Yao. "Spatial Meta-Learning-Based Representation for Unseen Geographic Entities." IEEE Transactions on Neural Networks and Learning Systems. Publication date: 2026-04-08. DOI: 10.1109/tnnls.2026.3679789