Pub Date: 2026-04-21 | DOI: 10.1109/tnnls.2026.3684128
Jianqi Zhong, Junyu Shi, Wenming Cao
Graph convolutional networks (GCNs) have shown considerable promise in 3-D skeleton-based human motion prediction. Based on the intuitive observation that human motion can be described through the physical interconnections among human joints, many previous works have designed multiscale graphs to learn the relationships and constraints between different graph scales, obtaining encouraging results for human motion prediction. However, these fixed multiscale graphs derive new scales by merging adjacent joint information, ignoring the implicit semantic information of dynamic movements. Furthermore, human joint correlations tend to vary randomly as the depth of the multiscale clustering graph increases, which contradicts the design concept of fixed multiscale graphs. To address these limitations, we propose a novel correlation-based multiscale graph clustering network (CMGC) for adaptive multiscale graph representation learning. Given a human joint graph, CMGC first generates new graphs that adaptively represent motion correlations at different scale levels and then selectively restores the derived scales to the original joint graph, enabling the extraction of diverse motion features. Moreover, we introduce the discrete wavelet transform (DWT) to compensate for the signal loss caused by modeling human motion in the discrete cosine transform (DCT) domain. With its adaptive multiscale graph, CMGC achieves strong performance. Extensive experiments show that CMGC outperforms state-of-the-art methods by 11.2%, 10.1%, and 11.2% in average 3-D mean per joint position error (MPJPE) on the Human3.6M, CMU Mocap, and 3DPW datasets, respectively. We also evaluate the mean angle error (MAE) on Human3.6M, which is 6.5% lower than that of previous methods. Our code is released at https://github.com/JunyuShi02/CMGC.
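The DCT/DWT interplay the abstract describes can be illustrated with a toy trajectory: truncating DCT coefficients discards high-frequency motion detail, and a one-level Haar DWT of the residual isolates exactly that lost band. This is a minimal numeric sketch with made-up data, not the CMGC implementation.

```python
import numpy as np
from scipy.fft import dct, idct

# Toy 1-D joint-coordinate trajectory over 16 frames (hypothetical data).
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 16)
traj = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal(16)

# DCT-domain modeling: keep only the first k low-frequency coefficients.
k = 4
coeffs = dct(traj, norm="ortho")
coeffs[k:] = 0.0
recon_dct = idct(coeffs, norm="ortho")
loss = traj - recon_dct  # the signal lost by DCT truncation

# One-level Haar DWT of the residual: approximation + detail bands.
even, odd = loss[0::2], loss[1::2]
approx = (even + odd) / np.sqrt(2)
detail = (even - odd) / np.sqrt(2)

# The detail band carries the high-frequency content the truncated DCT dropped.
print(np.linalg.norm(loss), np.linalg.norm(detail))
```

The Haar pair is orthonormal, so the residual can be recovered exactly from the two bands — the sense in which a wavelet branch can "compensate" for DCT truncation.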
Title: Multiscale Graph Redefining: Correlation-Based Multiscale Graph Clustering Network for Human Motion Prediction
Journal: IEEE Transactions on Neural Networks and Learning Systems
The performance of deep neural networks (DNNs) in accomplishing tasks heavily relies on feature selection and sparse representation of high-dimensional data. Previous work has treated feature selection and sparse representation as separate mechanisms for improving DNN performance, focusing on identifying and leveraging informative features to enhance task-specific outcomes. However, few studies have established a connection between feature selection and sparse representation. To address this gap, this article proposes an optimization framework termed informative sparse transport (IST), which integrates feature selection and sparse coding into a unified multiobjective optimization framework. Using optimal transport as a bridge, the IST framework harmonizes the relationship between feature selection and sparse representation, offering an informational advantage. In the IST framework, feature selection aims to identify an optimal subset of features to maximize mutual information or minimize redundancy, while sparse representation seeks to approximate data with the fewest possible features. Although these objectives differ, they are fundamentally complementary, as both emphasize extracting task-relevant information while eliminating redundancy. By unifying feature selection and sparse representation, the IST framework effectively mitigates challenges posed by high-dimensional data, delivering a robust solution for enhanced feature extraction and representation. We validate the IST framework on generative and classification tasks, demonstrating that the IST framework improves model performance through the complementary synergy of feature selection and sparse representation.
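Optimal transport as a bridge between two representations can be sketched with a plain Sinkhorn iteration; the `sinkhorn` helper, the toy "feature importance" and "atom usage" histograms, and the random cost matrix below are all hypothetical illustrations, not the IST framework's actual formulation.

```python
import numpy as np

def sinkhorn(cost, a, b, eps=0.5, iters=200):
    """Entropy-regularized OT: alternate scalings until the plan's
    marginals match the histograms a (rows) and b (columns)."""
    K = np.exp(-cost / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(1)
feat_importance = rng.random(5)
feat_importance /= feat_importance.sum()   # mass over selected features
atom_usage = rng.random(4)
atom_usage /= atom_usage.sum()             # mass over sparse-code atoms
cost = rng.random((5, 4))                  # hypothetical pairing cost
plan = sinkhorn(cost, feat_importance, atom_usage)
print(plan)
```

The resulting plan couples the two distributions: each entry says how much "feature mass" is carried onto each atom, which is the kind of bridge the abstract attributes to OT.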
Title: A Deep Neural Network Optimization Framework Based on Optimal Transport Bridge Feature Selection and Sparse Representation
Authors: Guipeng Lan, Shuai Xiao, Jiabao Wen, Jiachen Yang, Wen Lu, Baihua Li, Qinggang Meng, Xinbo Gao
Pub Date: 2026-04-17 | DOI: 10.1109/tnnls.2026.3678220
Journal: IEEE Transactions on Neural Networks and Learning Systems
Pub Date: 2026-04-16 | DOI: 10.1109/tnnls.2026.3682921
Zhiru Yang, Mengmeng Zhang, Junjie Wang, Yunhao Gao, Wenzhi Liao, Wei Li
Hyperspectral target detection (HTD) involves identifying target pixels from complex backgrounds using known or inferred spectral signatures. With advances in hyperspectral imaging technology, HTD has found widespread applications in both military and civilian domains. However, it still faces challenges such as sample imbalance and spectral variability. To address these challenges, we propose a coherent pipeline that couples data, representation, and modeling. First, we develop AdvGMM, which fits a Gaussian mixture model (GMM) to high-confidence target spectra and applies adversarial reweighting against hard backgrounds to synthesize diverse, structurally constrained pseudotargets, thereby alleviating sample scarcity. Building on this, a frequency-domain adaptive fusion and Mamba-based enhanced encoder network (FAME-Net) is proposed to address spectral variability and improve the discriminability between targets and backgrounds. FAME-Net comprises two key modules: a frequency-domain feature adaptive fusion (FDFAF) module that adaptively amplifies information-rich bands and integrates complementary frequency components while preserving the overall reflectance trend; and an efficient Mamba block that captures long-range spectral dependencies, avoids class confusion caused by similar local features, and converts the frequency-enhanced spectra into scalable, robust features. Extensive experiments on six benchmark datasets demonstrate that the proposed method outperforms state-of-the-art approaches under limited supervision, achieving superior detection robustness. The code will be available at https://github.com/Zhiru-Yang/AdvGMM-FAME-Net.
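The pseudotarget-synthesis idea — draw new spectra from a GMM fitted to high-confidence target spectra — can be sketched as follows. The mixture parameters are invented for illustration and the adversarial reweighting step is omitted, so this is a stand-in for the sampling stage only, not AdvGMM itself.

```python
import numpy as np

# Hypothetical GMM parameters, as if fitted to high-confidence target
# spectra over 8 bands (2 mixture components).
rng = np.random.default_rng(2)
n_bands = 8
weights = np.array([0.6, 0.4])                  # mixture weights
means = rng.random((2, n_bands))                # per-component mean spectra
covs = np.stack([0.01 * np.eye(n_bands)] * 2)   # per-component covariances

def sample_pseudotargets(n, weights, means, covs, rng):
    """Draw n pseudotarget spectra: pick a component per sample, then
    sample from that component's Gaussian."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return np.stack([rng.multivariate_normal(means[c], covs[c]) for c in comp])

pseudo = sample_pseudotargets(100, weights, means, covs, rng)
print(pseudo.shape)  # (100, 8)
```

Because every sample is drawn from a component fitted to real target spectra, the synthesized pseudotargets stay structurally constrained while adding diversity — the property the abstract relies on to ease sample scarcity.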
Title: A Dual-Network Framework With Adversarial GMM Augmentation and Frequency-Mamba Fusion for Hyperspectral Target Detection
Journal: IEEE Transactions on Neural Networks and Learning Systems
Graph neural networks (GNNs) have achieved strong results on homophilic graphs with complete node attributes, yet their performance significantly deteriorates when faced with the combined challenges of heterophily and feature missingness. Heterophily introduces semantic inconsistency in neighborhoods, while feature missingness obscures node identity, which together constitute a complex problem we define as the heterophily-missing coupling (HMC). Under HMC, information exchanged between nodes becomes less reliable, and the usual assumptions that support message propagation no longer hold. To address this, we propose a novel adaptive prototype-guided personalized propagation (APP) framework. Specifically, it first leverages semantic rectification via prototypes (SRPs) to align neighborhood information with prototype semantics, reducing noise from inconsistent neighbors. Subsequently, personalized virtual propagation (PVP) builds upon this by clustering to construct prototype-aligned virtual edges, enabling effective feature imputation through minimizing Dirichlet energy across both real and virtual graphs. Finally, adaptive representation synergy (ARS) consolidates the propagated and imputed features by employing prototype-guided confidence weighting and enhancing representation quality via a contrastive training objective. Extensive experiments on multiple benchmark datasets demonstrate that APP consistently improves node classification performance on heterophilic graphs with missing features, achieving up to 11.22% improvement over state-of-the-art baselines while significantly reducing imputation error. The implementation is publicly available at https://github.com/limengran98/APP.
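Minimizing Dirichlet energy with observed features clamped amounts to repeatedly replacing each missing feature with its neighborhood average; the following toy graph (hypothetical, not the PVP module) shows that fixed point emerging.

```python
import numpy as np

# 4-node toy graph; nodes 0 and 2 have observed features, 1 and 3 are missing.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
X = np.array([1.0, 0.0, 2.0, 0.0])      # one feature channel
observed = np.array([True, False, True, False])

deg = A.sum(axis=1)
for _ in range(100):                     # Jacobi-style propagation
    X_new = (A @ X) / deg                # neighborhood average
    X = np.where(observed, X, X_new)     # clamp observed entries

print(X)
```

At convergence each missing entry equals the mean of its neighbors, which is exactly the stationarity condition of Dirichlet-energy minimization with boundary (observed) values held fixed; adding prototype-aligned virtual edges, as the abstract describes, changes which neighborhoods get averaged.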
Title: Adaptive Prototype-Guided Personalized Propagation for Heterophilic Graphs With Missing Data
Authors: Mengran Li, Wenbin Xing, Zelin Zang, Bo Li, Chengyang Zhang, Yong Zhang, Junzhou Chen, Ronghui Zhang, Yongfu Li, Chuan Hu, Xiaolei Ma, Zibin Zheng
Pub Date: 2026-04-14 | DOI: 10.1109/tnnls.2026.3676197
Journal: IEEE Transactions on Neural Networks and Learning Systems
Object state changes (OSCs) play a critical role in video understanding, as they focus on localizing the stages of state transitions within temporal sequences. However, existing methods face two key challenges in open-world scenarios. First, there is a significant background-causal scene imbalance due to dataset bias, which leads to reliance on irrelevant features and degrades prediction capability. Second, existing methods generalize poorly to unseen objects. They typically focus on a single state change of a specific object, which limits their ability to understand the state changes of unseen objects in a generalized way, as humans do. To address these challenges, we first introduce a structural causal model (SCM) to formally structure the OSC task, which explicitly defines the confounding effect of dataset bias and the lack of generalization. Guided by this SCM, we propose CCI-Net, a causal counterfactual inference-based video OSC neural network. CCI-Net employs a causal inference network for backdoor adjustment to effectively eliminate confounders. In addition, it integrates counterfactual inference to enhance understanding in open-world scenarios. Specifically, CCI-Net comprises two key components: the backdoor scene classifier (BSC) and the counterfactual module (CM). The BSC controls potential confounders and mitigates spurious correlations. The CM enhances generalization to unseen objects and their state changes by constructing counterfactual scenes during training. Furthermore, we design two loss functions for causal and counterfactual scenes to optimize the learning process. Experimental results on three benchmark datasets demonstrate that, compared with existing methods, CCI-Net significantly improves both precision and generalization in open-world scenarios.
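The backdoor adjustment that the BSC relies on can be made concrete on a binary toy example: conditioning lets the scene variable z shift with X, while do(X) holds z at its prior, giving P(Y|do(X)) = Σ_z P(Y|X, z) P(z). All probabilities below are hypothetical.

```python
import numpy as np

p_z = np.array([0.7, 0.3])                  # P(z): background-scene prior
p_x_given_z = np.array([0.9, 0.2])          # P(X=1 | z)
p_y_given_xz = np.array([[0.1, 0.5],        # P(Y=1 | X=0, z)
                         [0.4, 0.8]])       # P(Y=1 | X=1, z)

# Observational P(Y=1 | X=1): z is inferred from X, confounding the estimate.
p_xz = p_x_given_z * p_z
p_z_given_x1 = p_xz / p_xz.sum()
p_y_obs = (p_y_given_xz[1] * p_z_given_x1).sum()

# Interventional P(Y=1 | do(X=1)): z keeps its marginal prior (backdoor).
p_y_do = (p_y_given_xz[1] * p_z).sum()

print(p_y_obs, p_y_do)
```

The two numbers differ because observing X=1 makes the common scene more likely, dragging the estimate toward scene-correlated outcomes; the adjustment removes exactly that spurious correlation.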
Title: Causal Counterfactual Inference Network for Video Object State Changes in Open-World Scenarios
Authors: Zhichao Wang, Shucheng Huang, Mingxing Li, Yifan Jiao
Pub Date: 2026-04-14 | DOI: 10.1109/tnnls.2026.3678945
Journal: IEEE Transactions on Neural Networks and Learning Systems
Graph neural networks (GNNs) have excelled in handling graph-structured data, attracting significant research interest. However, two primary challenges have emerged: interference between topology and attributes distorting node representations, and the low-pass filtering nature of most GNNs leading to the oversight of valuable high-frequency information in graph signals. These issues are particularly pronounced in heterophilic graphs. To address these challenges, we propose attribute-topology cross-frequency aligned (ATCFA) GNNs. ATCFA combines low- and high-pass filters to capture both smooth and detailed representations from topological and attribute perspectives. It also enforces frequency-specific constraints to reduce noise and redundancy in each frequency band. The model can dynamically adjust the filtering ratios for both homophilic and heterophilic graphs. Crucially, ATCFA establishes dynamic associations between corresponding frequency components of topology and attribute, achieving systematic alignment and interactive fusion that explicitly mitigates interference and promotes complementary information utilization across domains. Extensive experiments on standard datasets show that ATCFA delivers higher classification accuracy than state-of-the-art methods, proving its capability to handle both homophilic and heterophilic graphs in node classification.
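The low-/high-pass split that ATCFA builds on can be sketched with the symmetric normalized Laplacian L: applying I - L averages neighbors (smooth, low-frequency part) while L keeps neighbor differences (detailed, high-frequency part), and the two bands sum back to the input. Toy graph and features, not the ATCFA filters themselves.

```python
import numpy as np

# Path graph 0-1-2 with one feature channel per node.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_norm = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
L = np.eye(3) - A_norm                 # symmetric normalized Laplacian

X = np.array([[1.0], [0.0], [1.0]])    # alternating (high-frequency) signal
X_low = (np.eye(3) - L) @ X            # low-pass: neighborhood averaging
X_high = L @ X                         # high-pass: neighborhood differences

print(X_low.ravel(), X_high.ravel())
```

On a heterophilic pattern like this alternating signal, most of the energy lands in the high-pass band, which is why a purely low-pass GNN overlooks it.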
Title: Attribute-Topology Cross-Frequency Aligned Graph Neural Networks for Homophilic and Heterophilic Graphs in Node Classification
Authors: Yachao Yang, Yanfeng Sun, Jipeng Guo, Jinlu Wang, Shaofan Wang, Junbin Gao, Fujiao Ju, Baocai Yin
Pub Date: 2026-04-13 | DOI: 10.1109/tnnls.2026.3678135
Journal: IEEE Transactions on Neural Networks and Learning Systems
Image dehazing is crucial for enhancing image visibility and mitigating weather degradations. However, most existing approaches rely on paired hazy and clean images, which are challenging to obtain in real-world scenarios. To this end, we propose an oriented Bayesian-regularized consistent optimal transport (OBCOT) framework, which formulates the unpaired image dehazing task as an optimal transport (OT) problem. Specifically, we introduce a structure-preserving transport cost, incorporating a structural similarity (SSIM) constraint to minimize the duality gap between the primal and dual formulations while preserving the structural details of reconstructed images. Furthermore, we derive a Bayesian frequency-domain regularization (BFR) to balance spectral consistency with clean references against repulsion from hazy patterns. In addition, we employ a pretrained one-step stable diffusion model as the restoration network, fine-tuned with low-rank adaptation (LoRA) adapters and zero convolutional layers, while integrating domain-specific text prompts for both degraded and clean images to guide the generation process. Extensive experiments demonstrate that our method surpasses existing well-performing unpaired learning approaches, achieving notable improvements in both fidelity and photo-realism.
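A structure-preserving transport cost of the kind described — pixel fidelity plus an SSIM penalty — can be sketched with a simplified single-window SSIM (no sliding windows or Gaussian weighting), so this is only a stand-in for the OBCOT cost, with a toy haze model:

```python
import numpy as np

def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Simplified single-window SSIM on [0, 1] images: luminance,
    contrast, and structure computed over the whole image at once."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx**2 + my**2 + c1) * (vx + vy + c2))

rng = np.random.default_rng(3)
clean = rng.random((16, 16))
hazy = 0.6 * clean + 0.4          # toy haze: contrast loss plus airlight

# Hypothetical structure-preserving cost: MSE fidelity + SSIM penalty.
transport_cost = np.mean((clean - hazy) ** 2) + (1.0 - ssim_global(clean, hazy))
print(transport_cost)
```

The SSIM term vanishes only when the structural statistics match, so minimizing this cost pushes the transport map toward outputs that preserve edges and textures, not just per-pixel values.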
Title: When Optimal Transport Meets Photo-Realistic Image Dehazing With Unpaired Training
Authors: Yuanbo Wen, Tao Gao, Shan Liang, Dena Zhang, Ziqi Li, Jing Qin, Ting Chen
Pub Date: 2026-04-09 | DOI: 10.1109/tnnls.2026.3673760
Journal: IEEE Transactions on Neural Networks and Learning Systems