Pub Date : 2026-04-22 DOI: 10.1109/tnnls.2026.3684886
Binbin Huang, Teng Bao, Feiyi Chen, Lingbin Wang, Xunqing Huang, Yuyu Yin, Xiaoying Shi, Shangguang Wang, Shuiguang Deng
The growth of large models demands multinode cooperation during training and inference. Computing node failures can interrupt these processes, causing information loss and prolonging execution time. Accurate prediction of computing node failures is therefore vital: it helps avert the prohibitively large overhead, service interruptions, and negative customer experiences that such failures incur. Existing solutions mainly apply state-of-the-art time-series models to improve prediction performance. However, they cannot capture the causal relationship between device over-utilization and node failures, and they fail to extract the complex spatial-temporal cascading correlations among computing node failure events. Both limitations can degrade prediction performance. To address these problems, this article designs a continuous-time dynamic graph-based computing node failure prediction (CTDG-NFP) scheme that accurately predicts node failures in dynamic cluster environments. Specifically, the CTDG-NFP scheme first designs a novel multidimensional feature-biased neighbor sampling method, which jointly considers CPU utilization, memory utilization, temporal, and spatial biases, to sample relevant context. Then, it extracts diverse computing node failure motifs using a multidimensional feature-biased long-short-path walk method and a set-based anonymization method. Finally, it adopts a time encoder to encode these motifs, thereby extracting the complex spatial-temporal correlations among computing node failure events. On this basis, contrastive learning is adopted to train the computing node failure prediction model. Extensive evaluations on various real-world failure traces demonstrate that the CTDG-NFP scheme achieves superior performance on six widely used metrics compared with state-of-the-art (SOTA) node failure prediction methods.
Title: Computing Node Failure Prediction Based on Continuous-Time Dynamic Graph
Journal: IEEE Transactions on Neural Networks and Learning Systems
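To make the feature-biased neighbor sampling idea in the abstract above concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes each past interaction carries a timestamp, CPU and memory utilization in [0, 1], and a hop distance, and it folds these into a softmax sampling probability. The `Interaction` record, the functions `sampling_probs` and `sample_neighbors`, and the four bias coefficients are hypothetical names and values chosen for illustration; the abstract does not specify the actual weighting.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One past interaction with a neighboring node (hypothetical record)."""
    neighbor: str     # id of the neighboring computing node
    timestamp: float  # time at which the interaction was observed
    cpu_util: float   # CPU utilization in [0, 1] at that time
    mem_util: float   # memory utilization in [0, 1] at that time
    hops: int         # topological distance to the query node


def sampling_probs(history, t_query, a_cpu=1.0, a_mem=1.0, a_time=0.1, a_space=0.5):
    """Return (past_interactions, probabilities) for a query at time t_query.

    Assumed scoring rule: higher utilization raises the score, while
    older and more distant interactions lower it. Scores are turned
    into probabilities with a numerically stable softmax.
    """
    # Only interactions strictly before the query time are eligible.
    past = [e for e in history if e.timestamp < t_query]
    if not past:
        return [], []
    scores = [
        a_cpu * e.cpu_util
        + a_mem * e.mem_util
        - a_time * (t_query - e.timestamp)
        - a_space * e.hops
        for e in past
    ]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return past, [w / total for w in weights]


def sample_neighbors(history, t_query, k, seed=0):
    """Draw k interactions (with replacement) according to the biased probabilities."""
    past, probs = sampling_probs(history, t_query)
    if not past:
        return []
    rng = random.Random(seed)
    return rng.choices(past, weights=probs, k=k)
```

With these illustrative coefficients, a recent, heavily loaded neighbor is drawn more often than an old, lightly loaded one, and interactions that occur after the query time are never sampled.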
Pub Date : 2026-04-22 DOI: 10.1109/tnnls.2026.3683398
Ruinan Jin, Minghui Chen, Qiong Zhang, Xiaoxiao Li
Title: Forgettable Federated Linear Learning With Certified Data Unlearning
Journal: IEEE Transactions on Neural Networks and Learning Systems
Pub Date : 2026-04-22 DOI: 10.1109/tnnls.2026.3680732
Zhili Zhao, Li Wan, Xupeng Liu, Ruiyi Yan, Shaomeng Wang
In node classification, traditional graph neural networks (GNNs) typically assume implicit homophily, i.e., that nodes of the same class are likely to be connected. However, real-world graphs frequently exhibit heterophily, in which nodes of different classes are also commonly connected. To address this challenge, recent methods have adopted approaches such as expanding local neighborhoods and employing adaptive message aggregation to enhance GNN performance on heterophily graphs. Nevertheless, these methods remain restricted by the homophily assumption: they fail to capture long-range dependencies (e.g., widely separated intraclass nodes) effectively and make insufficient use of the graph topology. This study investigates the performance differences of GNNs when applied to homophily and heterophily graphs and finds that the distinguishability of neighborhood label distributions (NLDs) exhibits a significant correlation with the accuracy of node classification. To assess the impact of NLD on node classification, this study proposes a novel homophily metric based on node distinguishability. Subsequently, this study introduces a new GNN model named NLD-based GNN (NLDGNN) for node classification. First, NLDGNN initializes node representations by integrating node features with node NLDs. To address long-range dependencies in heterophily graphs, NLDGNN utilizes a global label relationship matrix with low-rank characteristics for global message passing. By combining the attention scores derived from the initial node representations, NLDGNN constructs the global label relationship matrix for enhanced message passing, thereby improving the expressiveness of node representations. Experimental results indicate that NLDGNN outperforms existing GNN models on both real-world homophily and heterophily graphs. The code of this study is available at https://github.com/wanli6/NLDGNN.
Title: Node Classification in GNNs: Impact of Neighborhood Label Distribution on Homophily and Heterophily
Journal: IEEE Transactions on Neural Networks and Learning Systems
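The notion of a neighborhood label distribution (NLD) used in the abstract above can be illustrated with a short Python sketch. It assumes the graph is given as an adjacency dictionary and that labels are integers in `range(num_classes)`. The function names are hypothetical. The second function computes the standard edge homophily ratio for comparison; it is not the distinguishability-based metric the study proposes, which the abstract does not define.

```python
from collections import Counter


def neighborhood_label_distribution(adj, labels, num_classes):
    """Map each node to the normalized histogram of its neighbors' labels."""
    nld = {}
    for node, neighbors in adj.items():
        counts = Counter(labels[n] for n in neighbors)
        degree = len(neighbors)
        # Isolated nodes get an all-zero distribution.
        nld[node] = [
            counts.get(c, 0) / degree if degree else 0.0
            for c in range(num_classes)
        ]
    return nld


def edge_homophily(adj, labels):
    """Fraction of (directed) edges whose endpoints share a label."""
    same = 0
    total = 0
    for u, neighbors in adj.items():
        for v in neighbors:
            total += 1
            same += int(labels[u] == labels[v])
    return same / total if total else 0.0


# A 4-cycle with alternating labels: every edge joins different classes.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
labels = {0: 0, 1: 1, 2: 0, 3: 1}
nld = neighborhood_label_distribution(adj, labels, num_classes=2)
ratio = edge_homophily(adj, labels)
```

In this toy graph the edge homophily ratio is zero, yet the two classes have clearly different NLDs (class 0 nodes see only class 1 neighbors and vice versa), which illustrates why distinguishable NLDs can matter even on a heterophilous graph.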
Pub Date : 2026-04-21 DOI: 10.1109/tnnls.2026.3683573
Ghalya Alwhishi, Jamal Bentahar, Amine Andam, Ahmed Elwhishi, Mustapha Hedabou
Formal verification using temporal logics such as computation tree logic (CTL) is essential for validating safety and correctness in complex systems. However, traditional model-checking techniques face severe scalability limitations due to the state explosion problem and their reliance on exhaustive symbolic traversal. Moreover, existing learning-based verification methods often lack formal guarantees and interpretability. These challenges create a pressing need for scalable, learning-based verification methods that preserve verification reliability while improving computational efficiency. This article introduces a novel deep reinforcement learning (DRL)-based model-checking framework that learns to verify CTL formulas directly through interaction with system models. Unlike traditional symbolic model checkers such as NuSMV, the proposed DRL-CTL checker, trained using proximal policy optimization (PPO), interprets CTL semantics over system models represented as Kripke structures without performing symbolic state-space traversal at inference time. Reward functions are designed for individual CTL operators, and fixed-point reasoning is incorporated to handle global temporal properties such as $AG(\phi)$ and $EG(\phi)$. Experimental results show that the proposed method achieves a near-constant inference time of approximately 2 ms per formula on a machine with an Intel Core i9-13900K CPU (24 cores, 3.0 GHz), 64 GB of RAM, and an NVIDIA RTX 4090 GPU (24 GB VRAM); reduces verification time by up to 90% compared with traditional model checkers; and scales to models with more than $10^{1192}$ reachable states. The framework also produces witnesses and counterexamples and yields verification outcomes identical to those of symbolic checkers in our experiments. These results highlight the potential of DRL to serve as a scalable, efficient, and explainable alternative to classical CTL model checking.
Title: Scalable and Efficient Deep Reinforcement Learning-Based Model Checker for Computation Tree Logic
Journal: IEEE Transactions on Neural Networks and Learning Systems
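For readers unfamiliar with the fixed-point view of the global operators mentioned in the abstract above, the following Python sketch shows the textbook explicit-state computation: EG(phi) is the greatest set Z with Z = phi AND EX Z, and AG(phi) is the greatest set Z with Z = phi AND AX Z. This is the classical algorithm, shown only to clarify what the operators mean; it is not the DRL-based checker described in the article. The function names and the toy Kripke structure are illustrative assumptions, and the transition relation is assumed to be total (every state has at least one successor).

```python
def eg(succ, sat_phi):
    """States satisfying EG(phi): some path stays inside phi forever."""
    z = set(sat_phi)
    while True:
        # Keep a state only if at least one successor is still in z.
        nxt = {s for s in z if any(t in z for t in succ[s])}
        if nxt == z:
            return z
        z = nxt


def ag(succ, sat_phi):
    """States satisfying AG(phi): every path stays inside phi forever."""
    z = set(sat_phi)
    while True:
        # Keep a state only if all successors are still in z.
        nxt = {s for s in z if all(t in z for t in succ[s])}
        if nxt == z:
            return z
        z = nxt


# Toy Kripke structure: successor lists per state, phi holds in 0, 1 and 3.
succ = {0: [1], 1: [0, 2], 2: [2], 3: [3]}
phi_states = {0, 1, 3}
eg_states = eg(succ, phi_states)
ag_states = ag(succ, phi_states)
```

Here states 0 and 1 satisfy EG(phi) through the cycle 0, 1, 0, ..., but not AG(phi), because the path 0, 1, 2 reaches a state where phi fails; that path is exactly the kind of counterexample a checker would report.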
Pub Date : 2026-04-21 DOI: 10.1109/tnnls.2026.3675892
Leilei Cui, Zhong-Ping Jiang, Petter N. Kolm, Grégoire G. Macqueron
Title: A Fully Data-Driven Value Iteration for Stochastic LQR: Convergence, Robustness, and Stability
Journal: IEEE Transactions on Neural Networks and Learning Systems
Pub Date : 2026-04-21 DOI: 10.1109/tnnls.2026.3684128
Jianqi Zhong, Junyu Shi, Wenming Cao
Graph convolutional networks (GCNs) have shown considerable promise in 3-D skeleton-based human motion prediction. Based on the intuitive observation that human motion can be described through the physical interconnections among human joints, many previous works have designed multiscale graphs to learn the relationships and constraints between different graph scales, obtaining encouraging results for human motion prediction. However, these fixed multiscale graphs obtain new scale graphs by merging adjacent human joint information, ignoring implicit semantic information during dynamic movements. Furthermore, human joint correlations tend to vary randomly as the depth of the multiscale clustering graph increases, which contradicts the design concept of fixed multiscale graphs. To address these limitations, we explore a novel correlation-based multiscale graph clustering network (CMGC) for adaptive multiscale graph representation learning. Given a human joint graph, CMGC first adaptively generates new graphs representing motion correlations at different scale levels and then selectively restores the derived graph scales to the original human joint graph, which enables the extraction of diverse motion features. Moreover, we introduce the discrete wavelet transform (DWT) to compensate for the signal loss caused by modeling human motion in the discrete cosine transform (DCT) domain. With the adaptive multiscale graph, CMGC achieves strong performance. Extensive experiments reveal that CMGC outperforms state-of-the-art methods by 11.2%, 10.1%, and 11.2% in 3-D mean per joint position error (MPJPE) on average on the Human3.6M, CMU Mocap, and 3DPW datasets, respectively. We also test the mean angle error (MAE) on Human3.6M, which is 6.5% lower than that of previous methods. Our code is released at https://github.com/JunyuShi02/CMGC.
Title: Multiscale Graph Redefining: Correlation-Based Multiscale Graph Clustering Network for Human Motion Prediction
Journal: IEEE Transactions on Neural Networks and Learning Systems
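The MPJPE metric reported in the abstract above is commonly computed as the Euclidean distance between predicted and ground-truth joint positions, averaged over all joints and frames. A minimal Python sketch follows. It assumes poses are nested lists of frames, each a list of (x, y, z) joint coordinates; the function name and the toy data are illustrative, and the article's exact evaluation protocol (units, frame selection, alignment) is not given in the abstract.

```python
import math


def mpjpe(pred, target):
    """Mean per joint position error over all frames and joints.

    pred, target: lists of frames; each frame is a list of (x, y, z)
    joint coordinates in the same order and the same units.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of frames")
    total = 0.0
    count = 0
    for frame_pred, frame_true in zip(pred, target):
        if len(frame_pred) != len(frame_true):
            raise ValueError("frames must have the same number of joints")
        for joint_pred, joint_true in zip(frame_pred, frame_true):
            total += math.dist(joint_pred, joint_true)
            count += 1
    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


# One frame with two joints, displaced by 3 and 4 units respectively.
pred = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
target = [[(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]]
error = mpjpe(pred, target)
```

In the toy example the two joint errors are 3 and 4, so the mean is 3.5 in whatever unit the coordinates use.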
Pub Date : 2026-04-17 DOI: 10.1109/tnnls.2026.3678220
Guipeng Lan, Shuai Xiao, Jiabao Wen, Jiachen Yang, Wen Lu, Baihua Li, Qinggang Meng, Xinbo Gao
The performance of deep neural networks (DNNs) relies heavily on feature selection and sparse representation of high-dimensional data. Previous work has treated feature selection and sparse representation as separate mechanisms for improving DNN performance, focusing on identifying and leveraging informative features to enhance task-specific outcomes. However, few studies have established a connection between the two. To address this gap, this article proposes an optimization framework termed informative sparse transport (IST), which integrates feature selection and sparse coding into a unified multiobjective optimization framework. Using optimal transport as a bridge, the IST framework harmonizes the relationship between feature selection and sparse representation, offering an informational advantage. In the IST framework, feature selection aims to identify an optimal subset of features that maximizes mutual information or minimizes redundancy, while sparse representation seeks to approximate data with the fewest possible features. Although these objectives differ, they are fundamentally complementary, as both emphasize extracting task-relevant information while eliminating redundancy. By unifying feature selection and sparse representation, the IST framework effectively mitigates challenges posed by high-dimensional data, delivering a robust solution for enhanced feature extraction and representation. We validate the IST framework on generative and classification tasks, demonstrating that it improves model performance through the complementary synergy of feature selection and sparse representation.
Title: A Deep Neural Network Optimization Framework Based on Optimal Transport Bridge Feature Selection and Sparse Representation
Journal: IEEE Transactions on Neural Networks and Learning Systems
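The abstract above names optimal transport as the bridge but does not say how the transport problem is posed or solved. As general background only, the sketch below computes an entropy-regularized transport plan between two small discrete distributions with Sinkhorn iterations, a widely used solver. It is a swapped-in illustration, not the IST formulation: the function name, the regularization strength `eps`, the iteration count, and the toy cost matrix are all assumptions.

```python
import math


def sinkhorn(a, b, cost, eps=0.5, iters=200):
    """Entropy-regularized optimal transport plan between histograms a and b.

    a, b: source and target masses, each summing to 1.
    cost: cost[i][j] of moving one unit of mass from bin i to bin j.
    Returns the plan P with row sums close to a and column sums equal to b.
    """
    n, m = len(a), len(b)
    # Gibbs kernel of the cost matrix.
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy problem: moving mass within a bin is free, moving across bins costs 1.
a = [0.7, 0.3]
b = [0.4, 0.6]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(a, b, cost)
```

The resulting plan keeps most mass on the zero-cost diagonal and moves only the surplus of the first bin to the second, which is the basic behavior any transport-based coupling between two feature distributions relies on.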