Pub Date: 2026-01-16 DOI: 10.1109/TBDATA.2026.3652336
Title: 2025 Reviewers List*
Journal: IEEE Transactions on Big Data, vol. 12, no. 1, pp. 301-306
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11357242
Pub Date: 2025-12-04 DOI: 10.1109/TBDATA.2025.3639954
Xiaoxuan Zhang;Xiujuan Lei;Ling Guo;Ming Chen;Fang-Xiang Wu;Yi Pan
MicroRNAs (miRNAs) regulate a wide range of biological functions and are key players in the development of many complex human diseases, making them novel therapeutic targets for drug development. Given the high cost and time demands of traditional experimental methods, it is essential to develop efficient computational approaches for predicting miRNA-drug interactions (MDIs). This article presents SSMDI, a dual-channel learning framework for predicting MDIs that is based on structural features and a Signed Bipartite Graph Neural Network (SBGNN). First, a Graph Isomorphism Network (GIN) is employed to extract molecular graph features of drugs. Meanwhile, a combined framework of a Convolutional Neural Network (CNN), a Bidirectional Long Short-Term Memory (BiLSTM) network, and a self-attention mechanism is used to capture sequence features of miRNAs. Compared with traditional networks, signed networks can convey richer semantic information about drugs and miRNAs. SBGNN is therefore used to aggregate and update the signed topological features of miRNAs and drugs. Finally, structural and signed topological features are integrated to predict MDIs. The predictive performance of the model is evaluated using 5-fold cross-validation (CV), achieving an AUC of 0.9447 and an AUPR of 0.9238. A case study further demonstrates the effectiveness of SSMDI in predicting MDIs. In summary, SSMDI proves to be an accurate tool for predicting MDIs, with significant implications for drug development and miRNA-based therapeutic research.
Title: Dual-Channel Learning Framework for miRNA-Drug Interaction Prediction Based on Structural Features and Signed Bipartite Graph Neural Network
Journal: IEEE Transactions on Big Data, vol. 12, no. 2, pp. 688-701
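The SSMDI abstract reports AUC and AUPR under 5-fold cross-validation. The sketch below shows one common way to compute these quantities in plain Python. It is an illustration only: the function names are hypothetical, AUPR is estimated here by average precision (the paper may use a different estimator), and the fold split is a generic shuffled split rather than the authors' protocol.

```python
import random

def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Average precision, a step-wise estimate of the area under the precision-recall curve."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / hits

def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k shuffled folds over range(n)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]
```

In a full evaluation, a model would be trained on each `train` split, scored on the matching `test` split, and the per-fold metrics averaged.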
Pub Date: 2025-12-04 DOI: 10.1109/TBDATA.2025.3640011
Cong Wang;Yang Luo;Ke Wang;Hui Zhang;Naijie Gu;Ran Zhang;Wenzhuo Du;Fan Yu;Jun Yu
The rapid growth of deep learning models and the increasing demand for large-scale datasets pose unprecedented challenges for data loading and memory management. Existing frameworks (e.g., PyTorch, TensorFlow) often encounter performance bottlenecks when handling large datasets, resulting in inefficiencies and excessive memory usage. To address these issues, we propose Lafa, a dynamic metadata loading mechanism optimized for efficient large-scale dataset processing. Lafa introduces the .Lafa format and an adaptive loading strategy with three modes to balance memory usage and loading performance, along with a local shuffle approach that reduces memory overhead and computational complexity while preserving data randomness. Experimental results on GPU (RTX 3090) and Ascend (910A) platforms demonstrate that Lafa significantly improves memory efficiency compared to existing frameworks. Specifically, for every 10 million samples loaded, Lafa reduces additional memory consumption by a factor of 1.33× to 31.34× across various dataset types, relative to the most memory-efficient baseline among PyTorch, TensorFlow, and MindSpore.
Title: Lafa: Unlocking Superior Memory Efficiency via Adaptive Metadata Strategy for Scalable Large-Scale Dataset Loading
Journal: IEEE Transactions on Big Data, vol. 12, no. 2, pp. 674-687
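The Lafa abstract does not spell out how its local shuffle works. One generic way to get randomness without materializing a full permutation is to shuffle block order and then shuffle inside each block, so only one block of indices is held in memory at a time. The sketch below illustrates that idea; `local_shuffle`, `block_size`, and `seed` are hypothetical names, and this is not presented as Lafa's implementation.

```python
import random

def local_shuffle(num_samples, block_size, seed=0):
    """Yield every sample index once, shuffled within blocks and across block order."""
    rng = random.Random(seed)
    starts = list(range(0, num_samples, block_size))
    rng.shuffle(starts)  # randomize the order in which blocks are visited
    for start in starts:
        block = list(range(start, min(start + block_size, num_samples)))
        rng.shuffle(block)  # randomize order inside the current block only
        yield from block
```

The trade-off is that samples from the same block always appear close together, so a larger `block_size` gives better mixing at the cost of more memory.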
Pub Date: 2025-12-04 DOI: 10.1109/TBDATA.2025.3639917
Mohammad Maksood Akhter;Abdul Atif Khan;Rashmi Maheshwari;Sraban Kumar Mohanty
With the exponential growth of Big Data in domains such as healthcare, genomics, and sensor networks, computationally efficient and effective clustering techniques have become essential for uncovering meaningful patterns. Traditional clustering methods face fundamental limitations in Big Data analysis. K-means is among the fastest known approaches, but it fails to capture non-spherical clusters. Hierarchical clustering can detect arbitrary shapes but suffers from sub-cubic complexity, while many state-of-the-art methods still incur quadratic complexity. Moreover, most existing approaches fail to capture the intrinsic structure of data. In this context, graph-based clustering has emerged as a powerful alternative due to its ability to model geometric relationships and reveal underlying structures. However, existing graph-based techniques typically incur quadratic complexity, limiting their scalability. The objective of this work is to develop a scalable graph-based clustering framework that reduces complexity while preserving clustering quality in large, noisy, and high-dimensional datasets. To achieve this, we propose a fast graph clustering framework with overall complexity $\mathcal{O}(N \lg N)$, where $N$ denotes the number of data points. The method employs a two-stage dispersion-based partitioning to generate cohesive sub-clusters, followed by the construction of a sparse graph on sub-cluster centers to efficiently capture adjacency. Sub-clusters are then merged iteratively using a gravitational-force-inspired attraction model, enabling the discovery of coherent structures with reduced computation. Extensive experiments on 41 multi-scale datasets demonstrate that our method consistently outperforms traditional and state-of-the-art approaches, achieving on average 27.33% higher clustering accuracy while reducing runtime by more than 86.64%. These results highlight both the innovation and the effectiveness of the proposed approach, making it highly suitable for Big Data analytics.
Title: A Fast Linearithmic Graph Clustering Approach for Big Data Using Gravitational Attraction Principle
Journal: IEEE Transactions on Big Data, vol. 12, no. 2, pp. 661-673
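To make the gravitational merging idea concrete, the sketch below scores each edge of a sparse graph over sub-cluster centers with a gravity-like attraction m_i * m_j / d^2 and merges the most strongly attracted pairs until k groups remain. This is a one-pass simplification under stated assumptions: the score, the use of point counts as masses, and the function name are all hypothetical, and the paper's iterative update of merged clusters is not described in the abstract.

```python
def merge_by_attraction(centers, masses, edges, k):
    """Greedily merge sub-clusters along graph edges by attraction until k groups remain.

    centers: list of coordinate tuples (assumed distinct), masses: points per sub-cluster,
    edges: (i, j) pairs of the sparse graph. Returns a group label per sub-cluster.
    """
    label = list(range(len(centers)))

    def attraction(i, j):
        d2 = sum((a - b) ** 2 for a, b in zip(centers[i], centers[j]))
        return masses[i] * masses[j] / d2

    groups = len(centers)
    for i, j in sorted(edges, key=lambda e: -attraction(*e)):
        if groups == k:
            break
        if label[i] != label[j]:
            old, new = label[j], label[i]
            label = [new if l == old else l for l in label]  # relabel the absorbed group
            groups -= 1
    return label
```

Because only graph edges are scored, the number of attraction evaluations scales with the edge count of the sparse graph rather than with all pairs of sub-clusters.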
Pub Date: 2025-10-13 DOI: 10.1109/TBDATA.2025.3621144
Yali Feng;Zhifeng Hao;Wen Wen;Ruichu Cai
Temporal recommendation is an important class of tasks in recommender systems, which focuses on modeling and capturing temporal patterns in user behavior to achieve finer-grained and higher-quality recommendations. In real-world scenarios, users’ temporal behaviors are characterized not only by sequential dependencies among consecutive items, but also by periodic correlations among different items and time-varying similarity between different users. In this paper, we propose an Adaptive Temporal Recommendation (AdaTR) algorithm to capture the inherent features of temporal behaviors and dynamic collaborative signals. First, based on the periodic characteristics of user behaviors, the user-item interactions are counted and aggregated in different time segments across multiple periods, forming the temporal user-item interaction matrix. Then, to capture the time-varying collaborative signals between different users, a deep spectral clustering (DSC) method is applied to the temporal user-item interaction matrix: the original representation of user-item interactions is projected into a latent space, and users’ temporal behaviors are clustered into different groups. Furthermore, an Adaptive Deep Matrix Factorization (AdaDMF) module is designed to learn the time-varying representations of user preferences on each cluster of temporal user behaviors, which incorporate dynamic collaborative signals among different users. Finally, we combine users’ short-term and long-term preferences to generate personalized temporal recommendations. Extensive experiments on four datasets demonstrate that AdaTR performs significantly better than the state-of-the-art baselines.
Title: Temporal Recommendation Based on Adaptive Deep Matrix Factorization
Journal: IEEE Transactions on Big Data, vol. 12, no. 1, pp. 288-300
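The first AdaTR step, counting interactions per time segment across multiple periods, can be illustrated by folding timestamps onto a single period. The sketch below assumes a 24-hour period split into equal segments and events given as (user, item, hour) tuples; these choices and the function name are assumptions for illustration, not details taken from the paper.

```python
from collections import Counter

def temporal_interaction_counts(events, period=24, num_segments=4):
    """Count interactions per (user, segment, item), folding all periods onto one."""
    seg_len = period / num_segments
    counts = Counter()
    for user, item, t in events:
        segment = int((t % period) // seg_len)  # position of t within its period
        counts[(user, segment, item)] += 1
    return counts
```

Each (user, segment) pair then indexes one row of a temporal user-item matrix, so the same user contributes several rows, one per time segment.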
Pub Date: 2025-10-07 DOI: 10.1109/TBDATA.2025.3618453
Jiankai Zuo;Zihao Yao;Yaying Zhang
Next POI recommendation plays a crucial role in delivering personalized location-based services, but it faces significant challenges in capturing complex user behavior and adapting to dynamic interest distributions. Most methods insufficiently model implicit features in user trajectories, such as directional transitions and latent edge relationships, which are essential for understanding user behavior. Moreover, existing diffusion models, constrained by Gaussian priors, struggle to handle the diverse and evolving nature of user preferences. The lack of unified scheduling for noise and sampling also limits the flexibility of diffusion models. In this paper, we propose a Unified Bridge-based Diffusion model (UB-Diff) for next POI recommendation. UB-Diff incorporates direction-aware POI transition graph learning, which jointly captures spatio-temporal and directional features. To overcome the limitations of Gaussian priors, we introduce a bridge-based diffusion POI generative model. It translates the user’s historical distribution into the target distribution by learning a bridge that associates user behavior with POI recommendation, thereby adapting to dynamic user interests. Finally, we design a novel intermediate function to unify the diffusion process, enabling precise control over noise scheduling and modular optimization. Extensive experiments on five real-world datasets demonstrate the superiority of UB-Diff over advanced baseline methods.
Title: Bridging User Dynamic Preferences: A Unified Bridge-Based Diffusion Model for Next POI Recommendation
Journal: IEEE Transactions on Big Data, vol. 12, no. 1, pp. 261-275
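The key difference between a bridge and a standard diffusion process is that a bridge is pinned at both ends, so it connects two data distributions instead of ending in Gaussian noise. The sketch below samples from a textbook Brownian bridge between a source vector `x0` and a target vector `x1`. It is shown only to illustrate the concept: the abstract does not specify UB-Diff's bridge or its intermediate function, and `sigma` is a hypothetical noise scale.

```python
import math
import random

def bridge_sample(x0, x1, t, sigma=1.0, rng=random):
    """Sample x_t from a Brownian bridge pinned at x0 (t=0) and x1 (t=1)."""
    std = sigma * math.sqrt(t * (1.0 - t))  # noise vanishes at both endpoints
    return [(1.0 - t) * a + t * b + std * rng.gauss(0.0, 1.0)
            for a, b in zip(x0, x1)]
```

At `t=0` the sample equals `x0` and at `t=1` it equals `x1`; in between, the mean interpolates linearly and the variance peaks at `t=0.5`.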
Pub Date: 2025-10-07 DOI: 10.1109/TBDATA.2025.3618472
Li He;Hong Zhang
Nyström approximation is one of the most popular methods for accelerating kernel analysis on large-scale data sets. Nyström employs a single landmark set to obtain eigenvectors (low-rank decomposition) and projects the entire data set onto the eigenvectors (embedding). Most existing methods focus on accelerating landmark selection. For extremely large-scale data sets, however, the embedding time cost, rather than that of low-rank decomposition, is critical. In addition, both accuracy and embedding time cost are dominated by the landmark set size. As a result, using more landmarks is the only way to improve accuracy, and it comes at an extremely high embedding cost. In this paper, we propose, for the first time, a method that decouples the embedding cost from that of low-rank decomposition. We first obtain the eigenvectors from a large landmark set to achieve a low error, and then optimize a small landmark set that minimizes the landmark-set-embedding error to ensure a low embedding cost. Consequently, our accuracy is close to that of the large landmark set, while the embedding time cost is determined by the small one. Our method can handle popular kernels and can be plugged into most existing methods. Experimental results demonstrate the superiority of the proposed method.
Title: Two-Step Nyström Sampling for Large-Scale Kernel Approximation
Journal: IEEE Transactions on Big Data, vol. 12, no. 1, pp. 249-260
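For reference, the standard single-landmark-set Nyström approximation that the abstract builds on can be written as follows. The notation is assumed here rather than taken from the paper: $K$ is the full $n \times n$ kernel matrix, $z_1, \dots, z_m$ are the landmarks, $K_{nm}$ holds kernel values between all points and the landmarks, $K_{mm}$ holds kernel values among the landmarks, and $\Lambda$ keeps only the nonzero eigenvalues.

```latex
\[
\begin{aligned}
K &\approx \tilde{K} = K_{nm}\, K_{mm}^{\dagger}\, K_{nm}^{\top}, \\
K_{mm} &= U \Lambda U^{\top}, \\
\phi(x) &= \Lambda^{-1/2} U^{\top} k_m(x),
\qquad k_m(x) = \big[\, k(x, z_1), \dots, k(x, z_m) \,\big]^{\top}.
\end{aligned}
\]
```

With this embedding, $\phi(x)^{\top} \phi(y) = k_m(x)^{\top} K_{mm}^{\dagger} k_m(y)$, which is the corresponding entry of $\tilde{K}$. Embedding each point requires $m$ kernel evaluations to form $k_m(x)$, which is why the embedding cost grows with the landmark set size.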
Pub Date: 2025-10-07 DOI: 10.1109/TBDATA.2025.3618447
Haimin Zhang;Jiaohao Xia;Min Xu
Combining the message-passing paradigm with the global attention mechanism has emerged as an effective framework for learning over graphs. The message-passing paradigm and the global attention mechanism generate node embeddings by summing information from a node’s local neighbourhood and from the entire graph, respectively. However, this simple summation approach fails to distinguish between the information from a node itself and the information from the node’s neighbours. Therefore, information is lost at each layer of embedding generation, and this loss can accumulate and become more serious in deeper model layers. In this paper, we present a differential encoding method to address the issue of information loss. Instead of simply taking the sum to aggregate local or global information, we explicitly encode the difference between the information from a node itself and that from the node’s local neighbours (or from the rest of the nodes in the graph). The obtained differential encoding is then combined with the original aggregated representation to generate the updated node embedding. Combining differential encodings improves the representational ability of the generated node embeddings and therefore the model performance. The differential encoding method is empirically evaluated on different graph tasks on seven benchmark datasets. The results show that it is a general method that improves the message-passing update and the global attention update, advancing the state-of-the-art performance for graph representation learning on these benchmark datasets.
Title: Differential Encoding for Improved Representation Learning Over Graphs
Journal: IEEE Transactions on Big Data, vol. 12, no. 1, pp. 276-287
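The information loss described in this abstract is easy to reproduce: a plain sum cannot tell a node with value 1 and neighbourhood sum 5 apart from a node with value 5 and neighbourhood sum 1. The sketch below adds a difference term to a sum aggregation. It is a simplified illustration: the scalar `alpha` stands in for whatever learned transformation the paper applies to the difference, and the function name is hypothetical.

```python
def differential_update(h, neighbours, alpha=0.5):
    """One message-passing step: sum aggregation plus a weighted self-vs-neighbour difference.

    h: list of node feature vectors, neighbours: list of neighbour index lists.
    """
    out = []
    for i, h_i in enumerate(h):
        nbr_sum = [sum(h[j][d] for j in neighbours[i]) for d in range(len(h_i))]
        aggregated = [a + b for a, b in zip(h_i, nbr_sum)]  # standard sum update
        diff = [a - b for a, b in zip(h_i, nbr_sum)]        # differential encoding
        out.append([a + alpha * d for a, d in zip(aggregated, diff)])
    return out
```

For the two cases above the plain sum is 6 in both, whereas the update with `alpha=0.5` yields 4.0 and 8.0, so the two nodes stay distinguishable.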
Federated Learning (FL), a groundbreaking approach for collaborative model training across decentralized devices, maintains data privacy while constructing a decent global machine learning model. Conventional FL methods typically demand more communication rounds to achieve convergence in non-Independent and non-Identically Distributed (non-IID) data scenarios due to their reliance on fixed Stochastic Gradient Descent (SGD) updates at each Communication Round (CR). In this paper, we introduce a novel strategy to expedite the convergence of FL models, inspired by the insights from McMahan et al.’s seminal work. We focus on FL convergence via traditional SGD decay by introducing a dynamic adjusting mechanism for local epochs and local batch size. Our method adapts the decay of SGD updates during the training process, akin to decaying learning rates in classical optimization. Particularly, by adaptively reducing local epochs and increasing local batch size using their ongoing values and the CR as the model progresses, our method enhances convergence speed without compromising accuracy, specifically by effectively addressing challenges posed by non-IID data. We provide theoretical results of the benefits of the dynamic decay of SGD updates in FL scenarios. We demonstrate our method’s consistent outperformance regarding the global model’s communication speedup and convergence behavior through comprehensive experiments.
{"title":"Fast Convergent Federated Learning via Decaying SGD Updates","authors":"Md Palash Uddin;Yong Xiang;Mahmudul Hasan;Yao Zhao;Youyang Qu;Longxiang Gao","doi":"10.1109/TBDATA.2025.3618454","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3618454","url":null,"PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"186-199"},"PeriodicalIF":5.7,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
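The abstract of "Fast Convergent Federated Learning via Decaying SGD Updates" says that local epochs are reduced and local batch size is increased from their ongoing values and the communication round, but it does not state the update rule. The following is a minimal sketch under stated assumptions, not the authors' formula: the function names, the `decay` rate, and the bounds `min_epochs` and `max_batch` are all hypothetical and chosen only to show how such a schedule shrinks the number of local SGD updates per round.

```python
import math


def decayed_schedule(epochs, batch_size, comm_round, *,
                     decay=0.1, min_epochs=1, max_batch=256):
    """Hypothetical decay rule (an assumption, not taken from the paper).

    Uses the current local epochs, the current local batch size, and the
    communication round index to produce the values for the next round.
    """
    factor = 1.0 + decay * comm_round
    next_epochs = max(min_epochs, math.floor(epochs / factor))
    next_batch = min(max_batch, math.ceil(batch_size * factor))
    return next_epochs, next_batch


def local_sgd_steps(n_samples, epochs, batch_size):
    """Number of SGD updates one client performs in a single round."""
    return epochs * math.ceil(n_samples / batch_size)


if __name__ == "__main__":
    epochs, batch = 8, 16  # illustrative starting values
    for cr in range(1, 6):
        epochs, batch = decayed_schedule(epochs, batch, cr)
        steps = local_sgd_steps(1000, epochs, batch)
        print(f"round {cr}: epochs={epochs} batch={batch} local_steps={steps}")
```

Because the epochs never increase and the batch size never decreases, the per-round count of local SGD updates is non-increasing under this rule, which mirrors the learning-rate-decay analogy drawn in the abstract.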
Pub Date : 2025-10-06 DOI: 10.1109/TBDATA.2025.3618482
Chun-Wei Shen;Jia-Wei Jiang;Hsun-Ping Hsieh
The rapid advancement of the spatio-temporal domain has led to a surge of novel models. These models can typically be decomposed into different modules, such as various types of graph neural networks and temporal networks. Notably, many of these models share identical or similar modules. However, the existing literature often relies on fragmented, self-constructed experimental frameworks. This fragmentation hinders a comprehensive understanding of model interrelationships and makes fair comparisons difficult because training and evaluation processes are inconsistent. To address these issues, we introduce Spatio-Temporal Gym (STGym), a modular benchmark that provides a platform for exploring various spatio-temporal models and supports research for developers. The modular design of STGym facilitates in-depth analysis of model components and promotes the seamless adoption and extension of existing methods. By standardizing the training and evaluation processes, STGym ensures reproducibility and scalability, enabling fair comparisons across different models. In this paper, we use traffic forecasting, a popular research topic in the spatio-temporal domain, as a case study to demonstrate the capabilities of STGym. Our detailed survey systematically uses the modular framework of STGym to organize key modules into various models, thereby facilitating deeper insights into their structures and mechanisms. We also evaluate 18 models on six widely used traffic forecasting datasets and analyze critical hyperparameters to reveal their impact on performance. This study provides valuable resources and insights for developers and researchers.
{"title":"STGym: A Modular Benchmark for Spatio-Temporal Networks With a Survey and Case Study on Traffic Forecasting","authors":"Chun-Wei Shen;Jia-Wei Jiang;Hsun-Ping Hsieh","doi":"10.1109/TBDATA.2025.3618482","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3618482","url":null,"PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"15-33"},"PeriodicalIF":5.7,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
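The STGym abstract describes models as compositions of shared spatial and temporal modules, evaluated under one standardized procedure. The sketch below illustrates that idea in plain Python. It is not the STGym API: the registries `SPATIAL` and `TEMPORAL`, the toy modules, `build_model`, and the use of mean absolute error are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict, List

Frame = List[float]   # one value per node at a single time step
Window = List[Frame]  # history of frames, oldest first

# Hypothetical registries: string key -> interchangeable module.
SPATIAL: Dict[str, Callable[[Frame, List[List[int]]], Frame]] = {}
TEMPORAL: Dict[str, Callable[[Window], Frame]] = {}


def register(registry: dict, name: str):
    """Decorator that stores a module in a registry under a string key."""
    def wrap(fn):
        registry[name] = fn
        return fn
    return wrap


@register(SPATIAL, "identity")
def identity(frame, neighbors):
    return list(frame)


@register(SPATIAL, "neighbor_mean")
def neighbor_mean(frame, neighbors):
    # Average each node's value with the values of its listed neighbors.
    out = []
    for i, nbrs in enumerate(neighbors):
        vals = [frame[i]] + [frame[j] for j in nbrs]
        out.append(sum(vals) / len(vals))
    return out


@register(TEMPORAL, "last_value")
def last_value(window):
    return list(window[-1])


@register(TEMPORAL, "window_mean")
def window_mean(window):
    n = len(window)
    return [sum(col) / n for col in zip(*window)]


def build_model(config: Dict[str, str], neighbors: List[List[int]]):
    """Compose one spatial and one temporal module named in the config."""
    spatial = SPATIAL[config["spatial"]]
    temporal = TEMPORAL[config["temporal"]]

    def predict(window: Window) -> Frame:
        return temporal([spatial(frame, neighbors) for frame in window])

    return predict


def mae(pred: Frame, target: Frame) -> float:
    """Shared metric so every configuration is scored the same way."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


if __name__ == "__main__":
    neighbors = [[1], [0, 2], [1]]  # toy 3-node line graph
    window = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
    target = [3.0, 4.0, 5.0]
    for cfg in ({"spatial": "identity", "temporal": "last_value"},
                {"spatial": "neighbor_mean", "temporal": "window_mean"}):
        model = build_model(cfg, neighbors)
        print(cfg, mae(model(window), target))
```

Keeping module choice in a config and scoring every configuration with the same metric function is one simple way to get the like-for-like comparisons that the abstract attributes to a standardized benchmark.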