Pub Date: 2026-01-16, DOI: 10.1109/TBDATA.2026.3652336
{"title":"2025 Reviewers List*","authors":"","doi":"10.1109/TBDATA.2026.3652336","DOIUrl":"https://doi.org/10.1109/TBDATA.2026.3652336","url":null,"abstract":"","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"301-306"},"PeriodicalIF":5.7,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11357242","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-13, DOI: 10.1109/TBDATA.2025.3621144
Yali Feng;Zhifeng Hao;Wen Wen;Ruichu Cai
Temporal recommendation is an important class of tasks in recommender systems that focuses on modeling and capturing temporal patterns in user behavior to achieve finer-grained and higher-quality recommendations. In real-world scenarios, users’ temporal behaviors are characterized not only by sequential dependencies among consecutive items, but also by periodic correlations between items and time-varying similarity between users. In this paper, we propose an Adaptive Temporal Recommendation (AdaTR) algorithm to capture the inherent features of temporal behaviors and dynamic collaborative signals. First, based on the periodic characteristics of user behaviors, user-item interactions are counted and aggregated in different time segments across multiple periods, forming the temporal user-item interaction matrix. Then, to capture the time-varying collaborative signals between users, a deep spectral clustering (DSC) method is applied to the temporal user-item interaction matrix: the original representation of user-item interactions is projected into a latent space, and users’ temporal behaviors are clustered into different groups. Furthermore, an Adaptive Deep Matrix Factorization (AdaDMF) module is designed to learn time-varying representations of user preferences on each cluster of temporal user behaviors, which incorporates dynamic collaborative signals among users. Finally, we combine users’ short-term and long-term preferences to generate personalized temporal recommendations. Extensive experiments on four datasets demonstrate that AdaTR performs significantly better than state-of-the-art baselines.
{"title":"Temporal Recommendation Based on Adaptive Deep Matrix Factorization","authors":"Yali Feng;Zhifeng Hao;Wen Wen;Ruichu Cai","doi":"10.1109/TBDATA.2025.3621144","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3621144","url":null,"abstract":"Temporal recommendation is an important class of tasks in recommender systems, which focuses on modeling and capturing temporal patterns in user behavior to achieve finer-grained and higher-quality recommendations. In real-world scenario, users’ temporal behaviors are not only characterized by sequential dependencies among consecutive items, but also by periodic correlations of different items and time-varying similarity of different users. In this paper, we propose an Adaptive Temporal Recommendation (AdaTR) algorithm to capture the inherent features of temporal behaviors and dynamic collaborative signals. Firstly, based on the periodic characteristics of user behaviors, the user-item interactions are counted and aggregated in different time segments across multiple periods, which forms the temporal user-item interaction matrix. Then, in order to capture the time-varying collaborative signals between different users, a deep spectral clustering (DSC) method is implemented on the temporal user-item interaction matrix, where the original representation of user-item interaction is projected into a latent space, and users’ temporal behaviors are clustered into different groups. Furthermore, an Adaptive Deep Matrix Factorization (AdaDMF) module is designed to learn the time-varying representations of user preferences on each cluster of temporal user behaviors, which incoporate dynamic collaborative signals among different users. Finally, we combine users’ short-term and long-term preferences to generate personalized temporal recommendations. Extensive experiments on four datasets demonstrate that AdaTR performs significantly better than the state-of-the-art baselines.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"288-300"},"PeriodicalIF":5.7,"publicationDate":"2025-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-07, DOI: 10.1109/TBDATA.2025.3618453
Jiankai Zuo;Zihao Yao;Yaying Zhang
Next POI recommendation plays a crucial role in delivering personalized location-based services, but it faces significant challenges in capturing complex user behavior and adapting to dynamic interest distributions. Most existing methods provide insufficient modeling of implicit features in user trajectories, such as directional transitions and latent edge relationships, which are essential for understanding user behavior. Moreover, existing diffusion models, constrained by Gaussian priors, struggle to handle the diverse and evolving nature of user preferences. The lack of unified scheduling for noise and sampling also limits the flexibility of diffusion models. In this paper, we propose a Unified Bridge-based Diffusion model (UB-Diff) for next POI recommendation. UB-Diff incorporates direction-aware POI transition graph learning, which jointly captures spatio-temporal and directional features. To overcome the limitations of Gaussian priors, we introduce a bridge-based diffusion POI generative model. It achieves distribution translation from the user’s historical distribution to the target distribution by learning a bridge that associates user behavior with POI recommendation, adapting to dynamic user interests. Finally, we design a novel intermediate function to unify the diffusion process, enabling precise control over noise scheduling and modular optimization. Extensive experiments on five real-world datasets demonstrate the superiority of UB-Diff over advanced baseline methods.
{"title":"Bridging User Dynamic Preferences: A Unified Bridge-Based Diffusion Model for Next POI Recommendation","authors":"Jiankai Zuo;Zihao Yao;Yaying Zhang","doi":"10.1109/TBDATA.2025.3618453","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3618453","url":null,"abstract":"Next POI recommendation plays a crucial role in delivering personalized location-based services, but it faces significant challenges in capturing complex user behavior and adapting to dynamic interest distributions. Most methods often provide insufficient modeling of implicit features in user trajectories, such as directional transitions and latent edge relationships, which are essential for understanding user behavior. Moreover, existing diffusion models, constrained by Gaussian priors, struggle to handle the diverse and evolving nature of user preferences. The lack of a unified scheduling for noise and sampling also limits the flexibility of diffusion models. In this paper, we propose a Unified Bridge-based Diffusion model (UB-Diff) for the next POI recommendation. UB-Diff incorporates a direction-aware POI transition graph learning, which jointly captures spatio-temporal and directional features. To overcome the limitations of Gaussian priors, we introduce a bridge-based diffusion POI generative model. It can achieve distribution translation from the user’s historical distribution to the target distribution by learning a bridge to associate user behavior with POI recommendation, adapting to dynamic user interests. In the end, we design a novel intermediate function to unify the diffusion process, enabling precise control over noise scheduling and modular optimization. Extensive experiments on five real-world datasets demonstrate the superiority of UB-Diff over advanced baseline methods.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"261-275"},"PeriodicalIF":5.7,"publicationDate":"2025-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-07, DOI: 10.1109/TBDATA.2025.3618472
Li He;Hong Zhang
Nyström approximation is one of the most popular approximation methods for accelerating kernel analysis on large-scale data sets. Nyström employs a single landmark set to obtain eigenvectors (low-rank decomposition) and projects the entire data set onto the eigenvectors (embedding). Most existing methods focus on accelerating landmark selection. For extremely large-scale data sets, however, the embedding time cost, rather than that of low-rank decomposition, is critical. In addition, both accuracy and embedding time cost are dominated by the landmark set size. As a result, using more landmarks is the only way to improve accuracy, at the cost of extremely high embedding time. In this paper, we propose, for the first time, a method that decouples the embedding cost from that of low-rank decomposition. We first obtain the eigenvectors from a large landmark set for low error, and then optimize a small landmark set that minimizes the landmark-set-embedding error to ensure a low embedding cost. In return, our accuracy is close to that of the large landmark set, while the small set determines the embedding time cost. Our method works with popular kernels and can be plugged into most existing methods. Experimental results demonstrate the superiority of the proposed method.
{"title":"Two-Step Nyström Sampling for Large-Scale Kernel Approximation","authors":"Li He;Hong Zhang","doi":"10.1109/TBDATA.2025.3618472","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3618472","url":null,"abstract":"Nyström approximation is one of the most popular approximation methods to accelerate kernel analysis on large-scale data sets. Nyström employs one single landmark set to obtain eigenvectors (low-rank decomposition) and projects the entire data set to the eigenvectors (embedding). Most existing methods focus on accelerating landmark selection. For extremely large-scale data sets, however, the embedding time cost, rather than that of low-rank decomposition, is critical. In addition, both accuracy and embedding time cost are dominated by the landmark set size. As a result, using more landmarks is the <italic>only</i> way to improve accuracy at the cost of extremely high embedding costs. In this paper, we propose a method for the first time to decouple embedding cost from that of low-rank decomposition. We first obtain the eigenvectors from a large landmark set for a low error, and then optimize a small landmark set that minimizes the landmark-set-embedding error to ensure a low embedding cost. In return, our accuracy is close to that of the large landmark set but the small one dominates the embedding time cost. Our method can deal with popular kernels and be plugged into most existing methods. Experimental results demonstrate the superiority of the proposed method.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"249-260"},"PeriodicalIF":5.7,"publicationDate":"2025-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-07, DOI: 10.1109/TBDATA.2025.3618447
Haimin Zhang;Jiaohao Xia;Min Xu
Combining the message-passing paradigm with the global attention mechanism has emerged as an effective framework for learning over graphs. The message-passing paradigm and the global attention mechanism generate node embeddings by summing information from a node’s local neighbourhood and from the entire graph, respectively. However, this simple summation aggregation fails to distinguish between the information from a node itself and the information from the node’s neighbours. Therefore, information is lost at each layer of embedding generation, and this loss can accumulate and become more severe in deeper model layers. In this paper, we present a differential encoding method to address this information loss. Instead of simply taking the sum to aggregate local or global information, we explicitly encode the difference between the information from a node itself and that from the node’s local neighbours (or from the rest of the graph’s nodes). The obtained differential encoding is then combined with the original aggregated representation to generate the updated node embedding. Combining differential encodings improves the representational ability of the generated node embeddings and therefore the model performance. The differential encoding method is empirically evaluated on different graph tasks on seven benchmark datasets. The results show that it is a general method that improves both the message-passing update and the global attention update, advancing the state-of-the-art performance for graph representation learning on these benchmark datasets.
{"title":"Differential Encoding for Improved Representation Learning Over Graphs","authors":"Haimin Zhang;Jiaohao Xia;Min Xu","doi":"10.1109/TBDATA.2025.3618447","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3618447","url":null,"abstract":"Combining the message-passing paradigm with the global attention mechanism has emerged as an effective framework for learning over graphs. The message-passing paradigm and the global attention mechanism basically generate embeddings of nodes by taking the sum of information from a node’s local neighbourhood and from the entire graph, respectively. However, this simple summation aggregation approach fails to distinguish between the information from a node itself or from the node’s neighbours. Therefore, there exists information lost at each layer of embedding generation, and this information lost could be accumulated and become more serious in deeper model layers. In this paper, we present a differential encoding method to address the issue of information lost. Instead of simply taking the sum to aggregate local or global information, we explicitly encode the difference between the information from a node itself and that from the node’s local neighbours (or from the rest of the entire graph nodes). The obtained differential encoding is then combined with the original aggregated representation to generate the updated node embedding. By combining differential encodings, the representational ability of generated node embeddings is improved, and therefore the model performance is improved. The differential encoding method is empirically evaluated on different graph tasks on seven benchmark datasets. The results show that it is a general method that improves the message-passing update and the global attention update, advancing the state-of-the-art performance for graph representation learning on these benchmark datasets.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"276-287"},"PeriodicalIF":5.7,"publicationDate":"2025-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-06, DOI: 10.1109/TBDATA.2025.3618454
Md Palash Uddin;Yong Xiang;Mahmudul Hasan;Yao Zhao;Youyang Qu;Longxiang Gao
Federated Learning (FL), a groundbreaking approach for collaborative model training across decentralized devices, maintains data privacy while constructing a high-quality global machine learning model. Conventional FL methods typically demand more communication rounds to achieve convergence in non-Independent and non-Identically Distributed (non-IID) data scenarios due to their reliance on fixed Stochastic Gradient Descent (SGD) updates at each Communication Round (CR). In this paper, we introduce a novel strategy to expedite the convergence of FL models, inspired by the insights from McMahan et al.’s seminal work. We focus on FL convergence via traditional SGD decay by introducing a dynamic adjustment mechanism for local epochs and local batch size. Our method adapts the decay of SGD updates during training, akin to decaying learning rates in classical optimization. In particular, by adaptively reducing the local epochs and increasing the local batch size based on their current values and the CR as training progresses, our method improves convergence speed without compromising accuracy and effectively addresses the challenges posed by non-IID data. We provide theoretical results on the benefits of the dynamic decay of SGD updates in FL scenarios. Comprehensive experiments demonstrate that our method consistently outperforms conventional FL methods in terms of the global model’s communication speedup and convergence behavior.
{"title":"Fast Convergent Federated Learning via Decaying SGD Updates","authors":"Md Palash Uddin;Yong Xiang;Mahmudul Hasan;Yao Zhao;Youyang Qu;Longxiang Gao","doi":"10.1109/TBDATA.2025.3618454","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3618454","url":null,"abstract":"Federated Learning (FL), a groundbreaking approach for collaborative model training across decentralized devices, maintains data privacy while constructing a decent global machine learning model. Conventional FL methods typically demand more communication rounds to achieve convergence in non-Independent and non-Identically Distributed (non-IID) data scenarios due to their reliance on fixed Stochastic Gradient Descent (SGD) updates at each Communication Round (CR). In this paper, we introduce a novel strategy to expedite the convergence of FL models, inspired by the insights from McMahan et al.’s seminal work. We focus on FL convergence via traditional SGD decay by introducing a dynamic adjusting mechanism for local epochs and local batch size. Our method adapts the decay of SGD updates during the training process, akin to decaying learning rates in classical optimization. Particularly, by adaptively reducing local epochs and increasing local batch size using their ongoing values and the CR as the model progresses, our method enhances convergence speed without compromising accuracy, specifically by effectively addressing challenges posed by non-IID data. We provide theoretical results of the benefits of the dynamic decay of SGD updates in FL scenarios. We demonstrate our method’s consistent outperformance regarding the global model’s communication speedup and convergence behavior through comprehensive experiments.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"186-199"},"PeriodicalIF":5.7,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-06, DOI: 10.1109/TBDATA.2025.3618482
Chun-Wei Shen;Jia-Wei Jiang;Hsun-Ping Hsieh
The rapid advancement of the spatio-temporal domain has led to a surge of novel models. These models can typically be decomposed into different modules, such as various types of graph neural networks and temporal networks. Notably, many of these models share identical or similar modules. However, the existing literature often relies on fragmented and self-constructed experimental frameworks. This fragmentation hinders a comprehensive understanding of model interrelationships and makes fair comparisons difficult due to inconsistent training and evaluation processes. To address these issues, we introduce Spatio-Temporal Gym (STGym), an innovative modular benchmark that provides a platform for exploring various spatio-temporal models and supports research for developers. The modular design of STGym facilitates an in-depth analysis of model components and promotes the seamless adoption and extension of existing methods. By standardizing the training and evaluation processes, STGym ensures reproducibility and scalability, enabling fair comparisons across different models. In this paper, we use traffic forecasting, a popular research topic in the spatio-temporal domain, as a case study to demonstrate the capabilities of STGym. Our detailed survey systematically utilizes the modular framework of STGym to organize key modules into various models, thereby facilitating deeper insights into their structures and mechanisms. We also evaluate 18 models on six widely used traffic forecasting datasets and analyze critical hyperparameters to reveal their impact on performance. This study provides valuable resources and insights for developers and researchers.
{"title":"STGym: A Modular Benchmark for Spatio-Temporal Networks With a Survey and Case Study on Traffic Forecasting","authors":"Chun-Wei Shen;Jia-Wei Jiang;Hsun-Ping Hsieh","doi":"10.1109/TBDATA.2025.3618482","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3618482","url":null,"abstract":"The rapid advancement of spatio-temporal domain has led to a surge of novel models. These models can typically be decomposed into different modules, such as various types of graph neural networks and temporal networks. Notably, many of these models share identical or similar modules. However, the existing literature often relies on fragmented and self-constructed experimental frameworks. This fragmentation hinders a comprehensive understanding of model interrelationships and makes fair comparisons difficult due to inconsistent training and evaluation processes. To address these issues, we introduce Spatio-Temporal Gym (STGym), an innovative modular benchmark that provides a platform for exploring various spatio-temporal models and supports research for developers. The modular design of STGym facilitates an in-depth analysis of model components and promotes the seamless adoption and extension of existing methods. By standardizing the training and evaluation processes, STGym ensures reproducibility and scalability, enabling fair comparisons across different models. In this paper, we use traffic forecasting, a popular research topic in the spatio-temporal domain, as a case to demonstrate the capabilities of the STGym. Our detailed survey systematically utilizes the modular framework of STGym to organize key modules into various models, thereby facilitating deeper insights into their structures and mechanisms. We also evaluate 18 models on six widely used traffic forecasting datasets and analyze critical hyperparameters to reveal their impact on performance. This study provides valuable resources and insights for developers and researchers.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"15-33"},"PeriodicalIF":5.7,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-06, DOI: 10.1109/TBDATA.2025.3618463
Zhida Qin;Wenhao Xue;Haotian He;Haoyao Zhang;Shixiao Yang;Enjun Du;Tianyu Huang;John C.S. Lui
Multi-behavior Session Based Recommendations (MBSBRs) have achieved remarkable results by accounting for behavioral heterogeneity within sessions. Yet most existing works consider only binary or continuous behavior dependencies and aim to predict the next item under the target behavior, neglecting users’ inherent behavior habits and thus learning inaccurate intentions. To tackle these issues, we propose a novel Behavior Habits Enhanced Intention Learning framework for Session Based Recommendation (BHSBR). Specifically, we focus on next item recommendation and design a global item transition graph to learn behavior-aware semantic relationships between items, in order to mine the underlying similarity between items beyond the session. In addition, we construct a hypergraph to extract users’ diverse behavior habits and break through the limitations of temporal relationships within the session. Compared to existing works, our behavior habit learning method learns behavior dependencies at the user level, which captures the user’s long-term intentions more accurately and reduces the impact of noisy behaviors. Extensive experiments on three datasets demonstrate that BHSBR outperforms state-of-the-art methods. Further ablation experiments fully illustrate the effectiveness of its modules.
{"title":"Behavior Habits Enhanced Intention Learning for Session Based Recommendation","authors":"Zhida Qin;Wenhao Xue;Haotian He;Haoyao Zhang;Shixiao Yang;Enjun Du;Tianyu Huang;John C.S. Lui","doi":"10.1109/TBDATA.2025.3618463","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3618463","url":null,"abstract":"Multi-behavior Session Based Recommendations (MBSBRs) have achieved remarkable results due to considering behavioral heterogeneity in sessions. Yet most existing works only consider binary or continuous behavior dependencies and aim to predict the next item under the target behavior, neglecting users’ inherent behavior habits, resulting in learning inaccurate intentions. To tackle the above issues, we propose a novel <underline>B</u>ehavior <underline>H</u>abits Enhanced Intention Learning framework for <underline>S</u>ession <underline>B</u>ased <underline>R</u>ecommendation (<bold>BHSBR</b>). Specifically, we focus on the next item recommendation and design a global item transition graph to learn the behavior-aware semantic relationships between items, in order to mine the underlying similarity between items beyond the session. In addition, we construct a hypergraph to extract the diverse behavior habits of users and break through the limitations of temporal relationships in the session. Compared to the existing works, our behavior habit learning method learns behavior dependencies at the user level, which could capture the user’s more accurate long-term intentions and reduce the impact of noise behaviors. Extensive experiments on three datasets demonstrate that the performance of our proposed <bold>BHSBR</b> is superior to SOTA. Further ablation experiments fully illustrate the effectiveness of our various modules.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"236-248"},"PeriodicalIF":5.7,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-06, DOI: 10.1109/TBDATA.2025.3618443
Wen Bai;Yufeng Wang;Yuncheng Jiang;Di Wu
To underscore the significance of the interaction frequency among vertices in each snapshot, prior research has extended the $k$-core of general graphs to the $(k,h)$-core of temporal graphs, in which each vertex has at least $k$ neighbors and is connected by at least $h$ edges to each of these neighbors. Due to the numerous combinations of $k$ and $h$, the quantity of $(k,h)$-cores is substantial, which necessitates considerable time and space for querying and decomposition. As a temporal graph evolves, for instance, with edges being inserted into or removed from the previous snapshot, the affected $(k,h)$-cores must also be updated to reflect the latest structure. To address these challenges, we first develop a novel $(k,h)$-core storage index that exhibits excellent query performance while consuming space linear in the graph size. We then design an efficient decomposition algorithm to extract $(k,h)$-cores from a snapshot. Following this, we offer two maintenance algorithms to handle temporal graph evolution. Finally, we validate the effectiveness of the proposed methods on real temporal graphs. Experimental results indicate that our methods surpass existing techniques by two orders of magnitude.
{"title":"Parallel Core Decomposition of Temporal Graphs","authors":"Wen Bai;Yufeng Wang;Yuncheng Jiang;Di Wu","doi":"10.1109/TBDATA.2025.3618443","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3618443","url":null,"abstract":"To underscore the significance of the interactive frequency among diverse vertices in each snapshot, prior research has extended the <inline-formula><tex-math>$k$</tex-math></inline-formula>-core of general graphs to the <inline-formula><tex-math>$(k,h)$</tex-math></inline-formula>-core of temporal graphs, in which each vertex has at least <inline-formula><tex-math>$k$</tex-math></inline-formula> neighbors and is connected by at least <inline-formula><tex-math>$h$</tex-math></inline-formula> edges to each of these neighbors. Due to the numerous combinations of <inline-formula><tex-math>$k$</tex-math></inline-formula> and <inline-formula><tex-math>$h$</tex-math></inline-formula>, the quantity of <inline-formula><tex-math>$(k,h)$</tex-math></inline-formula>-cores is substantial, which necessitates considerable time and space for querying and decomposition. As a temporal graph evolves, for instance, with edges being inserted or removed from the previous snapshot, the affected <inline-formula><tex-math>$(k,h)$</tex-math></inline-formula>-cores must also be updated to reflect the latest structure. To address these challenges, we initially develop a novel <inline-formula><tex-math>$(k,h)$</tex-math></inline-formula>-core storage index that exhibits excellent query performance while consuming linear space regarding the graph size. Subsequently, we design an efficient decomposition algorithm to extract <inline-formula><tex-math>$(k,h)$</tex-math></inline-formula>-cores from a snapshot. Following this, we offer two maintenance algorithms to manage temporal graph evolution. Finally, we validate the effectiveness of our proposed methods on actual temporal graphs. Experimental results indicate that our methods surpass existing techniques by two orders of magnitude.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"212-223"},"PeriodicalIF":5.7,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-06, DOI: 10.1109/TBDATA.2025.3618449
Meng Jian;Ruoxi Li;Yulong Bai;Ge Shi
In the digital age, the overwhelming amount of information necessitates advanced recommendation systems to deliver personalized content. However, these systems face significant challenges, such as sparse user-item interactions and long-tail bias. Recent studies apply structural learning or self-supervised learning to the interaction graph, which helps alleviate these problems, but the interaction data itself may be too sparse to solve them. While knowledge graphs (KGs) offer a promising solution by providing semantic depth to recommendations, their integration often introduces noise from redundant knowledge. To address these critical gaps, this study proposes knowledge-guided interest contrast (KGIC) to enhance recommendations, harmonizing collaborative filtering with semantic insights from the KG. The KGIC model introduces three key innovations: (1) a knowledge filtering mechanism that selectively leverages interest-relevant signals from the knowledge graph to encode interests and avoid interference from redundant knowledge; (2) an adaptive graph augmentation strategy that enhances the interaction graph based on semantic-aware interest propagation and interaction intensity estimation; and (3) a self-supervised contrastive learning task that mitigates long-tail bias and sparsity by homogenizing the embedding distribution between augmented views. Extensive evaluation reveals the superiority of KGIC with knowledge filtering and graph augmentation for recommendation.
{"title":"Enhancing Recommendations With Knowledge-Guided Interest Contrast","authors":"Meng Jian;Ruoxi Li;Yulong Bai;Ge Shi","doi":"10.1109/TBDATA.2025.3618449","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3618449","url":null,"abstract":"In the digital age, the overwhelming amount of information necessitates advanced recommendation systems to deliver personalized content. However, these systems face significant challenges, such as sparse user-item interactions and long-tail bias. Recent studies construct structural learning or self-supervised learning on the interaction graph achieving a positive impact on alleviating the problems, but the interaction data itself may be far too little to solve the problems. While knowledge graphs (KGs) offer a promising solution by providing semantic depth to recommendations, their integration often introduces noise from redundant knowledge. Addressing these critical gaps, this study proposes a knowledge-guided interest contrast (KGIC) to enhance recommendations, which innovatively harmonizes collaborative filtering with semantic insights from KG. The KGIC model introduces three key innovations: (1) a knowledge filtering mechanism that selectively leverages interest-relevant signals from the knowledge graph to encode interest and avoid redundant knowledge interference; (2) an adaptive graph augmentation strategy that enhances the interaction graph based on semantic-aware interest propagation and interaction intensity estimation; and (3) a self-supervised contrastive learning task that mitigates long-tail bias and sparsity issues by homogenizing the embedding distribution between augmented views. The extensive evaluation reveals the superiority of KGIC with knowledge filtering and graph augmentation for recommendation.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"200-211"},"PeriodicalIF":5.7,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}