
IEEE Transactions on Big Data: Latest Articles

Enhancing Weak Supervision for Concept Prerequisite Relation Learning
IF 5.7 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-03-17 DOI: 10.1109/TBDATA.2025.3552330
Miao Zhang;Jiawei Wang;Kui Xiao;Zhifang Huang;Zhifei Li;Yan Zhang
Concept prerequisite relation learning identifies dependency relations between knowledge concepts, which helps learners choose effective learning paths. Currently, most mainstream methods use deep learning algorithms to capture the prerequisite relations between concepts through supervised or semi-supervised learning. However, these methods depend heavily on labeled data, which is scarce and costly to annotate in practice. To address this problem, we propose a framework called Weakly Supervised Enhanced Concept Prerequisite Relation Learning (WSECPRL). Specifically, we first generate an enhanced concept pseudo-relation graph without labeled data, using a pre-trained language model and a large knowledge base as auxiliary information. Second, we propose an improved variational graph auto-encoder model to determine the concept prerequisite relations. We incorporate a multi-head attention mechanism to enhance the representation learning capability of weakly supervised learning. The model reconstructs a directed graph into multiple undirected graphs by splitting the adjacency matrix, and determines the direction of each concept prerequisite relation from the strength of the dependency relation between the concepts. Finally, experimental results on several publicly available datasets demonstrate the effectiveness of the proposed framework, with WSECPRL outperforming existing baseline models in terms of F1 score and AUC.
IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2643-2656 (Journal Article)
Citations: 0
Efficient Antagonistic $k$-Plex Enumeration in Signed Graphs
IF 5.7 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-03-17 DOI: 10.1109/TBDATA.2025.3552335
Lantian Xu;Rong-Hua Li;Dong Wen;Qiangqiang Dai;Guoren Wang
A signed graph is a graph in which each edge carries a sign, positive or negative. The signed graph model has been used in many real applications, such as protein complex discovery and social network analysis. Finding cohesive subgraphs in signed graphs is a fundamental problem. A $k$-plex is a common model for cohesive subgraphs in which every vertex is adjacent to all but at most $k$ vertices within the subgraph. In this paper, we propose the model of a size-constrained antagonistic $k$-plex in a signed graph. The proposed model guarantees that the resulting subgraph is a $k$-plex and can be divided into two sub-$k$-plexes, both of which have positive inner edges and negative outer edges. This paper aims to identify all maximal antagonistic $k$-plexes in a signed graph. Through rigorous analysis, we show that the problem is NP-hard. We propose a novel framework for enumerating maximal antagonistic $k$-plexes based on set enumeration. Efficiency is improved through pivot pruning and early termination based on the color bound. Preprocessing techniques based on degree and dichromatic graphs effectively narrow the search space before enumeration. Extensive experiments on real-world datasets demonstrate our algorithm's efficiency, effectiveness, and scalability.
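The two conditions stated in this abstract (the $k$-plex degree condition and the sign pattern between the two sides) can be checked on a toy graph. The following is only a minimal sketch of one reading of the abstract, not the authors' enumeration algorithm: the function names, the edge-dictionary representation, and the example graph are hypothetical, and it assumes the usual convention that a vertex counts itself among the at most $k$ vertices it is not adjacent to.

```python
def build_adjacency(signed_edges):
    """Turn {(u, v): sign} into an undirected neighbour map (signs ignored)."""
    adj = {}
    for (u, v) in signed_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def is_k_plex(adj, members, k):
    """True if every member has at least len(members) - k neighbours inside members."""
    members = set(members)
    return all(len(adj.get(v, set()) & members) >= len(members) - k for v in members)


def has_antagonistic_signs(signed_edges, left, right):
    """True if edges inside a side are positive and edges across the sides are negative."""
    left, right = set(left), set(right)
    both = left | right
    for (u, v), sign in signed_edges.items():
        if u not in both or v not in both:
            continue  # edge leaves the candidate subgraph
        same_side = (u in left) == (v in left)
        if sign != (1 if same_side else -1):
            return False
    return True


# Hypothetical signed graph: sides {a, b} and {c, d}; the edge a-d is absent.
signed_edges = {
    ("a", "b"): +1,
    ("c", "d"): +1,
    ("a", "c"): -1,
    ("b", "c"): -1,
    ("b", "d"): -1,
}
adj = build_adjacency(signed_edges)
group = ["a", "b", "c", "d"]

print(is_k_plex(adj, group, 1))   # False: a and d each have only 2 of the 3 neighbours required
print(is_k_plex(adj, group, 2))   # True
print(has_antagonistic_signs(signed_edges, ["a", "b"], ["c", "d"]))  # True
```

The size constraint and the maximality requirement mentioned in the abstract are deliberately left out of this sketch.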
IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2587-2600 (Journal Article)
Citations: 0
Residual Learning for Self-Knowledge Distillation: Enhancing Neural Networks Through Consistency Across Layers
IF 5.7 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-03-17 DOI: 10.1109/TBDATA.2025.3552326
Hanpeng Liu;Shuoxi Zhang;Kun He
Knowledge distillation is a widely used technique for transferring knowledge from a large pretrained teacher network to a small student network. However, training complex teacher models requires significant computational resources and storage. To address this, a growing area of research, known as self-knowledge distillation (Self-KD), aims to enhance the performance of a neural network by leveraging its own latent knowledge. Despite their potential, existing Self-KD methods often struggle to effectively extract and utilize the model's dark knowledge. In this work, we identify a consistency problem between the feature layer and the output layer, and propose a novel Self-KD approach called Residual Learning for Self-Knowledge Distillation (RSKD). Our method addresses this issue by enabling the last feature layer of the student model to learn the residual gap between the outputs of the pseudo-teacher and the student. Additionally, we extend RSKD by allowing each intermediate feature layer of the student model to learn the residual gap between the corresponding deeper features of the pseudo-teacher and the student. Extensive experiments on various visual datasets demonstrate the effectiveness of the proposed method, which outperforms the state-of-the-art baselines.
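For readers unfamiliar with the teacher-to-student transfer that the abstract starts from, the sketch below shows the classic temperature-softened distillation loss for a single example. It is a generic illustration only and is not the RSKD residual objective, which the abstract does not specify; the function names, the temperature value, and the logits are assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * (math.log(t) - math.log(s)) for t, s in zip(p_teacher, p_student))
    return kl * temperature ** 2


# Hypothetical logits for one three-class example.
teacher = [1.0, 2.0, 3.0]
print(distillation_loss(teacher, teacher))          # 0.0: identical outputs
print(distillation_loss([3.0, 2.0, 1.0], teacher))  # positive: the student disagrees
```

In a self-distillation setting the teacher logits would come from the network itself (a pseudo-teacher) rather than from a separate pretrained model.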
IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2615-2627 (Journal Article)
Citations: 0
Optimal Subdata Selection for Prediction Based on the Distribution of the Covariates
IF 5.7 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-03-17 DOI: 10.1109/TBDATA.2025.3552343
Alvaro Cia-Mina;Jesus Lopez-Fidalgo;Weng Kee Wong
Huge data sets are now widely available, and there is growing interest in selecting an optimal subsample from the full data set to improve inference efficiency and reduce labeling costs. We propose a new criterion called J–optimality, which builds upon a popular optimal selection criterion that minimizes the Random–X prediction error by additionally incorporating the joint distribution of the covariates. A key advantage of our approach is that we can relate the subsampling selection problem to that of finding an optimal approximate design under a convex criterion, for which analytical tools for finding and studying such designs are already available. Consequently, the J–optimal subsampling method comes with theoretical results and theory-based algorithms for finding optimal subsamples. Simulation results and real data analysis show that our proposed methods outperform current subsampling methods, and that the proposed algorithms can also adapt efficiently to select an optimal subsample from streaming data.
IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2601-2614 (Journal Article). Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10930599
Citations: 0
NetPrompt: Neural Network Prompting Enhances Event Extraction in Large Language Models
IF 5.7 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-03-17 DOI: 10.1109/TBDATA.2025.3552333
Lin Mu;Yide Cheng;Jun Shen;Yiwen Zhang;Hong Zhong
Event extraction involves extracting event-related information, such as event types and event arguments, from context. It has long been tackled with well-designed neural networks or fine-tuned pre-trained language models. These approaches require substantial annotated data for tuning parameters and are resource-intensive. Recently, prompting strategies with frozen parameters, such as Chain-of-Thought and Self-Consistency, have delivered success in NLP using LLMs by generating intermediate thought steps. However, they suffer from error propagation and a lack of interaction between different thoughts. In this paper, we propose Neural Network-based Prompting (NetPrompt), a novel network-structured prompting strategy for event extraction. The core idea behind NetPrompt is to imitate the excellent information integration capabilities of neural network structures. Specifically, we first decompose the event extraction problem into diverse intermediate subtasks. Each subtask is represented as a node in one of the layers of the network, and the output of the nodes in the preceding layer is fed into the subsequent layer. Second, we propose pruning strategies to adapt the reasoning overhead to different problems. Finally, we conduct extensive experiments on two widely used event extraction benchmarks to evaluate NetPrompt. The results demonstrate that NetPrompt significantly improves event extraction performance compared to previous methods.
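The layered structure described in this abstract (subtasks as nodes, each layer consuming the outputs of the preceding layer) can be sketched as a simple loop. This is an illustrative reading of the abstract under stated assumptions, not the NetPrompt implementation: `run_prompt_network`, the prompt template, the subtask wording, and the stand-in `echo_llm` are all hypothetical, and the pruning strategies are omitted.

```python
def run_prompt_network(layers, text, llm):
    """Run subtasks layer by layer; each node sees every output of the previous layer.

    layers: list of layers, each a list of subtask instructions (strings).
    llm:    any callable mapping a prompt string to a response string (assumed).
    """
    previous_outputs = []
    for layer in layers:
        current_outputs = []
        for subtask in layer:
            context = "\n".join(previous_outputs)
            prompt = (
                f"Text: {text}\n"
                f"Earlier findings:\n{context}\n"
                f"Subtask: {subtask}"
            )
            current_outputs.append(llm(prompt))
        previous_outputs = current_outputs
    return previous_outputs


def echo_llm(prompt):
    """Stand-in for a real model call: repeats the subtask line of the prompt."""
    return "answer to " + prompt.splitlines()[-1]


# Hypothetical two-layer decomposition of an event extraction query.
layers = [
    ["List candidate trigger words.", "List named entities."],
    ["Decide the event type and its arguments."],
]
print(run_prompt_network(layers, "Troops attacked the town.", echo_llm))
```

Replacing `echo_llm` with a call to an actual model is the only change needed to try the structure on real inputs; how many layers and nodes to use is a design choice the abstract leaves to the paper.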
IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2628-2642 (Journal Article)
Citations: 0
Toward High-Quality Spatiotemporal Recommendation: Trajectory Recovery Based on Spatial and Temporal Dependencies
IF 7.5 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-03-14 DOI: 10.1109/TBDATA.2025.3570071
Yihao Zhao;Chenhao Wang;Hongyu Wang;Shunzhi Zhu;Lisi Chen
The rapid advancement of location and information technologies has generated a significant volume of human mobility data, which has been extensively utilized in spatiotemporal recommendation systems, including personalized point-of-interest recommendation, route recommendation, and location-aware event recommendation. Achieving high-quality recommendation results requires high-quality input trajectory data. However, trajectories obtained from GPS-enabled devices often contain missing and erroneous data that is unevenly distributed over time and highly sparse, which significantly hampers the effectiveness of spatiotemporal data analytics. Therefore, trajectory recovery plays an important role in spatiotemporal recommendation systems. The objective of trajectory recovery is to use historical trajectories to restore missing locations, providing high-quality data for spatiotemporal recommendation systems. The development of an effective trajectory recovery mechanism faces three major challenges: 1) complex and multi-granularity transition patterns among different locations; 2) difficulty in discovering spatio-temporal dependencies; and 3) data sparsity and noise. To address these challenges, we propose ARMove, an attentional model with spatio-temporal recurrent neural networks, to recover human mobility from long and sparse trajectories. In ARMove, we first design a spatio-temporal weighted recurrent neural network to capture users' long-term preferences. Next, we introduce a multi-granularity trajectory encoder to model complex transition patterns and the multi-level periodicity of human mobility. An attention-based history aggregation module is proposed to leverage historical mobility information. Extensive evaluation results reveal that our model outperforms the state-of-the-art models, demonstrating its ability to reconstruct high-quality and fine-grained human mobility trajectories.
IEEE Transactions on Big Data, vol. 11, no. 4, pp. 1628-1639 (Journal Article)
Citations: 0
Editorial Emerging Horizons: The Rise of Large Language Models and Cross-Modal Generative AI
IF 7.5 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-03-14 DOI: 10.1109/TBDATA.2025.3537217
Guang Yang;Jing Zhang;Giorgos Papanastasiou;Ge Wang;Dacheng Tao
IEEE Transactions on Big Data, vol. 11, no. 3, pp. 896-897 (Journal Article). Open access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11003991
Citations: 0
How to Decide Like Human? A Commonsense-Aware Hierarchical Framework for Knowledge Graph Reasoning
IF 5.7 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-02-20 DOI: 10.1109/TBDATA.2025.3544126
Yi Xia;Gang Zhou;Junyong Luo;Mingjing Lan;Ningbo Huang
Reasoning over knowledge graphs has attracted considerable attention from researchers and is widely applied in question answering systems, recommender systems, and other information retrieval systems. However, existing reasoning methods tend to suffer from poor interpretability and are not consistent with human commonsense. The trustworthiness and reliability of the knowledge discovery outcomes decrease as a result. Inspired by the process of human decision-making, we propose a commonsense-aware hierarchical framework called HDLH, which incorporates commonsense knowledge into a hierarchical knowledge graph reasoning process with deep reinforcement learning. HDLH implements the hierarchical reasoning process through sequential exploration and exploitation by applying multi-agent reinforcement learning. Multiple agents in HDLH simulate the multi-level decision-making ability of humans and reason hierarchically to maintain efficiency and interpretability. Moreover, commonsense knowledge is incorporated by means of a reward-shaping function, ultimately guiding the agents to reason more consistently with human perceptions and reducing the huge search space. We evaluated HDLH on various tasks using five real-world datasets. The experimental results reveal that HDLH achieves better performance than state-of-the-art baseline models.
{"title":"How to Decide Like Human? A Commonsense-Aware Hierarchical Framework for Knowledge Graph Reasoning","authors":"Yi Xia;Gang Zhou;Junyong Luo;Mingjing Lan;Ningbo Huang","doi":"10.1109/TBDATA.2025.3544126","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3544126","url":null,"abstract":"Reasoning over knowledge graphs has attracted considerable attention from researchers and is being widely applied to contribute question answering systems, recommender systems, and other information retrieval systems. However, existing reasoning methods tend to suffer from poor interpretability which is not consistent with human commonsense. The trustworthiness and reliability of the knowledge discover outcomes thus decreased as a result. Inspired by the process of human decision-making, we propose a commonsense-aware hierarchical framework called <italic>HDLH</i>, which incorporates commonsense knowledge into hierarchical knowledge graph reasoning process with deep reinforcement learning. <italic>HDLH</i> implements hierarchical reasoning process through exploration and exploitation sequentially by applying multi-agent reinforcement learning. Multiple agents in <italic>HDLH</i> simulate the multi-level decision-making ability of humans, and reason hierarchically and reasonably to maintain its efficiency and interpretability. Moreover, commonsense knowledge is incorporated by means of the reward-shaping function, ultimately guiding the agent to reason more consistently with human perceptions and reduce the huge search space. We evaluated <italic>HDLH</i> with various tasks on five real-world datasets. 
The experimental results reveal that <italic>HDLH</i> achieves better performance compared with state-of-the-art baseline models.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2545-2556"},"PeriodicalIF":5.7,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144934362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LightST: A Simplifying Spatio-Temporal Graph Neural Network for Traffic Flow Forecasting
IF 5.7 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-02-20 DOI: 10.1109/TBDATA.2025.3544131
Jie Hu;Taichuan Zheng;Lilan Peng;Fei Teng;Shengdong Du;Tianrui Li
The traffic flow forecasting task plays an essential role in intelligent transportation systems. Accurately capturing the intricate spatio-temporal dependencies in traffic network signals is the core of precise prediction. Recently, a paradigm that models spatio-temporal dependencies through graph neural networks and time series models has become one of the most promising methods to solve this problem. However, existing methods still have limitations: they model dynamic spatial dependencies ineffectively and incur high time and space complexity. To address these issues, we propose a simplified yet powerful general spatio-temporal traffic flow forecasting model called LightST. Specifically, LightST first embeds temporal covariates and spatial position information to enhance its spatio-temporal modeling capabilities. Then, stacked temporal linear layers are introduced to capture temporal dependencies efficiently. Finally, we propose a concise adaptive spatio-temporal embedding graph convolution method that extracts implicit spatial dependencies over time via dynamic graph convolution with adaptive spatio-temporal embedding graph generation. Extensive experimental results on four public traffic flow datasets demonstrate the superiority of LightST in computational efficiency and prediction performance.
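The abstract mentions graph convolution over an adaptively generated graph but gives no formulas. As a rough illustration only, the sketch below uses a construction that is common in adaptive-graph traffic models: an adjacency matrix computed as a row-wise softmax over the rectified dot products of learnable node embeddings, followed by one propagation step. This is an assumption about the general technique, not LightST's actual layer; the function names, toy embeddings, and sensor readings are all hypothetical.

```python
import math

def adaptive_adjacency(emb):
    """Row-normalised adjacency from node embeddings: softmax(relu(E @ E^T)).

    Assumed construction for illustration; not taken from the LightST paper.
    """
    n = len(emb)
    adj = []
    for i in range(n):
        # Rectified similarity between node i and every node j.
        scores = [max(0.0, sum(a * b for a, b in zip(emb[i], emb[j])))
                  for j in range(n)]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        adj.append([e / total for e in exps])
    return adj

def graph_conv(adj, feats):
    """One propagation step: each node takes a weighted mean of all nodes' features."""
    dim = len(feats[0])
    return [[sum(adj[i][j] * feats[j][d] for j in range(len(feats)))
             for d in range(dim)]
            for i in range(len(adj))]

# Hypothetical toy input: 3 sensors, 2-dimensional embeddings, 1 reading each.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
features = [[10.0], [20.0], [30.0]]

adjacency = adaptive_adjacency(embeddings)
smoothed = graph_conv(adjacency, features)
```

In this toy setup, sensors 0 and 1 have similar embeddings, so sensor 0 weights sensor 1 more heavily than sensor 2. In a trained model the embeddings would be learned parameters, and the abstract suggests they also vary over time.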
IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2517-2528. Published 2025-02-20. https://doi.org/10.1109/TBDATA.2025.3544131
Citations: 0
Higher-Order Community Detection by Motif-Based Modularity Optimization
IF 5.7 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-02-20 DOI: 10.1109/TBDATA.2025.3544129
Jing Xiao;Yu-Cheng Zou;Xiao-Ke Xu
Recently, higher-order community detection based on network motifs has received increasing attention, because motif-based communities reflect not only mesoscale structures but also functional characteristics of real-life networks. In this study, we propose a Modularity Optimization method for Motif-based Community Detection (MOMCD). To approximate the global optimum in modularity optimization, an improved nature-inspired metaheuristic algorithm is proposed as the optimization strategy. In addition, by comprehensively utilizing motif-based (higher-order) and edge-based (lower-order) structural information, a neighbor community modification operation and a local search operation are designed to improve the quality of individuals and promote the convergence of MOMCD. Experimental results show that MOMCD is promising and competitive in identifying motif-based communities in synthetic and real-life networks: it outperforms state-of-the-art approaches in terms of quality and accuracy, and deepens our understanding of network structural and functional characteristics.
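The abstract does not define the motif-based modularity that MOMCD optimizes. One common way to make the idea concrete, assumed here purely for illustration, is to take the triangle as the motif, weight each edge by the number of triangles it belongs to, and evaluate ordinary weighted Newman modularity on the resulting graph. The sketch below does only that scoring step; it does not attempt the metaheuristic search, and the function names and toy graph are hypothetical.

```python
def triangle_weights(edges):
    """Weight each undirected edge by the number of triangles it takes part in.

    Edges that belong to no triangle are dropped. Assumes no self-loops.
    """
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    weights = {}
    for u, v in edges:
        shared = len(nbrs[u] & nbrs[v])  # common neighbours close a triangle
        if shared:
            weights[frozenset((u, v))] = shared
    return weights

def weighted_modularity(weights, community):
    """Newman modularity Q = sum_c [ W_c / m - (d_c / 2m)^2 ] on a weighted graph."""
    total = sum(weights.values())  # m: total edge weight
    if total == 0:
        return 0.0  # no motifs at all, so nothing to score
    inside = 0      # weight of edges whose endpoints share a community
    strength = {}   # d_c: summed weighted degree per community
    for pair, w in weights.items():
        labels = {community[node] for node in pair}
        if len(labels) == 1:
            inside += w
        for node in pair:
            label = community[node]
            strength[label] = strength.get(label, 0) + w
    return inside / total - sum((s / (2 * total)) ** 2 for s in strength.values())

# Hypothetical toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
weights = triangle_weights(edges)

split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}
q_split = weighted_modularity(weights, split)
q_single = weighted_modularity(weights, merged)
```

On this toy graph the bridge edge lies in no triangle and is dropped, so splitting the nodes into the two triangles scores 0.5 while putting everything in one community scores 0. An optimizer of the kind the abstract describes would search over community assignments to maximize such a score.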
IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2529-2544. Published 2025-02-20. https://doi.org/10.1109/TBDATA.2025.3544129
Citations: 0