IEEE Transactions on Big Data: Latest Articles

Scalable Learning-Based Community-Preserving Graph Generation
IF 5.7; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-01-27; DOI: 10.1109/TBDATA.2025.3533898
Sheng Xiang;Chenhao Xu;Dawei Cheng;Ying Zhang
Graph generation plays an essential role in understanding the formation of complex network structures across various fields, such as biological and social networks. Recent studies have shifted towards employing deep learning methods to grasp the topology of graphs. Yet, most current graph generators fail to adequately capture the community structure, which stands out as a critical and distinctive aspect of graphs. Additionally, these generators are generally limited to smaller graphs because of their inefficiencies and scaling challenges. This paper introduces the Community-Preserving Graph Adversarial Network (CPGAN), designed to effectively simulate graphs. CPGAN leverages graph convolution networks within its encoder and maintains shared parameters during generation to encapsulate community structure data and ensure permutation invariance. We also present the Scalable Community-Preserving Graph Attention Network (SCPGAN), aimed at enhancing the scalability of our model. SCPGAN considerably cuts down on inference and training durations, as well as GPU memory usage, through the use of an ego-graph sampling approach and a short-pipeline autoencoder framework. Tests conducted on six real-world graph datasets reveal that CPGAN manages a beneficial balance between efficiency and simulation quality when compared to leading-edge baselines. Moreover, SCPGAN marks substantial strides in model efficiency and scalability, successfully increasing the size of generated graphs to the 10 million node level while maintaining competitive quality, on par with other advanced learning models.
Published in: IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2457-2470 (Journal Article)
Citations: 0
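
The abstract of "Scalable Learning-Based Community-Preserving Graph Generation" credits part of SCPGAN's memory savings to ego-graph sampling. As a rough illustration of what an ego-graph is, the sketch below extracts the radius-hop neighborhood of one node with a breadth-first search. The function name `ego_graph`, the adjacency-dict input, and the `radius` parameter are assumptions made for this example; this is not the sampler used in the paper.

```python
from collections import deque


def ego_graph(adjacency, center, radius=1):
    """Return (nodes, edges) of the radius-hop ego-graph around `center`.

    `adjacency` maps each node to an iterable of its neighbors
    (undirected graph, hypothetical input format for this sketch).
    """
    depth = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if depth[node] == radius:
            continue  # do not expand beyond the requested radius
        for neighbor in adjacency.get(node, ()):
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    nodes = set(depth)
    # Keep every edge whose two endpoints both fall inside the ego-graph.
    edges = {
        tuple(sorted((u, v)))
        for u in nodes
        for v in adjacency.get(u, ())
        if v in nodes
    }
    return nodes, edges
```

Training on many such small subgraphs instead of the full adjacency structure is one common way to bound per-step memory; whether SCPGAN samples exactly this way is not stated in the abstract.
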
Data Exchange for the Metaverse With Accountable Decentralized TTPs and Incentive Mechanisms
IF 5.7; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-01-27; DOI: 10.1109/TBDATA.2025.3533924
Liang Zhang;Xingyu Wu;Yuhang Ma;Haibin Kan
As a global virtual environment, the metaverse poses various challenges regarding data storage, sharing, interoperability, and privacy preservation. Typically, a trusted third party (TTP) is considered necessary in these scenarios. However, relying on a single TTP may introduce biases, compromise privacy, or lead to a single-point-of-failure problem. To address these challenges and enable secure data exchange in the metaverse, we propose a system based on decentralized TTPs and the Ethereum blockchain. First, we use the threshold ElGamal cryptosystem to create the decentralized TTPs, employing verifiable secret sharing (VSS) to force owners to share data honestly. Second, we leverage the Ethereum blockchain to serve as the public communication channel, automatic verification machine, and smart contract engine. Third, we apply discrete logarithm equality (DLEQ) algorithms to generate non-interactive zero-knowledge (NIZK) proofs when encrypted data is uploaded to the blockchain. Fourth, we present an incentive mechanism that rewards data owners and TTPs for data-sharing activities, as well as a penalty policy that applies if malicious behavior is detected. Consequently, we construct a data exchange framework for the metaverse in which all involved entities are accountable. Finally, we perform comprehensive experiments to demonstrate the feasibility and analyze the properties of the proposed system.
Published in: IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2431-2442 (Journal Article)
Citations: 0
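
The abstract of "Data Exchange for the Metaverse With Accountable Decentralized TTPs and Incentive Mechanisms" mentions DLEQ-based NIZK proofs. A minimal sketch of one standard construction, a Chaum-Pedersen proof made non-interactive with the Fiat-Shamir heuristic, is given below. It shows that two public values `a = G^x` and `b = H^x` share the same secret exponent `x` without revealing it. The group parameters `P`, `Q`, `G`, `H` are toy values chosen for this example and are far too small to be secure; the function names and the hash encoding are likewise assumptions, not the paper's implementation.

```python
import hashlib
import secrets

# Toy group, insecure by design: P = 2*Q + 1 with P and Q prime.
P = 2039
Q = 1019
G = 4  # 2^2 mod P, generates the subgroup of order Q
H = 9  # 3^2 mod P, a second generator of the same subgroup


def _challenge(*values):
    """Fiat-Shamir challenge: hash the transcript into an integer mod Q."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def dleq_prove(x):
    """Prove knowledge of x such that a = G^x and b = H^x (mod P)."""
    a, b = pow(G, x, P), pow(H, x, P)
    w = secrets.randbelow(Q)             # one-time random nonce
    t1, t2 = pow(G, w, P), pow(H, w, P)  # commitments
    c = _challenge(G, H, a, b, t1, t2)
    s = (w + c * x) % Q                  # response
    return (a, b), (t1, t2, s)


def dleq_verify(statement, proof):
    """Check G^s == t1 * a^c and H^s == t2 * b^c (mod P)."""
    a, b = statement
    t1, t2, s = proof
    c = _challenge(G, H, a, b, t1, t2)
    ok_g = pow(G, s, P) == (t1 * pow(a, c, P)) % P
    ok_h = pow(H, s, P) == (t2 * pow(b, c, P)) % P
    return ok_g and ok_h
```

Because the verifier recomputes the challenge from the public transcript, no interaction with the prover is needed, which is what makes this style of proof suitable for checking by a smart contract or any other public verifier.
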
A Multiplex Hypergraph Attribute-Based Graph Collaborative Filtering for Cold-Start POI Recommendation
IF 5.7; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-01-27; DOI: 10.1109/TBDATA.2025.3533908
Simon Nandwa Anjiri;Derui Ding;Yan Song;Ying Sun
Within the scope of location-based services and personalized recommendations, the challenges of recommending new and unvisited points of interest (POIs) to mobile users are compounded by the sparsity of check-in data. Traditional recommendation models often overlook user and POI attributes, which exacerbates data sparsity and cold-start problems. To address this issue, a novel multiplex hypergraph attribute-based graph collaborative filtering method is proposed for POI recommendation to create a robust recommendation system capable of handling sparse data and cold-start scenarios. Specifically, a multiplex network hypergraph is first constructed to capture complex relationships between users, POIs, and attributes based on the similarities of attributes, visit frequencies, and preferences. Then, an adaptive variational graph auto-encoder adversarial network is developed to accurately infer the users’/POIs’ preference embeddings from their attribute distributions, which reflect complex attribute dependencies and latent structures within the data. Moreover, a dual graph neural network variant based on both GraphSAGE K-nearest neighbor networks and gated recurrent units is created to effectively capture various attributes of different modalities in a neighborhood, including temporal dependencies in user preferences and spatial attributes of POIs. Finally, experiments conducted on Foursquare and Yelp datasets reveal the superiority and robustness of the developed model compared to some typical state-of-the-art approaches and adequately illustrate its effectiveness in handling cold-start users and POIs.
Published in: IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2401-2416 (Journal Article)
Citations: 0
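
The abstract of "A Multiplex Hypergraph Attribute-Based Graph Collaborative Filtering for Cold-Start POI Recommendation" refers to K-nearest neighbor networks built from attributes. The sketch below shows the general idea under simple assumptions: each item has a numeric attribute vector, similarity is cosine similarity, and each item is linked to its `k` most similar peers. The names `cosine` and `knn_graph` and the sample attribute vectors are hypothetical; the paper's actual graph construction may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_graph(attributes, k=2):
    """Map each item to the names of its k most similar other items.

    `attributes` maps an item name to its attribute vector
    (hypothetical input format for this sketch).
    """
    neighbors = {}
    for name, vector in attributes.items():
        scored = sorted(
            (
                (cosine(vector, other_vector), other_name)
                for other_name, other_vector in attributes.items()
                if other_name != name
            ),
            reverse=True,  # highest similarity first
        )
        neighbors[name] = [other_name for _, other_name in scored[:k]]
    return neighbors
```

A cold-start POI with no check-ins still has attributes, so a graph of this kind gives it neighbors from which information can be propagated.
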
Digital Twin Data Management: A Comprehensive Review
IF 5.7; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-01-27; DOI: 10.1109/TBDATA.2025.3533891
Ezekiel B. Ouedraogo;Ammar Hawbani;Xingfu Wang;Zhi Liu;Liang Zhao;Mohammed A. A. Al-qaness;Saeed Hamood Alsamhi
Digital Twins are virtual representations of physical assets and systems that rely on effective Data Management to integrate, process, and analyze diverse data sources. This article comprehensively examines Data Management challenges, architectures, techniques, and applications in the context of Digital Twins. It explores key issues such as data heterogeneity, quality assurance, scalability, security, and interoperability. The paper outlines architectural approaches like centralized, distributed, cloud-based, and blockchain solutions and Data Management techniques for modeling, integration, fusion, quality management, and visualization. Domain-specific considerations across manufacturing, smart cities, healthcare, and other sectors are discussed. Finally, open research challenges related to standards, real-time data processing, intelligent Data Management, and ethical aspects are highlighted. By synthesizing the state-of-the-art, this review serves as a valuable reference for developing robust Data Management strategies that enable Digital Twin deployments.
Published in: IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2224-2243 (Journal Article)
Citations: 0
2024 Reviewers List*
IF 7.5; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-01-15; DOI: 10.1109/TBDATA.2025.3526356
Published in: IEEE Transactions on Big Data, vol. 11, no. 1, pp. 310-313 (Journal Article). Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10843074
Citations: 0
GCLNet: Generalized Contrastive Learning for Weakly Supervised Temporal Action Localization
IF 5.7; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-01-14; DOI: 10.1109/TBDATA.2025.3528727
Jing Wang;Dehui Kong;Baocai Yin
Weakly supervised temporal action localization (WTAL) aims to precisely locate action instances in given videos by video-level classification supervision, which is partly related to action classification. Most existing localization works directly utilize feature encoders pre-trained for video classification tasks to extract video features, resulting in non-targeted features that lead to incomplete or over-complete action localization. Therefore, we propose Generalized Contrast Learning Network (GCLNet), in which two novel strategies are proposed to improve the pre-trained features. First, to address the issue of over-completeness, GCLNet introduces text information with good context independence and category separability to enrich the expression of video features, and proposes a novel generalized contrastive learning approach for similarity metrics, which pulls features belonging to the same category closer together while pushing those from different categories farther apart. Consequently, it enables more compact intra-class feature learning and ensures accurate action localization. Second, to tackle the problem of incompleteness, we exploit the respective advantages of RGB and Flow features in scene appearance and temporal motion expression, designing a hybrid attention strategy in GCLNet to mutually enhance the features of each channel. This process greatly improves the features by establishing cross-channel consensus. Finally, we conduct extensive experiments on THUMOS14 and ActivityNet1.2, and the results show that our proposed GCLNet can produce more representative action localization features.
Published in: IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2365-2375 (Journal Article)
Citations: 0
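
The GCLNet abstract describes contrastive learning that pulls same-category features together and pushes different-category features apart. The sketch below implements a generic supervised contrastive loss to make that objective concrete. It is a stand-in written for this example, not GCLNet's generalized loss, which the abstract does not specify; the function name, the list-based inputs, and the default `temperature` are assumptions.

```python
import math


def _normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Average, over anchors, of -log softmax probability of each positive.

    Assumes at least two samples and at least one pair sharing a label.
    """
    z = [_normalize(v) for v in features]
    n = len(z)
    losses = []
    for i in range(n):
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor samples.
        top = max(sims.values())
        log_denom = top + math.log(sum(math.exp(s - top) for s in sims.values()))
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        losses.append(-sum(sims[j] - log_denom for j in positives) / len(positives))
    return sum(losses) / len(losses)
```

The loss is small when samples with the same label are already close and large when labels cut across the clusters, which is the behavior the abstract attributes to more compact intra-class features.
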
Multi-View Heterogeneous HyperGNN for Heterophilic Knowledge Combination Prediction
IF 5.7; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-01-08; DOI: 10.1109/TBDATA.2025.3527216
Huijie Liu;Shulan Ruan;Han Wu;Zhenya Huang;Defu Lian;Qi Liu;Enhong Chen
Knowledge combination prediction involves analyzing current knowledge elements and their relationships, then forecasting how these elements, drawn from various fields, can be creatively combined to form new, innovative solutions. This process is critical for countries and businesses to understand future technology trends and promote innovation in an era of rapid scientific and technological advancement. Existing methods often overlook the integration of knowledge combinations from multiple views, along with their inherent heterophily and the dual “many-to-one” property, where a single knowledge combination can include multiple elements, and a single element may belong to various combinations. To this end, we propose a novel framework named Multi-view Heterogeneous HyperGNN for Heterophilic Knowledge Combination Prediction (H3KCP). Specifically, H3KCP first constructs a hypergraph reflecting the dual “many-to-one” property of knowledge combinations, where each hyperedge may contain several nodes and each node can also belong to multiple hyperedges. Next, the framework employs a multi-view fusion approach to model knowledge combinations, considering heterophily and integrating insights from co-occurrence, co-citation, and hierarchical structure-based views. Furthermore, our analysis of H3KCP from a spectral graph perspective offers insights into its rationality. Finally, extensive experiments on real-world patent datasets and the Open Academic Graph dataset validate the effectiveness and efficiency of our approach, yielding significant insights into knowledge combinations.
Published in: IEEE Transactions on Big Data, vol. 11, no. 5, pp. 2321-2337 (Journal Article)
Citations: 0
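
The H3KCP abstract describes a hypergraph in which each hyperedge (a knowledge combination) may contain several nodes and each node (a knowledge element) may belong to several hyperedges. The sketch below represents that dual "many-to-one" property with an incidence matrix and reads both degrees off it. The function names and the sample combinations are hypothetical and only illustrate the data structure, not the paper's model.

```python
def incidence_matrix(hyperedges):
    """Build a node-by-hyperedge 0/1 incidence matrix.

    `hyperedges` maps a hyperedge id to the set of nodes it contains
    (hypothetical input format for this sketch).
    """
    nodes = sorted({node for members in hyperedges.values() for node in members})
    edge_ids = sorted(hyperedges)
    matrix = [
        [1 if node in hyperedges[edge] else 0 for edge in edge_ids]
        for node in nodes
    ]
    return nodes, edge_ids, matrix


def node_degrees(matrix):
    """Number of hyperedges each node belongs to (row sums)."""
    return [sum(row) for row in matrix]


def hyperedge_degrees(matrix):
    """Number of nodes each hyperedge contains (column sums)."""
    return [sum(column) for column in zip(*matrix)]
```

A row sum greater than one means an element is reused across combinations, and a column sum greater than two is exactly what an ordinary pairwise graph cannot express.
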
Advances in Robust Federated Learning: A Survey With Heterogeneity Considerations
IF 7.5; Zone 3 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2025-01-08; DOI: 10.1109/TBDATA.2025.3527202
Chuan Chen;Tianchi Liao;Xiaojun Deng;Zihou Wu;Sheng Huang;Zibin Zheng
In the field of heterogeneous federated learning (FL), the key challenge is to efficiently and collaboratively train models across multiple clients with different data distributions, model structures, task objectives, computational capabilities, and communication resources. This diversity leads to significant heterogeneity, which increases the complexity of model training. In this paper, we first outline the basic concepts of heterogeneous FL and summarize the research challenges in FL in terms of five aspects: data, model, task, device and communication. In addition, we explore how existing state-of-the-art approaches cope with the heterogeneity of FL, and categorize and review these approaches at three different levels: data-level, model-level, and architecture-level. Subsequently, the paper extensively discusses privacy-preserving strategies in heterogeneous FL environments. Finally, the paper discusses current open issues and directions for future research, aiming to promote the further development of heterogeneous FL.
在异构联邦学习(FL)领域,关键的挑战是如何跨多个具有不同数据分布、模型结构、任务目标、计算能力和通信资源的客户端高效协作地训练模型。这种多样性导致了显著的异质性,从而增加了模型训练的复杂性。本文首先概述了异构语音识别的基本概念,并从数据、模型、任务、设备和通信五个方面总结了异构语音识别的研究挑战。此外,我们探讨了现有的最先进的方法如何应对FL的异质性,并在三个不同的层次上对这些方法进行了分类和回顾:数据级、模型级和体系结构级。随后,本文广泛讨论了异构FL环境下的隐私保护策略。最后,讨论了当前存在的问题和未来的研究方向,旨在促进异质FL的进一步发展。
{"title":"Advances in Robust Federated Learning: A Survey With Heterogeneity Considerations","authors":"Chuan Chen;Tianchi Liao;Xiaojun Deng;Zihou Wu;Sheng Huang;Zibin Zheng","doi":"10.1109/TBDATA.2025.3527202","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3527202","url":null,"abstract":"In the field of heterogeneous federated learning (FL), the key challenge is to efficiently and collaboratively train models across multiple clients with different data distributions, model structures, task objectives, computational capabilities, and communication resources. This diversity leads to significant heterogeneity, which increases the complexity of model training. In this paper, we first outline the basic concepts of heterogeneous FL and summarize the research challenges in FL in terms of five aspects: data, model, task, device and communication. In addition, we explore how existing state-of-the-art approaches cope with the heterogeneity of FL, and categorize and review these approaches at three different levels: data-level, model-level, and architecture-level. Subsequently, the paper extensively discusses privacy-preserving strategies in heterogeneous FL environments. Finally, the paper discusses current open issues and directions for future research, aiming to promote the further development of heterogeneous FL.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1548-1567"},"PeriodicalIF":7.5,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143949337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Emulating Reader Behaviors for Fake News Detection
IF 5.7 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-01-08 DOI: 10.1109/TBDATA.2025.3527230
Junwei Yin;Min Gao;Kai Shu;Zehua Zhao;Yinqiu Huang;Jia Wang
The wide dissemination of fake news has affected our lives in many aspects, making fake news detection important and attracting increasing attention. Existing approaches make substantial contributions in this field by modeling news from a single-modal or multi-modal perspective. However, these modal-based methods can result in sub-optimal outcomes as they ignore reader behaviors in news consumption and authenticity verification. For instance, they have not taken into consideration the component-by-component reading process: from the headline, images, and comments to the body, which is essential for modeling news with more granularity. To this end, we propose an approach of Emulating the behaviors of readers (Ember) for fake news detection on social media, incorporating readers' reading and verification process to thoroughly model news from the component perspective. Specifically, we first construct intra-component feature extractors to emulate the behavior of semantic analysis on each component. Then, we design a module that comprises inter-component feature extractors and a sequence-based aggregator. This module mimics the process of verifying the correlation between components and the overall reading and verification sequence. Thus, Ember can handle news with various components by emulating the corresponding sequences. We conduct extensive experiments on nine real-world datasets, and the results demonstrate the superiority of Ember.
Citations: 0
Portraying Fine-Grained Tenant Portrait for Churn Prediction Using Semi-Supervised Graph Convolution and Attention Network
IF 5.7 Zone 3 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2025-01-08 DOI: 10.1109/TBDATA.2025.3527200
Zuodong Jin;Peng Qi;Muyan Yao;Dan Tao
With the widespread application of Big Data and intelligent information systems, the tenant has become the main form of most scenarios. As a data mining technique, the portrait has been widely used to provide targeted services. Therefore, we transfer the traditional user-driven portrait into a tenant-driven one for churn prediction. To achieve this, the paper first proposes a three-layer architecture and defines the fine-grained features for creating portraits from the perspective of tenants. On a large-scale telecommunication industry dataset of 100,000 tenants, we construct the tenant portrait through the proposed framework and analyze the influence of the defined features on churn possibility. Then, considering the missing information caused by privacy concerns, we come up with CrossMatch, a portrait completion model based on semi-supervised learning and graph convolution, which combines the relation characteristics among tenants to recover missing information. On this basis, we design a tenant churn prediction method based on a directed attention network. Moreover, we recover missing information on three public node datasets with CrossMatch, achieving around 1-2% improvement. We then apply the directed attention network for churn prediction and achieve an Accuracy of 75.06%, Precision of 77.78%, and F1-score of 71.43%, which outperforms all the baselines.
Citations: 0