The widespread use of AI-generated text has introduced significant security concerns, driving the need for reliable detection systems. However, recent studies reveal that neural network-based detectors are vulnerable to adversarial examples. To improve the robustness of such classifiers, a number of adversarial attack strategies have been developed, particularly in the context of text sentiment classification. Most existing adversarial attack methods focus on the semantics of individual words or sentences, often neglecting the broader contextual semantics of the entire text—particularly in the case of long AI-generated text. This limitation frequently results in adversarial examples that lack fluency and coherence. In this paper, we propose a novel method called Sentence-based Adversarial attack on AI-Generated Text detectors (SAGT), which generates linguistically fluent adversarial examples by inserting model-generated sentences into the original text. To ensure contextual semantic consistency, we extract important keywords from the original text—selected based on changes in the detector's confidence score—and incorporate them into the generated sentences. Extensive experimental results demonstrate that adversarial examples crafted by SAGT can effectively evade AI-generated text detectors.
{"title":"Sentences Based Adversarial Attack on AI-Generated Text Detectors","authors":"Rongxin Tu;Xiangui Kang;Chee Wei Tan;Chi-Hung Chi;Kwok-Yan Lam","doi":"10.1109/TBDATA.2025.3600034","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3600034","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"80-91"},"PeriodicalIF":5.7,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"145982243","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
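The keyword-selection step described in the SAGT abstract (keywords "selected based on changes in the detector's confidence score") can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_detector` is a hypothetical stand-in for a real detector, and scoring each word by the confidence drop when it is deleted is an assumption about how the "change" is measured.

```python
from typing import Callable, Dict, List, Tuple


def toy_detector(text: str) -> float:
    """Hypothetical stand-in detector: returns a score for 'AI-generated'."""
    cues = {"delve": 0.3, "tapestry": 0.2, "moreover": 0.1}
    words = text.lower().split()
    score = 0.2 + sum(cues.get(w, 0.0) for w in set(words))
    return min(score, 1.0)


def rank_keywords(
    text: str, detector: Callable[[str], float], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Rank words by how much deleting them lowers the detector's confidence."""
    words = text.split()
    base = detector(text)
    drops: Dict[str, float] = {}
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        drop = base - detector(reduced)
        key = word.lower()
        # Keep the largest drop seen for a repeated word.
        drops[key] = max(drops.get(key, float("-inf")), drop)
    ranked = sorted(drops.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


sample = "Moreover we delve into the rich tapestry of urban data"
keywords = rank_keywords(sample, toy_detector)
print([word for word, _ in keywords])
```

In the method the abstract describes, the selected keywords would then be incorporated into model-generated sentences that are inserted into the original text; that generation step is not shown here.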
Pub Date: 2025-08-19, DOI: 10.1109/TBDATA.2025.3600037
Rong Wang;Na Lv;Xing Huang;Qingwang Guo;Yunpeng Xiao;Chaolong Jia;Haofei Xie
In Intelligent Transportation Systems (ITS), the accuracy of compensating for missing traffic data is critical because it directly impacts the effectiveness of traffic flow prediction and road condition monitoring. Inspired by image restoration techniques, this study introduces the Generative Adversarial Network (GAN) to enhance traffic data compensation. First, to address the problem of converting traffic data into the traffic flow matrix of the road network, we propose the RoadNetIMatrix algorithm. This algorithm precisely captures traffic flow dynamics in road networks and provides a holistic representation of traffic states. Second, given the inherent spatio-temporal correlation in traffic data, we propose a spatio-temporal collaborative mining component (STSSM). This component integrates the hidden temporal dependencies and spatial features mined from the traffic data into the GAN generator to improve the authenticity of the generated content and ensure the consistency of data compensation. Finally, to account for the influence of external characteristics of traffic data on compensation results, we construct an external information module based on a multi-head attention mechanism, which mines the influence of external factors. Furthermore, spatio-temporal and external features are fused to further improve the accuracy of data compensation. Experiments show that the model achieves higher compensation accuracy and better generalization under multiple types of data loss or a high data loss rate.
{"title":"ST-DDGAN: A Traffic Data Compensation Model Based on Image Restoration Technology","authors":"Rong Wang;Na Lv;Xing Huang;Qingwang Guo;Yunpeng Xiao;Chaolong Jia;Haofei Xie","doi":"10.1109/TBDATA.2025.3600037","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3600037","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"62-79"},"PeriodicalIF":5.7,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"145982200","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
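The first step in the ST-DDGAN abstract, converting traffic records into a traffic flow matrix, can be illustrated with a small sketch. The abstract does not describe how RoadNetIMatrix works internally, so the layout below (rows are road segments, columns are time slots, plus a mask marking missing readings) and the names `build_flow_matrix` and `missing_rate` are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple

# (road segment id, time slot index, observed flow)
Record = Tuple[str, int, float]


def build_flow_matrix(
    records: Sequence[Record], segments: Sequence[str], num_slots: int
) -> Tuple[List[List[float]], List[List[int]]]:
    """Return (matrix, mask).

    matrix[i][t] is the flow on segments[i] in slot t (0.0 if unobserved);
    mask[i][t] is 1 where a reading exists and 0 where it is missing.
    """
    row = {seg: i for i, seg in enumerate(segments)}
    matrix = [[0.0] * num_slots for _ in segments]
    mask = [[0] * num_slots for _ in segments]
    for seg, slot, flow in records:
        if seg in row and 0 <= slot < num_slots:
            matrix[row[seg]][slot] = flow
            mask[row[seg]][slot] = 1
    return matrix, mask


def missing_rate(mask: Sequence[Sequence[int]]) -> float:
    """Fraction of matrix cells without an observation."""
    total = sum(len(r) for r in mask)
    observed = sum(sum(r) for r in mask)
    return 1.0 - observed / total


records = [("A", 0, 12.0), ("A", 2, 15.0), ("B", 1, 7.0)]
matrix, mask = build_flow_matrix(records, ["A", "B"], 3)
print(matrix, mask, missing_rate(mask))
```

A matrix with an accompanying mask is the same form an image inpainting model consumes, which is presumably why the abstract frames compensation as image restoration; the generator would be asked to fill the cells where the mask is 0.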
Pub Date: 2025-08-19, DOI: 10.1109/TBDATA.2025.3600014
Lan Zhao;Boyue Wang;Junbin Gao;Xiaoyan Li;Yongli Hu;Baocai Yin
Current multi-modal knowledge graph completion often incorporates simple fusion neural networks to achieve multi-modal alignment and knowledge completion tasks, which face three major challenges: 1) Inconsistent semantics between images and texts corresponding to the same entity; 2) Discrepancies in semantic spaces resulting from the use of diverse uni-modal feature extractors; 3) Inadequate evaluation of semantic alignment using only energy functions or basic contrastive learning losses. To address these challenges, we propose the Multi-modal Entity in One Word (MEOW) model. This model ensures alignment at various levels, including text-image match alignment, feature alignment, and distribution alignment. Specifically, the entity image filtering module utilizes a visual-language model to exclude unrelated images by aligning their captions with corresponding text descriptions. A pre-trained CLIP-based encoder encodes dense semantic relationships, while a structure encoder based on a graph attention network handles sparse semantic relationships, yielding a comprehensive semantic representation and enhancing convergence speed. Additionally, a diffusion model is integrated to enhance denoising capabilities. MEOW further includes a distribution alignment module equipped with a dense alignment constraint, an integrity alignment constraint, and a fusion fidelity constraint to effectively align multi-modal representations. Experiments on two public multi-modal knowledge graph datasets show that MEOW significantly improves link prediction performance.
{"title":"Multi-Modal Entity in One Word: Aligning Multi-Level Semantics for Multi-Modal Knowledge Graph Completion","authors":"Lan Zhao;Boyue Wang;Junbin Gao;Xiaoyan Li;Yongli Hu;Baocai Yin","doi":"10.1109/TBDATA.2025.3600014","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3600014","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 6","pages":"3539-3552"},"PeriodicalIF":5.7,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"145493294","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
Pub Date: 2025-08-19, DOI: 10.1109/TBDATA.2025.3600004
Bo Liu;Lingling Tao;Xiaodan Chen;Zhijun Li
The dynamics of multivariate time series (MTS) data are jointly characterized by their nonlinear temporal dependencies and complex variable dependencies, making unsupervised time series anomaly detection a challenging task. Existing methods primarily rely on prediction or reconstruction errors, neglecting the valuable information within the variable dependencies. In this paper, we propose a variable dependency discrepancy-based Transformer (VDDFormer) for unsupervised MTS anomaly detection. VDDFormer comprises a variable correlation encoder, a temporal dependency encoder, and a reconstruction decoder. The variable correlation encoder relies on a variable dependency attention mechanism, which employs self-attention to learn the global variable dependencies, while the local variable dependencies are captured by an adaptive correlation matrix. The global and local variable dependencies are then used to compute the variable dependency discrepancy as a new intrinsic property to distinguish between normal and abnormal patterns. By integrating this new discrepancy with the reconstruction error, the model enhances its anomaly differentiation capability. Extensive experiments on five real-world anomaly detection datasets demonstrate that VDDFormer effectively and robustly detects group anomaly patterns by leveraging the variable dependency discrepancy and achieves state-of-the-art performance on four of the five datasets.
{"title":"VDDFormer: A Variable Dependency Discrepancy-Based Transformer for Multivariate Time Series Anomaly Detection","authors":"Bo Liu;Lingling Tao;Xiaodan Chen;Zhijun Li","doi":"10.1109/TBDATA.2025.3600004","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3600004","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"34-46"},"PeriodicalIF":5.7,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"145982201","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
Pub Date: 2025-08-19, DOI: 10.1109/TBDATA.2025.3600035
Luyi Bai;Jixuan Dong;Lin Zhu
Temporal knowledge graph (TKG) querying retrieves potential answers by parsing questions that incorporate temporal constraints, and is regarded as a vital downstream task among TKG applications. Enhancing query accuracy and the user experience has become a focal point for researchers. Existing TKG query methods aim to execute unambiguous standard query statements and return results, while neglecting the potential ambiguity in user input queries. To overcome this problem, we propose a semantic query model for temporal knowledge graphs, TKGSQ-PM (Temporal Knowledge Graph Semantic Query based on Pre-trained Model). The model first identifies and extracts entity and temporal information from TKG queries and obtains the corresponding TKG embeddings using embedding methods. Then, it uses the pre-trained model DistilBERT to infer the true query intent from user input queries. Finally, it performs comprehensive sorting to return high-quality query results. We conduct multiple experiments on three different datasets to demonstrate the efficiency and effectiveness of the proposed method. Experimental results indicate that TKGSQ-PM has an overall advantage over baseline models in query effectiveness and efficiency.
{"title":"Intent-Driven Semantic Query: An Effective Approach for Temporal Knowledge Graph Query","authors":"Luyi Bai;Jixuan Dong;Lin Zhu","doi":"10.1109/TBDATA.2025.3600035","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3600035","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"12 1","pages":"92-104"},"PeriodicalIF":5.7,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"145982297","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
Pub Date: 2025-07-10, DOI: 10.1109/TBDATA.2024.3451088
Shuo Shang;Qi Liu;Renhe Jiang;Ryosuke Shibasaki;Panos Kalnis;Christian S. Jensen
{"title":"Editorial High-Performance Recommender Systems Based on Spatiotemporal Data","authors":"Shuo Shang;Qi Liu;Renhe Jiang;Ryosuke Shibasaki;Panos Kalnis;Christian S. Jensen","doi":"10.1109/TBDATA.2024.3451088","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3451088","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 4","pages":"1588-1588"},"PeriodicalIF":7.5,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11077801","platform":"Semanticscholar","paperid":"144597786","RegionNum":3,"RegionCategory":"Computer Science","PMCID":"OA","Total":0}
Pub Date: 2025-07-10, DOI: 10.1109/TBDATA.2024.3485316
Desheng Dash Wu;David L. Olson
This special issue deals with research on applications of, and methods to support, Big Data analytics in complex social information networks. The digital age and the rise of social media have sped up changes to social systems, with unforeseen consequences. These changes have also created major challenges.
{"title":"Editorial: Big Data Analytics in Complex Social Information Networks","authors":"Desheng Dash Wu;David L. Olson","doi":"10.1109/TBDATA.2024.3485316","DOIUrl":"https://doi.org/10.1109/TBDATA.2024.3485316","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 4","pages":"1650-1651"},"PeriodicalIF":7.5,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11077792","platform":"Semanticscholar","paperid":"144598007","RegionNum":3,"RegionCategory":"Computer Science","PMCID":"OA","Total":0}
Pub Date: 2025-04-18, DOI: 10.1109/TBDATA.2025.3562486
Wenxin Zhang;Cuicui Luo
Graph Neural Networks (GNNs) have been widely applied in fraud detection tasks, exhibiting substantial improvements in detection performance compared to conventional methodologies. However, within the intricate structure of fraud graphs, fraudsters usually camouflage themselves among a large number of benign entities. An effective way to address the camouflage problem is to incorporate complex and abundant edge information. Nevertheless, existing GNN-based methods frequently neglect to integrate this crucial information into the message passing process, thereby limiting their efficacy. To address these issues, this study proposes a novel Gated Edge-augmented Graph Neural Network (GE-GNN). Our approach begins with an edge-based feature augmentation mechanism that leverages both node and edge features within a single relation. Subsequently, we apply the augmented representation to the message passing process to update the node embeddings. Furthermore, we design a gate logistic to regulate the expression of augmented information. Finally, we integrate node features across different relations to obtain a comprehensive representation. Extensive experimental results on two real-world datasets demonstrate that the proposed method outperforms several state-of-the-art methods.
{"title":"GE-GNN: Gated Edge-Augmented Graph Neural Network for Fraud Detection","authors":"Wenxin Zhang;Cuicui Luo","doi":"10.1109/TBDATA.2025.3562486","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3562486","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 4","pages":"1664-1676"},"PeriodicalIF":7.5,"publicationDate":"2025-04-18","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144598065","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
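The GE-GNN abstract outlines three steps within a single relation: augment node features with edge features, gate the augmented information, and use it in message passing. A minimal sketch of one such round follows. The abstract gives no formulas, so every concrete choice here is an assumption: augmentation by element-wise addition, a scalar sigmoid gate with hypothetical weights `gate_weights`, mean aggregation, and a residual update.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_edge_update(
    node_feats: Dict[str, Vector],
    edges: Dict[Tuple[str, str], Vector],
    gate_weights: Vector,
) -> Dict[str, Vector]:
    """One round of mean aggregation over gated, edge-augmented messages.

    edges maps (src, dst) to an edge feature vector with the same length
    as the node feature vectors.
    """
    sums = {v: [0.0] * len(f) for v, f in node_feats.items()}
    counts = {v: 0 for v in node_feats}
    for (src, dst), edge_feat in edges.items():
        # Edge-based augmentation: combine sender features with edge features.
        augmented = [h + e for h, e in zip(node_feats[src], edge_feat)]
        # Scalar gate in (0, 1) regulating how much augmented signal passes.
        gate = sigmoid(sum(w * a for w, a in zip(gate_weights, augmented)))
        for k, a in enumerate(augmented):
            sums[dst][k] += gate * a
        counts[dst] += 1
    updated = {}
    for v, feat in node_feats.items():
        if counts[v] == 0:
            updated[v] = list(feat)  # no incoming messages: keep features
        else:
            updated[v] = [feat[k] + sums[v][k] / counts[v] for k in range(len(feat))]
    return updated


node_feats = {"u": [1.0, 0.0], "v": [0.0, 1.0]}
edges = {("u", "v"): [0.5, 0.5]}
updated = gated_edge_update(node_feats, edges, gate_weights=[0.0, 0.0])
print(updated)
```

With zero gate weights the gate is exactly 0.5, so node `v` receives half of the augmented message `[1.5, 0.5]`. The final step in the abstract, integrating node features across different relations, would run this per relation and combine the results; it is not shown.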
Pub Date: 2025-04-08, DOI: 10.1109/TBDATA.2025.3558855
Faqian Guan;Tianqing Zhu;Wanlei Zhou;Philip S. Yu
Graph neural networks (GNNs) have attracted considerable attention due to their ability to leverage the inherent topological and node information present in graph data. While extensive research has been conducted on privacy attacks targeting machine learning models, the exploration of privacy risks associated with node-level membership inference attacks on GNNs remains relatively limited. GNNs learn representations that encapsulate valuable information about the nodes. These learned representations can be exploited by attackers to infer whether a specific node belongs to the training dataset, leading to the disclosure of sensitive information. The insidious nature of such privacy breaches often leads to an underestimation of the associated risks. Furthermore, the inherent challenges of node membership inference make it difficult to develop attack models for GNNs that successfully infer node membership. We propose a more efficient approach that specifically targets node-level membership inference attacks on GNNs. First, we combine nodes and their respective neighbors to carry out node membership inference attacks. To address the challenge of variable-length features arising from the differing number of neighboring nodes, we introduce an effective feature processing strategy. Furthermore, we propose two strategies, multiple training of shadow models and random selection of non-membership data, to enhance the performance of the attack model. We empirically evaluate the efficacy of our proposed method using three benchmark datasets. Additionally, we explore two potential defense mechanisms against node-level membership inference attacks.
{"title":"Topology-Based Node-Level Membership Inference Attacks on Graph Neural Networks","authors":"Faqian Guan;Tianqing Zhu;Wanlei Zhou;Philip S. Yu","doi":"10.1109/TBDATA.2025.3558855","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3558855","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2809-2826"},"PeriodicalIF":5.7,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144934350","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
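The membership-inference abstract mentions combining a node with its neighbors and handling "variable-length features arising from the differing number of neighboring nodes". One possible way to obtain a fixed-length attack input is sketched below. The abstract does not say which feature processing strategy the authors use, so the choice of element-wise mean and max over neighbor posteriors, and the name `attack_features`, are assumptions.

```python
from typing import List, Sequence


def attack_features(
    node_posterior: Sequence[float],
    neighbor_posteriors: Sequence[Sequence[float]],
) -> List[float]:
    """Build a fixed-length feature vector for an attack classifier.

    Concatenates the target node's posterior with the element-wise mean
    and max of its neighbours' posteriors, so the output length does not
    depend on how many neighbours the node has.
    """
    n = len(node_posterior)
    if neighbor_posteriors:
        count = len(neighbor_posteriors)
        mean = [sum(p[k] for p in neighbor_posteriors) / count for k in range(n)]
        peak = [max(p[k] for p in neighbor_posteriors) for k in range(n)]
    else:
        # Isolated node: pad with zeros to keep the length fixed.
        mean = [0.0] * n
        peak = [0.0] * n
    return list(node_posterior) + mean + peak


two_neighbors = attack_features([1.0, 0.0], [[0.75, 0.25], [0.25, 0.75]])
no_neighbors = attack_features([1.0, 0.0], [])
print(two_neighbors, no_neighbors)
```

Both calls return vectors of length 6, which is what lets a single attack model be trained on shadow-model outputs for nodes of any degree.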
Pub Date: 2025-04-01, DOI: 10.1109/TBDATA.2025.3556636
Hanqi Zhang;Yandong Zheng;Chang Xu;Liehuang Zhu;Jiayin Wang
With the rapid development of cloud computing, online health monitoring systems are becoming increasingly prevalent. To protect medical data privacy while supporting search operations, Dynamic Searchable Symmetric Encryption (DSSE) technology has been widely used in health monitoring systems. For better monitoring of patient status, keyword range query is also a necessary requirement for the DSSE scheme. Furthermore, in the multi-user setting, user revocation usually forces the owner to download and re-encrypt all indexes, resulting in significant computational overhead. In this paper, we propose a lightweight revocable DSSE scheme with range query support. First, we propose a novel privacy-preserving range query algorithm that defends against plaintext inference attacks. Second, we design a singly linked list structure based on delegatable pseudorandom functions and key-updatable pseudorandom functions, which supports lightweight user revocation. Rigorous security analysis proves the security of our proposed range query scheme and demonstrates that our scheme achieves forward and backward privacy. Experimental evaluations show that our scheme is highly efficient.
{"title":"Revocable DSSE in Healthcare Systems With Range Query Support","authors":"Hanqi Zhang;Yandong Zheng;Chang Xu;Liehuang Zhu;Jiayin Wang","doi":"10.1109/TBDATA.2025.3556636","DOIUrl":"https://doi.org/10.1109/TBDATA.2025.3556636","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 5","pages":"2764-2778"},"PeriodicalIF":5.7,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"144934474","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
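The DSSE abstract does not describe how its range query algorithm works. As background only, the sketch below shows a generic building block often used to reduce a range query to a small set of keyword queries: covering the range with dyadic intervals, each identified by a binary prefix. This is a swapped-in textbook technique, not the authors' algorithm, and the function name `dyadic_cover` is hypothetical.

```python
from typing import List


def dyadic_cover(lo: int, hi: int, bits: int) -> List[str]:
    """Return binary prefixes whose subtrees exactly cover [lo, hi].

    Values live in [0, 2**bits). Each prefix names a node of the binary
    tree over that domain; the empty prefix is the root (whole domain).
    """
    if not 0 <= lo <= hi < (1 << bits):
        raise ValueError("range must satisfy 0 <= lo <= hi < 2**bits")
    prefixes: List[str] = []

    def visit(prefix: str, node_lo: int, node_hi: int) -> None:
        if hi < node_lo or node_hi < lo:
            return  # node interval is disjoint from the query range
        if lo <= node_lo and node_hi <= hi:
            prefixes.append(prefix)  # node interval lies fully inside
            return
        mid = (node_lo + node_hi) // 2
        visit(prefix + "0", node_lo, mid)
        visit(prefix + "1", mid + 1, node_hi)

    visit("", 0, (1 << bits) - 1)
    return prefixes


cover = dyadic_cover(2, 6, 3)
print(cover)
```

For the range [2, 6] in a 3-bit domain the cover is `01` (values 2 to 3), `10` (values 4 to 5), and `110` (value 6). In a searchable-encryption setting each prefix would typically be treated as a keyword, so a range query becomes a handful of keyword searches; how the scheme in this paper hides the range itself is not covered by this sketch.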