Wacml: an imbalanced node classification algorithm based on graph neural networks

IF 4.3, Zone 3 (Materials Science), Q1 ENGINEERING, ELECTRICAL & ELECTRONIC, ACS Applied Electronic Materials. Pub Date: 2024-08-30. DOI: 10.1007/s00530-024-01454-1
Junfeng Wang, Jiayue Yang, Lidun
Citations: 0

Abstract

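The large number of bot accounts on social media has had a negative social impact. In most cases the distribution of bot accounts and real human accounts is imbalanced, so the minority class is under-represented and classified poorly. Graph neural networks (GNNs) make effective use of user interactions and are widely used to process graph-structured data, and they have achieved good performance in bot detection. However, previous GNN-based bot detection methods mostly considered only the quantity imbalance between classes. In graph-structured data, imbalance also arises from differences in the position and structure of labeled nodes, which biases GNN predictions toward the larger classes. Because these methods do not account for the connectivity issues specific to graph structure, their node classification performance is not ideal. To address these shortcomings, this paper proposes a class-imbalanced node classification algorithm based on minority weighting and an abnormal connectivity margin loss. It extends the traditional imbalanced classification ideas of machine learning to graph-structured data and jointly handles quantity imbalance and abnormal connectivity in the graph structure, improving the GNN’s perception of connection anomalies. In the node feature aggregation stage, weighted aggregation is applied to minority classes. In the oversampling stage, the SMOTE algorithm is used to process the imbalanced data while taking node representation and topology into account. An edge generator is trained at the same time to model relational information; combined with the abnormal connectivity margin loss, this strengthens the model’s learning of connectivity information and greatly improves the quality of the edge generator. Finally, the method is evaluated on a publicly available dataset, and the experimental results show that it performs well on imbalanced node classification.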
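
The oversampling stage mentioned in the abstract relies on SMOTE, which creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The following is a minimal sketch of that generic interpolation step only, applied to toy node embeddings. It is not the paper's implementation: the function name `smote_oversample`, its parameters, and the example embeddings are hypothetical, and the sketch ignores graph topology and the edge generator that the abstract describes.

```python
import math
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic vectors by SMOTE-style interpolation.

    minority: list of equal-length feature vectors (lists of floats)
              belonging to the minority class.
    k:        number of nearest minority neighbours to choose from.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        # Pick a random point on the segment from base to neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority-class node embeddings (hypothetical values).
minority_embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_nodes = smote_oversample(minority_embeddings, n_new=4)
print(len(new_nodes))  # 4
```

Every synthetic vector lies on a segment between two existing minority vectors, so the new samples stay inside the region the minority class already occupies. In a graph setting the new nodes would also need edges, which is the role the abstract assigns to the edge generator.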

Source journal
CiteScore: 7.20
Self-citation rate: 4.30%
Articles published: 567