SDCG: Silhouette-based Deep Clustering with GNN for Improved Graph Node Clustering

Hyesoo Shin, Eunjo Jang, Sojeong Kim, Ki Yong Lee
Published in: 2023 IEEE/ACIS 21st International Conference on Software Engineering Research, Management and Applications (SERA)
Publication date: 2023-05-23
DOI: 10.1109/SERA57763.2023.10197683 (https://doi.org/10.1109/SERA57763.2023.10197683)

Abstract

Graph Neural Networks (GNNs) are powerful tools for analyzing graph-structured data in various fields because of their strong expressive power. They use a message-passing mechanism to update node embeddings, which are then used for tasks such as node classification and link prediction. Recently, node embeddings have also been used for graph node clustering, which aims to group similar nodes based on their features and the graph topology. However, conventional node clustering methods are limited in that the GNN focuses only on generating node embeddings, without considering the ultimate objective of clustering. To address this issue, a technique called "Deep Clustering" has been proposed, which integrates the node embedding and clustering stages. This requires defining a new loss function that combines the GNN loss and the clustering loss so that both are minimized simultaneously. Our proposed loss function applies the Silhouette coefficient to account for not only the distance within clusters but also the distance between clusters, which enables better clustering results. In this paper, we propose Silhouette-based Deep Clustering with GNN (SDCG), which clusters the nodes of a graph more effectively by iteratively training the embedding model to produce embedding vectors that yield improved clustering results. Through extensive experiments, we demonstrate that SDCG outperforms the conventional approach of performing embedding and clustering independently.
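To make the role of the Silhouette coefficient concrete, the sketch below computes its standard definition, s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from node i to the other members of its cluster and b(i) is the smallest mean distance from i to the members of any other cluster. The abstract does not give the exact form of the SDCG loss, so `combined_loss`, its `weight` parameter, and the `1 - silhouette` penalty are illustrative assumptions rather than the authors' formulation. A trainable version would also need a differentiable implementation in an autograd framework; this one only evaluates the value.

```python
import math


def mean_silhouette(points, labels):
    """Mean Silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the Silhouette coefficient needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to every member other than i itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    scores = []
    for i, lab in enumerate(labels):
        if len(clusters[lab]) == 1:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        a = mean_dist(i, clusters[lab])  # within-cluster distance
        b = min(mean_dist(i, members)    # distance to the nearest other cluster
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def combined_loss(embedding_loss, points, labels, weight=1.0):
    """Hypothetical joint objective (an assumption, not the paper's formula).

    1 - silhouette lies in [0, 2] and shrinks as clusters become tighter
    and better separated, so minimizing it rewards good clusterings.
    """
    return embedding_loss + weight * (1.0 - mean_silhouette(points, labels))


# Toy 2-D "embeddings": two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]  # groups nearby points together
bad_labels = [0, 1, 0, 1]   # groups distant points together

good_score = mean_silhouette(points, good_labels)
bad_score = mean_silhouette(points, bad_labels)
```

On this toy input the assignment that groups nearby points scores higher than the one that pairs distant points, so the hypothetical combined loss is lower for it at the same embedding loss. That is the behavior a clustering-aware objective is meant to reward.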