利用图卷积网络的深度潜位置模型进行聚类

IF 1.4 4区计算机科学 Q2 STATISTICS & PROBABILITY Advances in Data Analysis and Classification Pub Date : 2024-03-12 DOI:10.1007/s11634-024-00583-9

Dingge Liang, Marco Corneli, Charles Bouveyron, Pierre Latouche

{"title":"利用图卷积网络的深度潜位置模型进行聚类","authors":"Dingge Liang, Marco Corneli, Charles Bouveyron, Pierre Latouche","doi":"10.1007/s11634-024-00583-9","DOIUrl":null,"url":null,"abstract":"<p>With the significant increase of interactions between individuals through numeric means, clustering of nodes in graphs has become a fundamental approach for analyzing large and complex networks. In this work, we propose the deep latent position model (DeepLPM), an end-to-end generative clustering approach which combines the widely used latent position model (LPM) for network analysis with a graph convolutional network encoding strategy. Moreover, an original estimation algorithm is introduced to integrate the explicit optimization of the posterior clustering probabilities via variational inference and the implicit optimization using stochastic gradient descent for graph reconstruction. Numerical experiments on simulated scenarios highlight the ability of DeepLPM to self-penalize the evidence lower bound for selecting the number of clusters, demonstrating its clustering capabilities compared to state-of-the-art methods. Finally, DeepLPM is further applied to an ecclesiastical network in Merovingian Gaul and to a citation network Cora to illustrate the practical interest in exploring large and complex real-world networks.</p>","PeriodicalId":49270,"journal":{"name":"Advances in Data Analysis and Classification","volume":"35 1","pages":""},"PeriodicalIF":1.4000,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Clustering by deep latent position model with graph convolutional network\",\"authors\":\"Dingge Liang, Marco Corneli, Charles Bouveyron, Pierre Latouche\",\"doi\":\"10.1007/s11634-024-00583-9\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>With the significant increase of interactions between individuals through numeric means, clustering of nodes in graphs has become a fundamental approach for analyzing large and complex networks. In this work, we propose the deep latent position model (DeepLPM), an end-to-end generative clustering approach which combines the widely used latent position model (LPM) for network analysis with a graph convolutional network encoding strategy. Moreover, an original estimation algorithm is introduced to integrate the explicit optimization of the posterior clustering probabilities via variational inference and the implicit optimization using stochastic gradient descent for graph reconstruction. Numerical experiments on simulated scenarios highlight the ability of DeepLPM to self-penalize the evidence lower bound for selecting the number of clusters, demonstrating its clustering capabilities compared to state-of-the-art methods. Finally, DeepLPM is further applied to an ecclesiastical network in Merovingian Gaul and to a citation network Cora to illustrate the practical interest in exploring large and complex real-world networks.</p>\",\"PeriodicalId\":49270,\"journal\":{\"name\":\"Advances in Data Analysis and Classification\",\"volume\":\"35 1\",\"pages\":\"\"},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2024-03-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Advances in Data Analysis and Classification\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11634-024-00583-9\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"STATISTICS & PROBABILITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advances in Data Analysis and Classification","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11634-024-00583-9","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}

引用次数: 0

摘要

随着个体间通过数字手段进行交互的显著增加，图中节点的聚类已成为分析大型复杂网络的基本方法。在这项工作中，我们提出了深度潜在位置模型（DeepLPM），这是一种端到端的生成聚类方法，它将广泛用于网络分析的潜在位置模型（LPM）与图卷积网络编码策略相结合。此外，还引入了一种独创的估计算法，将通过变异推理对后验聚类概率的显式优化和使用随机梯度下降进行图重构的隐式优化整合在一起。在模拟场景上进行的数值实验凸显了 DeepLPM 在选择聚类数量时对证据下限进行自我惩罚的能力，证明了它与最先进方法相比的聚类能力。最后，DeepLPM 进一步应用于梅罗文高卢的教会网络和科拉的引文网络，以说明探索大型复杂现实世界网络的实际意义。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

摘要图片

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Clustering by deep latent position model with graph convolutional network

With the significant increase of interactions between individuals through numeric means, clustering of nodes in graphs has become a fundamental approach for analyzing large and complex networks. In this work, we propose the deep latent position model (DeepLPM), an end-to-end generative clustering approach which combines the widely used latent position model (LPM) for network analysis with a graph convolutional network encoding strategy. Moreover, an original estimation algorithm is introduced to integrate the explicit optimization of the posterior clustering probabilities via variational inference and the implicit optimization using stochastic gradient descent for graph reconstruction. Numerical experiments on simulated scenarios highlight the ability of DeepLPM to self-penalize the evidence lower bound for selecting the number of clusters, demonstrating its clustering capabilities compared to state-of-the-art methods. Finally, DeepLPM is further applied to an ecclesiastical network in Merovingian Gaul and to a citation network Cora to illustrate the practical interest in exploring large and complex real-world networks.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Advances in Data Analysis and Classification STATISTICS & PROBABILITY-

CiteScore

3.40

自引率

6.20%

发文量

审稿时长

>12 weeks

期刊介绍： The international journal Advances in Data Analysis and Classification (ADAC) is designed as a forum for high standard publications on research and applications concerning the extraction of knowable aspects from many types of data. It publishes articles on such topics as structural, quantitative, or statistical approaches for the analysis of data; advances in classification, clustering, and pattern recognition methods; strategies for modeling complex data and mining large data sets; methods for the extraction of knowledge from data, and applications of advanced methods in specific domains of practice. Articles illustrate how new domain-specific knowledge can be made available from data by skillful use of data analysis methods. The journal also publishes survey papers that outline, and illuminate the basic ideas and techniques of special approaches.