Model-based clustering in simple hypergraphs through a stochastic blockmodel

IF 0.8 | Zone 4, Mathematics | JCR Q3, STATISTICS & PROBABILITY | Scandinavian Journal of Statistics | Pub Date: 2024-09-18 | DOI: 10.1111/sjos.12754
Luca Brusa, Catherine Matias
Citations: 0

Abstract

We propose a model to address the overlooked problem of node clustering in simple hypergraphs. Simple hypergraphs are suitable when a node may not appear multiple times in the same hyperedge, such as in co-authorship datasets. Our model generalizes the stochastic blockmodel for graphs: it assumes the existence of latent node groups and that hyperedges are conditionally independent given these groups. We first establish the generic identifiability of the model parameters. We then develop a variational approximation Expectation-Maximization algorithm for parameter inference and node clustering, and derive a statistical criterion for model selection. To illustrate the performance of our R package HyperSBM, we compare it with other node clustering methods using synthetic data generated from the model, as well as from a line clustering experiment and a co-authorship dataset.
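The generative idea in the abstract (latent node groups, with hyperedges drawn independently given those groups) can be illustrated with a small simulation. This is only a sketch under stated assumptions and is not the HyperSBM package or the authors' code: the function names `sample_simple_hypergraph` and `assortative`, the probability values, the maximum hyperedge size, and the choice to let the hyperedge probability depend only on the multiset of group labels are all hypothetical choices made for illustration.

```python
import itertools
import random


def sample_simple_hypergraph(n_nodes, group_probs, edge_prob, max_size, seed=0):
    """Draw latent groups, then draw each candidate hyperedge independently.

    edge_prob is a hypothetical user-supplied function that maps the sorted
    tuple of group labels of the candidate nodes to a probability in [0, 1].
    """
    rng = random.Random(seed)
    labels = list(range(len(group_probs)))
    # Latent group of each node, drawn independently from group_probs.
    groups = rng.choices(labels, weights=group_probs, k=n_nodes)
    hyperedges = []
    for size in range(2, max_size + 1):
        # combinations() only yields sets of distinct nodes, so no node can
        # appear twice in a hyperedge: the hypergraph is simple.
        for nodes in itertools.combinations(range(n_nodes), size):
            key = tuple(sorted(groups[i] for i in nodes))
            # Given the groups, each hyperedge is an independent Bernoulli draw.
            if rng.random() < edge_prob(key):
                hyperedges.append(nodes)
    return groups, hyperedges


def assortative(key, p_in=0.3, p_out=0.02):
    """Illustrative probabilities: higher when all nodes share one group."""
    return p_in if len(set(key)) == 1 else p_out


groups, hyperedges = sample_simple_hypergraph(
    12, [0.5, 0.5], assortative, max_size=3
)
print(groups)
print(len(hyperedges), "hyperedges of size 2 or 3")
```

Enumerating every candidate node subset is only practical for small node counts and small hyperedge sizes; it is used here because it makes the conditional independence assumption explicit.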
Source journal
Scandinavian Journal of Statistics (Mathematics: Statistics & Probability)
CiteScore: 1.80
Self-citation rate: 0.00%
Articles published: 61
Review time: 6-12 weeks
Journal description: The Scandinavian Journal of Statistics is internationally recognised as one of the leading statistical journals in the world. It was founded in 1974 by four Scandinavian statistical societies. Today more than eighty per cent of the manuscripts are submitted from outside Scandinavia. It is an international journal devoted to reporting significant and innovative original contributions to statistical methodology, both theory and applications. The journal specializes in statistical modelling showing particular appreciation of the underlying substantive research problems. The emergence of specialized methods for analysing longitudinal and spatial data is just one example of an area of important methodological development in which the Scandinavian Journal of Statistics has a particular niche.
Latest articles in this journal
Model-based clustering in simple hypergraphs through a stochastic blockmodel
Some approximations to the path formula for some nonlinear models
Tobit models for count time series
On some publications of Sir David Cox
Looking back: Selected contributions by C. R. Rao to multivariate analysis