A Comparative Analysis of Community Detection Agglomerative Technique Algorithms and Metrics on Citation Network

Sandeep Kumar Rachamadugu, Pushphavathi Thotadara Parameshwarappa
{"title":"A Comparative Analysis of Community Detection Agglomerative Technique Algorithms and Metrics on Citation Network","authors":"Sandeep Kumar Rachamadugu, Pushphavathi Thotadara Parameshwarappa","doi":"10.33166/aetic.2023.04.001","DOIUrl":null,"url":null,"abstract":"Social Network Analysis is a discipline that represents social relationships as a network of nodes and edges. The construction of social network with clusters will contribute in sharing the common characteristics or behaviour of a group. Partitioning the graph into modules is said to be a community. Communities are meant to symbolize actual social groups that share common characteristics. Citation network is one of the social networks with directed graphs where one paper will cite another paper and so on. Citation networks will assist the researcher in choosing research directions and evaluating research impacts. By constructing the citation networks with communities will direct the user to identify the similarity of documents which are interrelated to one or more domains. This paper introduces the agglomerative technique algorithms and metrics to a directed graph which determines the most influential nodes and group of similar nodes. The two stages required to construct the communities are how to generate network with communities and how to quantify the network performance. The strength and a quality of a network is quantified in terms of metrics like modularity, normalized mutual information (NMI), betweenness centrality, and F-Measure. The suitable community detection techniques and metrics for a citation graph were introduced in this paper. In the field of community detection, it is common practice to categorize algorithms according to the mathematical techniques they employ, and then compare them on benchmark graphs featuring a particular type of assortative community structure. The algorithms are applied for a sample citation sub data is extracted from DBLP, ACM, MAG and some additional sources which is taken from and consists of 101 nodes (nc) with 621 edges € and formed 64 communities. The key attributes in dataset are id, title, abstract, references SLM uses local optimisation and scalability to improve community detection in complicated networks. Unlike traditional methods, the proposed LS-SLM algorithm is identified that the modularity is increased by 12.65%, NMI increased by 2.31%, betweenness centrality by 3.18% and F-Score by 4.05%. The SLM algorithm outperforms existing methods in finding significant and well-defined communities, making it a promising community detection breakthrough.","PeriodicalId":36440,"journal":{"name":"Annals of Emerging Technologies in Computing","volume":"149 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Annals of Emerging Technologies in Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.33166/aetic.2023.04.001","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Computer Science","Score":null,"Total":0}
引用次数: 0

Abstract

Social Network Analysis is a discipline that represents social relationships as a network of nodes and edges. The construction of social network with clusters will contribute in sharing the common characteristics or behaviour of a group. Partitioning the graph into modules is said to be a community. Communities are meant to symbolize actual social groups that share common characteristics. Citation network is one of the social networks with directed graphs where one paper will cite another paper and so on. Citation networks will assist the researcher in choosing research directions and evaluating research impacts. By constructing the citation networks with communities will direct the user to identify the similarity of documents which are interrelated to one or more domains. This paper introduces the agglomerative technique algorithms and metrics to a directed graph which determines the most influential nodes and group of similar nodes. The two stages required to construct the communities are how to generate network with communities and how to quantify the network performance. The strength and a quality of a network is quantified in terms of metrics like modularity, normalized mutual information (NMI), betweenness centrality, and F-Measure. The suitable community detection techniques and metrics for a citation graph were introduced in this paper. In the field of community detection, it is common practice to categorize algorithms according to the mathematical techniques they employ, and then compare them on benchmark graphs featuring a particular type of assortative community structure. The algorithms are applied for a sample citation sub data is extracted from DBLP, ACM, MAG and some additional sources which is taken from and consists of 101 nodes (nc) with 621 edges € and formed 64 communities. The key attributes in dataset are id, title, abstract, references SLM uses local optimisation and scalability to improve community detection in complicated networks. Unlike traditional methods, the proposed LS-SLM algorithm is identified that the modularity is increased by 12.65%, NMI increased by 2.31%, betweenness centrality by 3.18% and F-Score by 4.05%. The SLM algorithm outperforms existing methods in finding significant and well-defined communities, making it a promising community detection breakthrough.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
引文网络上社区检测聚合技术、算法和度量的比较分析
社会网络分析是一门将社会关系表示为节点和边缘网络的学科。具有集群的社会网络的构建有助于共享群体的共同特征或行为。将图划分为模块称为社区。社区象征着具有共同特征的实际社会群体。引文网络是一篇论文引用另一篇论文等具有有向图的社交网络。引文网络将有助于研究者选择研究方向和评估研究影响。通过构建带有社区的引文网络,可以指导用户识别与一个或多个领域相关的文献的相似度。本文介绍了有向图的聚类技术、算法和度量,以确定最具影响力的节点和相似节点组。构建社区所需要的两个阶段是如何产生有社区的网络和如何量化网络绩效。网络的强度和质量可以用模块化、标准化互信息(NMI)、中间性中心性和F-Measure等指标来量化。本文介绍了适用于引文图的社区检测技术和指标。在社区检测领域,通常的做法是根据它们使用的数学技术对算法进行分类,然后在具有特定类型的分类社区结构的基准图上对它们进行比较。该算法应用于一个样本引用子数据,该子数据从DBLP、ACM、MAG和一些其他来源中提取,由101个节点(nc)组成,有621条边,形成64个社区。数据集的关键属性是id、title、abstract、references。SLM利用局部优化和可扩展性来提高复杂网络中的社区检测。与传统方法相比,本文提出的LS-SLM算法模块性提高了12.65%,NMI提高了2.31%,中间中心性提高了3.18%,F-Score提高了4.05%。SLM算法在寻找重要且定义良好的社区方面优于现有方法,使其成为有希望的社区检测突破。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Annals of Emerging Technologies in Computing
Annals of Emerging Technologies in Computing Computer Science-Computer Science (all)
CiteScore
3.50
自引率
0.00%
发文量
26
期刊最新文献
The Proposal of Countermeasures for DeepFake Voices on Social Media Considering Waveform and Text Embedding Lightweight Model for Occlusion Removal from Face Images A Torpor-based Enhanced Security Model for CSMA/CA Protocol in Wireless Networks Enhancing Robot Navigation Efficiency Using Cellular Automata with Active Cells Wildfire Prediction in the United States Using Time Series Forecasting Models
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1