Fast RankCIus Algorithm via Dynamic Rank Score Tracking on Bi-type Information Networks

Kotaro Yamazaki, Shohei Matsugu, Hiroaki Shiokawa, H. Kitagawa
{"title":"Fast RankCIus Algorithm via Dynamic Rank Score Tracking on Bi-type Information Networks","authors":"Kotaro Yamazaki, Shohei Matsugu, Hiroaki Shiokawa, H. Kitagawa","doi":"10.1145/3366030.3366051","DOIUrl":null,"url":null,"abstract":"Given a bi-type information network, which is an extended model of well-known bipartite graphs, how can clusters be efficiently found in graphs? Graph clustering is now a fundamental tool to understand overviews from graph-structured data. The RankClus framework accurately performs clustering for bi-type information networks using ranking-based graph clustering techniques. It integrates a graph ranking algorithms such as PageRank or HITS into graph clustering procedures to improve the clustering quality. However, this integration incurs a high computational cost to handle large bi-type information networks since RankClus repeatedly computes the ranking algorithm for all nodes and edges until the clustering procedure converges. To overcome this runtime limitation, herein we present a novel RankClus algorithm that reduces the running time for large bi-type information networks. Our proposed method employs dynamic graph processing techniques into the ranking procedures included in RankClus. By dynamically updating ranking results, our proposal reduces the number of computed nodes and edges during repeated ranking procedures. We experimentally verify using real-world datasets that our proposed method successfully reduces the running time while maintaining the clustering quality of RankClus.","PeriodicalId":446280,"journal":{"name":"Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3366030.3366051","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Given a bi-type information network, which is an extended model of well-known bipartite graphs, how can clusters be efficiently found in graphs? Graph clustering is now a fundamental tool to understand overviews from graph-structured data. The RankClus framework accurately performs clustering for bi-type information networks using ranking-based graph clustering techniques. It integrates a graph ranking algorithms such as PageRank or HITS into graph clustering procedures to improve the clustering quality. However, this integration incurs a high computational cost to handle large bi-type information networks since RankClus repeatedly computes the ranking algorithm for all nodes and edges until the clustering procedure converges. To overcome this runtime limitation, herein we present a novel RankClus algorithm that reduces the running time for large bi-type information networks. Our proposed method employs dynamic graph processing techniques into the ranking procedures included in RankClus. By dynamically updating ranking results, our proposal reduces the number of computed nodes and edges during repeated ranking procedures. We experimentally verify using real-world datasets that our proposed method successfully reduces the running time while maintaining the clustering quality of RankClus.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
基于动态排名分数跟踪的双类型信息网络快速RankCIus算法
给定一个双型信息网络,即众所周知的二部图的扩展模型,如何有效地在图中找到聚类?图聚类现在是理解图结构数据概述的基本工具。RankClus框架使用基于排名的图聚类技术准确地执行双类型信息网络的聚类。它将PageRank或HITS等图排序算法集成到图聚类过程中,以提高聚类质量。然而,由于RankClus反复计算所有节点和边的排序算法,直到聚类过程收敛,因此这种集成在处理大型双类型信息网络时会产生很高的计算成本。为了克服这种运行时间限制,本文提出了一种新的RankClus算法,该算法减少了大型双类型信息网络的运行时间。我们提出的方法将动态图处理技术应用到RankClus中的排序过程中。通过动态更新排序结果,我们的建议减少了在重复排序过程中计算节点和边的数量。我们使用真实的数据集进行实验验证,我们提出的方法成功地减少了运行时间,同时保持了RankClus的聚类质量。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
A Crawling Method with No Parameters for Geo-social Data based on Road Maps PLDSD Fake News Classification Based on Subjective Language Computing Ranges for Temporal Parameters of Composed Web Services Microbiological Water Quality Test Results Extraction from Mobile Photographs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1