GraphIdx: An efficient indexing technique for accelerating graph data mining
Mostofa Kamal Rasel, Mohammad Rezwanul Huq, Mohammad Arifuzzaman
Software Impacts, vol. 20, Article 100632. Published 2024-03-25. DOI: 10.1016/j.simpa.2024.100632
https://www.sciencedirect.com/science/article/pii/S2665963824000204
Citation count: 0
Abstract
Many graph mining algorithms process large graphs in several passes and suffer from a high I/O cost. GraphIdx, an open-source C library, provides memory-efficient indexing of large graphs to reduce that cost. GraphIdx indexes a block of graph data for a set of nodes based on an empirical evaluation of edges. With the indexed graph, graph mining algorithms can access and process only the related nodes and their edges instead of scanning the entire graph. As a result, the number of I/Os is significantly reduced. Moreover, GraphIdx-enabled algorithms can process graphs in parallel thanks to the indexed data.
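
The general idea of block-level indexing can be illustrated with a minimal sketch. This is not the GraphIdx API or its actual indexing strategy, and it is written in Python rather than C for brevity. The record layout, the fixed `block_size` grouping of node IDs, and the function names `write_edges`, `build_block_index`, and `neighbors` are all assumptions made for illustration. Edges are stored sorted by source node, a single pass records where each block of nodes starts, and a lookup then reads only that block.

```python
import io
import struct

# Hypothetical record layout: one edge = (source, target) as two uint32 values.
EDGE = struct.Struct("<II")


def write_edges(edges):
    """Serialize edges, sorted by source node, into an in-memory 'file'."""
    buf = io.BytesIO()
    for src, dst in sorted(edges):
        buf.write(EDGE.pack(src, dst))
    buf.seek(0)
    return buf


def build_block_index(buf, block_size):
    """Scan the file once and map block id -> (byte offset, edge count).

    Block b covers source nodes b*block_size .. (b+1)*block_size - 1.
    Because edges are sorted by source, each block is contiguous on disk.
    """
    index = {}
    buf.seek(0)
    offset = 0
    while True:
        record = buf.read(EDGE.size)
        if not record:
            break
        src, _ = EDGE.unpack(record)
        block = src // block_size
        start, count = index.get(block, (offset, 0))
        index[block] = (start, count + 1)
        offset += EDGE.size
    return index


def neighbors(buf, index, block_size, node):
    """Read only the block containing `node`; return (targets, records_read)."""
    entry = index.get(node // block_size)
    if entry is None:
        return [], 0
    start, count = entry
    buf.seek(start)
    data = buf.read(count * EDGE.size)
    targets = [dst for src, dst in EDGE.iter_unpack(data) if src == node]
    return targets, count


edges = [(0, 1), (0, 2), (1, 2), (4, 5), (5, 0), (9, 4)]
graph_file = write_edges(edges)
index = build_block_index(graph_file, block_size=2)
targets, records_read = neighbors(graph_file, index, 2, node=5)
print(targets, records_read)  # [0] 2: only two of the six records were read
```

Because each block is an independent byte range, separate workers could in principle process different blocks concurrently, which is the property the abstract attributes to the indexed data.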