ActiveReach: an active learning framework for approximate reachability query answering in large-scale graphs.

Frontiers in Big Data (IF 2.4; JCR Q3, Computer Science, Information Systems). Pub Date: 2024-11-19. eCollection Date: 2024-01-01. DOI: 10.3389/fdata.2024.1427104
Zohreh Raghebi, Farnoush Banaei-Kashani
Volume 7, article 1427104. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11611874/pdf/

Abstract

A graph reachability query asks whether a path exists between two query vertices in a given graph. Existing reachability query processing solutions rely on traditional reachability index structures and can only compute exact answers, which may take a long time to resolve in large graphs. An approximate reachability query, in contrast, offers a compromise: users can trade query time against the accuracy of the query result. In this study, we propose a framework, dubbed ActiveReach, for learning index structures that answer approximate reachability queries. ActiveReach is a two-phase framework that embeds nodes in a reachability space. In the first phase, we leverage node attributes and positional information to create a reachability-aware embedding for each node. In the second phase, these embeddings serve as the nodes' attributes, and reachability information is included as labels in the training data to generate embeddings in a reachability space. Computing reachability for all training data may not be practical, however, and selecting a subset of the data for which to compute reachability, while still improving reachability prediction performance, is challenging. ActiveReach addresses this challenge by employing an active learning approach in the second phase that selectively computes reachability for a subset of node pairs and from it learns approximate reachability for the entire graph. Our extensive experimental study on a variety of real, attributed, large-scale graphs demonstrates the effectiveness of each component of the framework.
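
The idea of computing exact reachability only for a selected subset of node pairs can be illustrated with a minimal Python sketch. This is not the authors' implementation. The logistic-regression scorer, the uncertainty-sampling selection rule, the `featurize` function, the toy chain graph, and every parameter name below are assumptions made for illustration, and the hand-made pair features stand in for the learned embeddings the abstract describes.

```python
import math
import random
from collections import deque

def reachable(adj, src, dst):
    """Exact reachability by breadth-first search (the costly labeling oracle)."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def predict(weights, features):
    """Logistic score: estimated probability that the pair is reachable."""
    z = sum(w * x for w, x in zip(weights, features))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))

def train(labeled, dims, epochs=200, lr=0.5):
    """Fit logistic-regression weights with plain stochastic gradient descent."""
    weights = [0.0] * dims
    for _ in range(epochs):
        for features, label in labeled:
            err = predict(weights, features) - label
            weights = [w - lr * err * x for w, x in zip(weights, features)]
    return weights

def active_learning(adj, pairs, featurize, seed_size=4, rounds=3, batch=2, seed=0):
    """Label a small random seed set, then repeatedly label the most uncertain pairs."""
    rng = random.Random(seed)
    pool = list(pairs)
    rng.shuffle(pool)

    def label(u, v):
        return featurize(u, v), float(reachable(adj, u, v))

    labeled = [label(u, v) for u, v in pool[:seed_size]]
    pool = pool[seed_size:]
    dims = len(labeled[0][0])
    weights = train(labeled, dims)
    for _ in range(rounds):
        if not pool:
            break
        # Uncertainty sampling (assumed rule): scores closest to 0.5 come first.
        pool.sort(key=lambda p: abs(predict(weights, featurize(*p)) - 0.5))
        chosen, pool = pool[:batch], pool[batch:]
        labeled += [label(u, v) for u, v in chosen]
        weights = train(labeled, dims)
    return weights, len(labeled)

# Toy usage: a directed chain 0 -> 1 -> ... -> 5 with a hypothetical
# positional feature (bias term, normalized position difference).
adj = {i: [i + 1] for i in range(5)}
adj[5] = []
pairs = [(u, v) for u in range(6) for v in range(6) if u != v]
featurize = lambda u, v: (1.0, (v - u) / 5.0)
weights, n_labeled = active_learning(adj, pairs, featurize)
```

In this sketch only `n_labeled` of the 30 candidate pairs are ever passed to the exact `reachable` oracle; every other pair would be answered approximately by `predict`, which is the query-time versus accuracy trade-off the abstract refers to.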

Source journal metrics: CiteScore 5.20; self-citation rate 3.20%; articles published: 122; review time: 13 weeks.