Branch Code: A Labeling Scheme for Efficient Query Answering on Trees

Yanghua Xiao, Ji Hong, Wanyun Cui, Zhenying He, Wei Wang, Guodong Feng
{"title":"Branch Code: A Labeling Scheme for Efficient Query Answering on Trees","authors":"Yanghua Xiao, Ji Hong, Wanyun Cui, Zhenying He, Wei Wang, Guodong Feng","doi":"10.1109/ICDE.2012.71","DOIUrl":null,"url":null,"abstract":"Labeling schemes lie at the core of query processing for many tree-structured data such as XML data that is flooding the web. A labeling scheme that can simultaneously and efficiently support various relationship queries on trees (such as parent/children, descendant/ancestor, etc.), computation of lowest common ancestors (LCA) and update of trees, is desired for effective and efficient management of tree-structured data. Although a variety of labeling schemes such as prefix-based labeling, interval-based labeling and prime-based labeling as well as their variants have been available to us for encoding static and dynamic trees, these labeling schemes usually show weakness in one aspect or another. In this paper, we propose an integer-based labeling scheme branch code as well as its compressed version as our major solution to simultaneously support efficient query processing on both static and dynamic ordered trees with affordable storage cost. The proposed branch code can answer common queries on ordered trees in constant time, which comes at the cost of consuming O(N log N) storage. To reduce storage cost to O(N), a compressed branch code is further developed. We also give a relationship determination algorithm purely using compressed branch code, which is of quite low possibility to produce false positive results as verified by experimental results. With the support of splay trees, branch code can also support dynamic trees so that updates and queries can be implemented with O(log N) amortized cost. All the results above are either theoretically proved or verified by experimental studies.","PeriodicalId":321608,"journal":{"name":"2012 IEEE 28th International Conference on Data Engineering","volume":"67 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE 28th International Conference on Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDE.2012.71","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

Abstract

Labeling schemes lie at the core of query processing for many tree-structured data such as XML data that is flooding the web. A labeling scheme that can simultaneously and efficiently support various relationship queries on trees (such as parent/children, descendant/ancestor, etc.), computation of lowest common ancestors (LCA) and update of trees, is desired for effective and efficient management of tree-structured data. Although a variety of labeling schemes such as prefix-based labeling, interval-based labeling and prime-based labeling as well as their variants have been available to us for encoding static and dynamic trees, these labeling schemes usually show weakness in one aspect or another. In this paper, we propose an integer-based labeling scheme branch code as well as its compressed version as our major solution to simultaneously support efficient query processing on both static and dynamic ordered trees with affordable storage cost. The proposed branch code can answer common queries on ordered trees in constant time, which comes at the cost of consuming O(N log N) storage. To reduce storage cost to O(N), a compressed branch code is further developed. We also give a relationship determination algorithm purely using compressed branch code, which is of quite low possibility to produce false positive results as verified by experimental results. With the support of splay trees, branch code can also support dynamic trees so that updates and queries can be implemented with O(log N) amortized cost. All the results above are either theoretically proved or verified by experimental studies.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
分支代码:树上高效查询应答的标记方案
标记方案是许多树状结构数据(如web上泛滥的XML数据)查询处理的核心。为了有效地管理树状结构数据,需要一种能够同时有效地支持树上各种关系查询(如父/子、后代/祖先等)、最低共同祖先(LCA)计算和树的更新的标记方案。尽管各种各样的标记方案,如基于前缀的标记、基于间隔的标记和基于素数的标记及其变体,已经为我们提供了用于编码静态和动态树的方法,但这些标记方案通常在一个或另一个方面显示出弱点。在本文中,我们提出了一个基于整数的标记方案分支代码及其压缩版本作为我们的主要解决方案,同时支持在静态和动态有序树上高效的查询处理,并且存储成本可承受。所建议的分支代码可以在常量时间内回答对有序树的常见查询,这是以消耗O(N log N)存储为代价的。为了将存储成本降低到0 (N),进一步开发了压缩分支代码。我们还给出了一种纯使用压缩分支代码的关系确定算法,实验结果证明该算法产生假阳性结果的可能性很小。有了展开树的支持,分支代码也可以支持动态树,这样更新和查询就可以以O(log N)平摊代价实现。以上结果都得到了理论证明和实验研究的验证。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Keyword Query Reformulation on Structured Data Accuracy-Aware Uncertain Stream Databases Extracting Analyzing and Visualizing Triangle K-Core Motifs within Networks Project Daytona: Data Analytics as a Cloud Service Automatic Extraction of Structured Web Data with Domain Knowledge
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1