A Study on Selecting Bitmap Join Index to Speed up Complex Queries in Relational Data Warehouses

Hyoung-Geun An, Jae-Jin Koh
Journal: The KIPS Transactions: Part D (Journal Article)
Publication date: 2012-02-29
DOI: 10.3745/KIPSTD.2012.19D.1.001 (https://doi.org/10.3745/KIPSTD.2012.19D.1.001)

Abstract

Because data warehouses are large, the choice of indices strongly affects query processing efficiency. Indices lower query processing cost, but they occupy substantial storage and incur maintenance costs whenever the database is updated. Bitmap join indices are well suited to optimizing star join queries, which join a fact table with many dimension tables and apply selections on the dimension tables. Although the binary representation of bitmap join indices keeps their storage cost low, selecting the attributes to index from the very large set of generated candidates is difficult. Index selection proceeds in two steps: first reduce the number of candidate attributes, then select the indexing attributes from among those that remain. In this paper, we address the bitmap join index selection problem by reducing the number of candidate attributes with data mining techniques. Whereas existing techniques prune candidates using attribute frequencies alone, we consider attribute frequencies together with the size of the dimension tables, the size of their tuples, and the disk page size. We mine frequent itemsets to prune the large number of candidate attributes. We then apply cost functions to the bitmap join indices of the remaining candidates and build the indices that have the lowest cost and the smallest storage footprint within the storage constraints. Finally, we compare our technique with existing ones and analyze the results to evaluate its efficiency.
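
The two-step process described in the abstract (prune candidates by frequent itemset mining, then select under a storage constraint) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give their cost functions, so the scoring rule (support weighted by dimension-table pages), the greedy selection, the index size estimate (one bitmap of `fact_rows` bits per distinct attribute value), and every name below are assumptions made for illustration only.

```python
from itertools import combinations


def frequent_itemsets(workload, min_support):
    """Brute-force frequent itemset mining over a small query workload.

    workload: list of sets, each holding the dimension attributes one query selects on.
    Returns {frozenset(attributes): support} for itemsets with support >= min_support.
    """
    n = len(workload)
    counts = {}
    for attrs in workload:
        for k in range(1, len(attrs) + 1):
            for combo in combinations(sorted(attrs), k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def dimension_pages(rows, tuple_bytes, page_bytes):
    # Disk pages occupied by a dimension table (ceiling division).
    return (rows * tuple_bytes + page_bytes - 1) // page_bytes


def bitmap_index_bytes(cardinality, fact_rows):
    # Assumed size estimate: one bit per fact row for each distinct value.
    return (cardinality * fact_rows + 7) // 8


def select_attributes(workload, stats, fact_rows, page_bytes, min_support, budget_bytes):
    """Greedy selection of indexing attributes under a storage budget.

    Hypothetical scoring: support of the attribute times the number of pages
    of its dimension table, so frequent attributes of large tables rank first.
    """
    frequent = frequent_itemsets(workload, min_support)
    candidates = set().union(*frequent)  # attributes that occur in any frequent itemset

    def score(attr):
        s = stats[attr]
        pages = dimension_pages(s["rows"], s["tuple_bytes"], page_bytes)
        return frequent[frozenset([attr])] * pages

    chosen, used = [], 0
    for attr in sorted(candidates, key=lambda a: (-score(a), a)):
        size = bitmap_index_bytes(stats[attr]["cardinality"], fact_rows)
        if used + size <= budget_bytes:
            chosen.append(attr)
            used += size
    return chosen, used


# Made-up workload and table statistics, for illustration only.
workload = [{"city", "month"}, {"city", "category"}, {"city", "month"}, {"gender"}]
stats = {
    "city": {"rows": 10_000, "tuple_bytes": 100, "cardinality": 50},
    "month": {"rows": 365, "tuple_bytes": 40, "cardinality": 12},
    "category": {"rows": 200, "tuple_bytes": 60, "cardinality": 20},
    "gender": {"rows": 10_000, "tuple_bytes": 100, "cardinality": 2},
}
result = select_attributes(workload, stats, fact_rows=1_000_000, page_bytes=8192,
                           min_support=0.5, budget_bytes=7_000_000)
print(result)
```

With a minimum support of 0.5, only `city` and `month` survive pruning; under the 7,000,000-byte budget the sketch keeps `city` and skips `month`, whose index would exceed the remaining space. A real implementation would replace the scoring rule with the paper's cost functions and would use a proper mining algorithm such as Apriori instead of brute-force enumeration.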