A predictable storage model for scalable parallel DW
J. Costa, J. Cecílio, P. Martins, P. Furtado
Proceedings. International Database Engineering and Applications Symposium, pp. 26-33, 2011-09-21
DOI: 10.1145/2076623.2076628 (https://doi.org/10.1145/2076623.2076628)
Citations: 5
Abstract
The star schema model has been widely used as the de facto DW storage organization on RDBMSs. Business measures are stored in a central fact table along with a set of foreign keys referencing dimension tables. While this storage organization offers a good trade-off between storage size and performance on a single node, it does not scale predictably in shared-nothing parallel architectures. Although fact tables can be linearly partitioned among nodes, the same does not apply to dimensions, which unbalances (increases) the dimensions/fact_table size ratio and consequently limits the number of parallel nodes. In this paper we propose and evaluate a parallel DW storage model that overcomes these limitations and delivers optimal speed-up and scale-up capabilities with top efficiency. We use the TPC-H benchmark to evaluate the scalability and efficiency of the proposed model.
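
The ratio argument in the abstract can be illustrated with a small arithmetic sketch. It assumes the fact table is split evenly across nodes while every node keeps a full copy of the dimensions; the abstract says only that dimensions cannot be linearly partitioned, so full replication is an assumption here. The row counts and the function name `per_node_ratio` are hypothetical and are not taken from the paper or from TPC-H.

```python
def per_node_ratio(fact_rows: int, dim_rows: int, nodes: int) -> float:
    """Dimension rows per fact row stored on a single node.

    Assumption (for illustration only): the fact table is partitioned
    evenly across nodes, while the dimensions are fully replicated on
    every node.
    """
    fact_rows_per_node = fact_rows / nodes
    return dim_rows / fact_rows_per_node


# Hypothetical sizes, chosen only to keep the arithmetic easy to follow.
FACT_ROWS = 6_000_000
DIM_ROWS = 300_000

for n in (1, 10, 100):
    # The ratio grows linearly with the node count under this assumption.
    print(n, per_node_ratio(FACT_ROWS, DIM_ROWS, n))
```

Under these assumptions the per-node ratio grows in proportion to the number of nodes, so at some node count the replicated dimensions outweigh the local fact partition. This is one way to read the scaling limit the abstract describes.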