A Data Reusing Strategy Based on Column-Stores
Mei Wang, Jiaoling Zhou, Yue Li, Xiaoling Xia, Jiajin Le
2013 IEEE 11th International Conference on Dependable, Autonomic and Secure Computing. Published 2013-12-21.
DOI: 10.1109/DASC.2013.56 (https://doi.org/10.1109/DASC.2013.56)
Data reuse is an important way to save storage capacity and improve query efficiency when managing massive data. A column-store architecture stores the values of each column contiguously, which greatly improves the performance of read-optimized applications and also makes data reuse more feasible and flexible. In this paper, we propose a novel reuse method for column-store data warehouses. First, we propose an improved iMAP method, based on schema mapping, that generates as many candidate reusable columns as possible and then filters these candidates further, which greatly reduces the complexity of detecting reusable data. Building on the column-store architecture, we then propose an implementation of reuse at the storage layer. Finally, we provide a method for executing queries over the reusable data. Experimental results on real data sets indicate that the presented strategy reduces both storage space and query execution time.
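The pipeline described in the abstract (generate candidates, filter them, then share storage and answer reads through the shared copy) can be illustrated with a minimal sketch. It is not the paper's improved iMAP method, whose details the abstract does not give: it detects only exact duplicate columns, and every name in it (`signature`, `find_reusable`, `ColumnStore`, the sample tables) is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Column = List[object]        # one column, stored as a contiguous list of values
ColumnId = Tuple[str, str]   # (table name, column name); hypothetical naming


def signature(values: Column) -> tuple:
    """Cheap summary (type, row count, distinct count) used to prune pairs.

    Assumes the values are hashable scalars.
    """
    first_type = type(values[0]).__name__ if values else None
    return (first_type, len(values), len(set(values)))


def find_reusable(columns: Dict[ColumnId, Column]) -> Dict[ColumnId, ColumnId]:
    """Map each duplicate column to the first column holding the same values."""
    # Candidate generation: only columns with equal signatures can be identical.
    buckets = defaultdict(list)
    for col_id, values in columns.items():
        buckets[signature(values)].append(col_id)

    alias: Dict[ColumnId, ColumnId] = {}
    for candidates in buckets.values():
        kept: List[ColumnId] = []   # columns that keep their own storage
        for col_id in candidates:
            for owner in kept:
                # Filter step: a full comparison confirms or rejects the candidate.
                if columns[col_id] == columns[owner]:
                    alias[col_id] = owner
                    break
            else:
                kept.append(col_id)
    return alias


class ColumnStore:
    """Keeps one physical copy per distinct column and resolves aliases on read."""

    def __init__(self, columns: Dict[ColumnId, Column]) -> None:
        self.alias = find_reusable(columns)
        self.physical = {c: v for c, v in columns.items() if c not in self.alias}

    def read(self, col_id: ColumnId) -> Column:
        # A read of a reused column is redirected to the column that owns the data.
        return self.physical[self.alias.get(col_id, col_id)]

    def stored_values(self) -> int:
        return sum(len(v) for v in self.physical.values())


# Illustrative data: two tables share a customer-id column under different names.
columns = {
    ("orders", "customer_id"): [1, 2, 2, 3],
    ("invoices", "cust_id"): [1, 2, 2, 3],
    ("orders", "amount"): [10, 20, 20, 30],
}
store = ColumnStore(columns)
```

In this sample, `("orders", "amount")` has the same signature as the customer-id columns, so it survives candidate generation and is rejected only by the full comparison. That is the point of the two-stage design: the cheap summary narrows the pairs to compare, and the exact check keeps false candidates from being merged.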