Title: Data partitioning over data streams based on change-aware sampling
Authors: Yongli Wang, Hong-bing Xu, Yisheng Dong, Xue-jun Liu, Jiang-bo Qian
Venue: IEEE International Conference on e-Business Engineering (ICEBE'05)
Published: 2005-10-12
DOI: 10.1109/ICEBE.2005.47 (https://doi.org/10.1109/ICEBE.2005.47)
Citation count: 0
Abstract
A novel data partitioning method is proposed for a distributed parallel stream processing system in the power industry. The method first uses a change-aware sampling algorithm, which can guarantee low error, to describe the distribution characteristics of the data values. It then uses an improved heuristic algorithm for constructing equal-depth histograms to generate an approximate partition vector efficiently. Experimental results on real data show that the proposed method is efficient, practical, and suitable for processing time-varying data streams.
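
The two-stage idea in the abstract (sample the stream, then derive a partition vector from an equal-depth histogram of the sample) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify the change-aware sampling or the improved heuristic, so plain reservoir sampling and a sort-based quantile split stand in for them here. The function names `reservoir_sample`, `equal_depth_boundaries`, and `assign_partition` are hypothetical.

```python
import random
from bisect import bisect_right


def reservoir_sample(stream, k, rng):
    """Keep a uniform random sample of k items from a stream (Algorithm R).

    Assumption: this is a stand-in for the paper's change-aware sampling,
    whose details are not given in the abstract.
    """
    sample = []
    for i, value in enumerate(stream):
        if i < k:
            sample.append(value)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = value
    return sample


def equal_depth_boundaries(sample, num_partitions):
    """Return num_partitions - 1 split points taken at equal-count positions.

    Assumption: a simple sort-based construction, not the improved
    heuristic the abstract refers to.
    """
    if not sample:
        raise ValueError("sample must not be empty")
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[(i * n) // num_partitions] for i in range(1, num_partitions)]


def assign_partition(value, boundaries):
    """Map a value to a partition index using the partition vector."""
    return bisect_right(boundaries, value)


# Usage: sample a synthetic stream, build the vector, route a value.
rng = random.Random(0)
sample = reservoir_sample(range(10_000), 200, rng)
boundaries = equal_depth_boundaries(sample, 4)
partition = assign_partition(1234, boundaries)
```

Because the boundaries come from the sample rather than the full stream, the resulting partitions are only approximately balanced, and for a time-varying stream the vector would need to be rebuilt as the sample changes.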