Jeffrey Considine, Feifei Li, G. Kollios, J. Byers
{"title":"Approximate aggregation techniques for sensor databases","authors":"Jeffrey Considine, Feifei Li, G. Kollios, J. Byers","doi":"10.1109/ICDE.2004.1320018","DOIUrl":null,"url":null,"abstract":"In the emerging area of sensor-based systems, a significant challenge is to develop scalable, fault-tolerant methods to extract useful information from the data the sensors collect. An approach to this data management problem is the use of sensor database systems, exemplified by TinyDB and Cougar, which allow users to perform aggregation queries such as MIN, COUNT and AVG on a sensor network. Due to power and range constraints, centralized approaches are generally impractical, so most systems use in-network aggregation to reduce network traffic. However, these aggregation strategies become bandwidth-intensive when combined with the fault-tolerant, multipath routing methods often used in these environments. For example, duplicate-sensitive aggregates such as SUM cannot be computed exactly using substantially less bandwidth than explicit enumeration. To avoid this expense, we investigate the use of approximate in-network aggregation using small sketches. Our contributions are as follows: 1) we generalize well known duplicate-insensitive sketches for approximating COUNT to handle SUM, 2) we present and analyze methods for using sketches to produce accurate results with low communication and computation overhead, and 3) we present an extensive experimental validation of our methods.","PeriodicalId":358862,"journal":{"name":"Proceedings. 20th International Conference on Data Engineering","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2004-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"628","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. 20th International Conference on Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDE.2004.1320018","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 628
Abstract
In the emerging area of sensor-based systems, a significant challenge is to develop scalable, fault-tolerant methods to extract useful information from the data the sensors collect. An approach to this data management problem is the use of sensor database systems, exemplified by TinyDB and Cougar, which allow users to perform aggregation queries such as MIN, COUNT and AVG on a sensor network. Due to power and range constraints, centralized approaches are generally impractical, so most systems use in-network aggregation to reduce network traffic. However, these aggregation strategies become bandwidth-intensive when combined with the fault-tolerant, multipath routing methods often used in these environments. For example, duplicate-sensitive aggregates such as SUM cannot be computed exactly using substantially less bandwidth than explicit enumeration. To avoid this expense, we investigate the use of approximate in-network aggregation using small sketches. Our contributions are as follows: 1) we generalize well known duplicate-insensitive sketches for approximating COUNT to handle SUM, 2) we present and analyze methods for using sketches to produce accurate results with low communication and computation overhead, and 3) we present an extensive experimental validation of our methods.