{"title":"分布式网络中的高效窗口聚合","authors":"W. Yue, Lawrence Benson, T. Rabl","doi":"10.48786/edbt.2023.52","DOIUrl":null,"url":null,"abstract":"Stream processing is widely applied in industry as well as in research to process unbounded data streams. In many use cases, specific data streams are processed by multiple continuous queries. Current systems group events of an unbounded data stream into bounded windows to produce results of individual queries in a timely fashion. For multiple concurrent queries, multiple concurrent and usually overlapping windows are generated. To reduce redundant computations and share partial results, state-of-the-art solutions divide windows into slices and then share the results of those slices. However, this is only applicable for queries with the same aggregation function and window measure, as in the case of overlaps for sliding windows. For multiple queries on the same stream with different aggregation functions and window measures, partial results cannot be shared. Furthermore, data streams are produced from devices that are distributed in large decentralized networks. Current systems cannot process queries on decentralized data streams efficiently. All queries in a decentralized network are either computed centrally or processed individually without exploiting partial results across queries. We present Desis, a stream processing system that can efficiently process multiple stream aggregation queries. We propose an aggregation engine that can share partial results between multiple queries with different window types, measures, and aggregation functions. In decentralized networks, Desis moves computation to data sources and shares overlapping computation as early as possible between queries. Desis outperforms existing solutions by orders of magnitude in throughput when processing multiple queries and can scale to millions of queries. In a decentralized setup, Desis can save up to 99% of network traffic and scale performance linearly.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"2 1","pages":"618-631"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Desis: Efficient Window Aggregation in Decentralized Networks\",\"authors\":\"W. Yue, Lawrence Benson, T. Rabl\",\"doi\":\"10.48786/edbt.2023.52\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Stream processing is widely applied in industry as well as in research to process unbounded data streams. In many use cases, specific data streams are processed by multiple continuous queries. Current systems group events of an unbounded data stream into bounded windows to produce results of individual queries in a timely fashion. For multiple concurrent queries, multiple concurrent and usually overlapping windows are generated. To reduce redundant computations and share partial results, state-of-the-art solutions divide windows into slices and then share the results of those slices. However, this is only applicable for queries with the same aggregation function and window measure, as in the case of overlaps for sliding windows. For multiple queries on the same stream with different aggregation functions and window measures, partial results cannot be shared. Furthermore, data streams are produced from devices that are distributed in large decentralized networks. Current systems cannot process queries on decentralized data streams efficiently. All queries in a decentralized network are either computed centrally or processed individually without exploiting partial results across queries. We present Desis, a stream processing system that can efficiently process multiple stream aggregation queries. We propose an aggregation engine that can share partial results between multiple queries with different window types, measures, and aggregation functions. In decentralized networks, Desis moves computation to data sources and shares overlapping computation as early as possible between queries. Desis outperforms existing solutions by orders of magnitude in throughput when processing multiple queries and can scale to millions of queries. In a decentralized setup, Desis can save up to 99% of network traffic and scale performance linearly.\",\"PeriodicalId\":88813,\"journal\":{\"name\":\"Advances in database technology : proceedings. International Conference on Extending Database Technology\",\"volume\":\"2 1\",\"pages\":\"618-631\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Advances in database technology : proceedings. International Conference on Extending Database Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48786/edbt.2023.52\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advances in database technology : proceedings. International Conference on Extending Database Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48786/edbt.2023.52","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Desis: Efficient Window Aggregation in Decentralized Networks
Stream processing is widely applied in industry as well as in research to process unbounded data streams. In many use cases, specific data streams are processed by multiple continuous queries. Current systems group events of an unbounded data stream into bounded windows to produce results of individual queries in a timely fashion. For multiple concurrent queries, multiple concurrent and usually overlapping windows are generated. To reduce redundant computations and share partial results, state-of-the-art solutions divide windows into slices and then share the results of those slices. However, this is only applicable for queries with the same aggregation function and window measure, as in the case of overlaps for sliding windows. For multiple queries on the same stream with different aggregation functions and window measures, partial results cannot be shared. Furthermore, data streams are produced from devices that are distributed in large decentralized networks. Current systems cannot process queries on decentralized data streams efficiently. All queries in a decentralized network are either computed centrally or processed individually without exploiting partial results across queries. We present Desis, a stream processing system that can efficiently process multiple stream aggregation queries. We propose an aggregation engine that can share partial results between multiple queries with different window types, measures, and aggregation functions. In decentralized networks, Desis moves computation to data sources and shares overlapping computation as early as possible between queries. Desis outperforms existing solutions by orders of magnitude in throughput when processing multiple queries and can scale to millions of queries. In a decentralized setup, Desis can save up to 99% of network traffic and scale performance linearly.