Yidan Wang, Z. Tari, Xiaoran Huang, Albert Y. Zomaya
{"title":"一种基于网络感知和分区的数据流处理资源管理方案","authors":"Yidan Wang, Z. Tari, Xiaoran Huang, Albert Y. Zomaya","doi":"10.1145/3337821.3337870","DOIUrl":null,"url":null,"abstract":"With the increasing demand for data-driven decision making, there is an urgent need for processing geographically distributed data streams in real-time. The existing scheduling and resource management schemes efficiently optimize stream processing performance with the awareness of resource, quality-of-service, and network traffic. However, the correlation between network delay and inter-operator communication pattern is not well-understood. In this study, we propose a network-aware and partition-based resource management scheme to deal with the ever-changing network condition and data communication in stream processing. The proposed approach applies operator fusion by considering the computational demand of individual operators and the inter-operator communication patterns. It maps the fused operators to the clustered hosts with the weighted shortest processing time heuristic. Meanwhile, we established a 3-dimensional coordinate system for prompt reflection of the network condition, real-time traffic, and resource availability. We evaluated the proposed approach against two benchmarks, and the results demonstrate the efficiency in throughput and resource utilization. We also conducted a case study and implemented a prototype system supported by the proposed approach that aims to utilize the stream processing paradigm for pedestrian behavior analysis. The prototype application estimates walking time for a given path according to the real crowd traffic. The promising evaluation results of processing performance further illustrate the efficiency of the proposed approach.","PeriodicalId":405273,"journal":{"name":"Proceedings of the 48th International Conference on Parallel Processing","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"A Network-aware and Partition-based Resource Management Scheme for Data Stream Processing\",\"authors\":\"Yidan Wang, Z. Tari, Xiaoran Huang, Albert Y. Zomaya\",\"doi\":\"10.1145/3337821.3337870\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the increasing demand for data-driven decision making, there is an urgent need for processing geographically distributed data streams in real-time. The existing scheduling and resource management schemes efficiently optimize stream processing performance with the awareness of resource, quality-of-service, and network traffic. However, the correlation between network delay and inter-operator communication pattern is not well-understood. In this study, we propose a network-aware and partition-based resource management scheme to deal with the ever-changing network condition and data communication in stream processing. The proposed approach applies operator fusion by considering the computational demand of individual operators and the inter-operator communication patterns. It maps the fused operators to the clustered hosts with the weighted shortest processing time heuristic. Meanwhile, we established a 3-dimensional coordinate system for prompt reflection of the network condition, real-time traffic, and resource availability. We evaluated the proposed approach against two benchmarks, and the results demonstrate the efficiency in throughput and resource utilization. We also conducted a case study and implemented a prototype system supported by the proposed approach that aims to utilize the stream processing paradigm for pedestrian behavior analysis. The prototype application estimates walking time for a given path according to the real crowd traffic. The promising evaluation results of processing performance further illustrate the efficiency of the proposed approach.\",\"PeriodicalId\":405273,\"journal\":{\"name\":\"Proceedings of the 48th International Conference on Parallel Processing\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-08-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 48th International Conference on Parallel Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3337821.3337870\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 48th International Conference on Parallel Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3337821.3337870","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Network-aware and Partition-based Resource Management Scheme for Data Stream Processing
With the increasing demand for data-driven decision making, there is an urgent need for processing geographically distributed data streams in real-time. The existing scheduling and resource management schemes efficiently optimize stream processing performance with the awareness of resource, quality-of-service, and network traffic. However, the correlation between network delay and inter-operator communication pattern is not well-understood. In this study, we propose a network-aware and partition-based resource management scheme to deal with the ever-changing network condition and data communication in stream processing. The proposed approach applies operator fusion by considering the computational demand of individual operators and the inter-operator communication patterns. It maps the fused operators to the clustered hosts with the weighted shortest processing time heuristic. Meanwhile, we established a 3-dimensional coordinate system for prompt reflection of the network condition, real-time traffic, and resource availability. We evaluated the proposed approach against two benchmarks, and the results demonstrate the efficiency in throughput and resource utilization. We also conducted a case study and implemented a prototype system supported by the proposed approach that aims to utilize the stream processing paradigm for pedestrian behavior analysis. The prototype application estimates walking time for a given path according to the real crowd traffic. The promising evaluation results of processing performance further illustrate the efficiency of the proposed approach.