Dragon: A Lightweight, High Performance Distributed Stream Processing Engine
Authors: A. Harwood, M. Read, Gayashan Amarasinghe
Venue: 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS)
Published: 2020-11-01
DOI: https://doi.org/10.1109/ICDCS47774.2020.00177
The performance of a distributed stream processing engine is traditionally assessed in terms of two fundamental measurements: latency and throughput. Recently, Apache Storm has demonstrated sub-millisecond latencies for inter-component tuple transmission, though it achieves this through aggressive throttling that keeps tuple queues near empty and therefore strictly limits throughput. Apache Heron, on the other hand, has excellent throughput characteristics, especially when operating near unstable conditions, but its inter-component latencies typically start at around 10 milliseconds. Both of these systems require roughly 650MB of installation space. We have developed Dragon, loosely based on the same API as Storm and Heron, which is both lightweight, requiring just 7.5MB of installation space, and competitive in performance with Storm and Heron. In this paper we present experiments with all three systems using the Word Count benchmark. Dragon achieves throughput characteristics close to those of Heron and inter-component latencies of less than 10ms under high load. In particular, Dragon's maximum latency is significantly less than Storm's maximum latency under high load. Finally, Dragon remained stable at a higher effective throughput than Heron. We believe Dragon is a good "all-rounder" solution and is particularly suitable for Edge computing applications, given its small installation footprint.
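To make the two measurements concrete, the following is a minimal single-process sketch of a Word Count pipeline that reports throughput and maximum tuple latency. It is an illustration only: it does not use the Dragon, Storm, or Heron APIs, and the names `sentence_spout`, `split_bolt`, `count_bolt`, and `run_word_count` are hypothetical, chosen to mirror the spout and bolt terminology of Storm-style topologies. The experiments in the paper run on distributed deployments, which this sketch does not attempt to model.

```python
import time
from collections import Counter


def sentence_spout(sentences):
    """Emit each sentence together with the time at which it was emitted."""
    for sentence in sentences:
        yield sentence, time.perf_counter()


def split_bolt(stream):
    """Split each sentence into words, carrying the emit time forward."""
    for sentence, emitted_at in stream:
        for word in sentence.split():
            yield word, emitted_at


def count_bolt(stream):
    """Count words and record the spout-to-counter latency of every tuple."""
    counts = Counter()
    latencies = []
    for word, emitted_at in stream:
        counts[word] += 1
        latencies.append(time.perf_counter() - emitted_at)
    return counts, latencies


def run_word_count(sentences):
    """Run the pipeline; return (counts, tuples per second, max latency in s)."""
    start = time.perf_counter()
    counts, latencies = count_bolt(split_bolt(sentence_spout(sentences)))
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very coarse clocks.
    throughput = len(latencies) / elapsed if elapsed > 0 else float("inf")
    return counts, throughput, max(latencies, default=0.0)


if __name__ == "__main__":
    counts, throughput, max_latency = run_word_count(
        ["the quick fox", "the lazy dog"]
    )
    print(counts.most_common(1), throughput, max_latency)
```

In a real engine the components run in separate threads or processes connected by queues, so the measured latency includes queueing and network transfer; this is where the trade-off between keeping queues near empty (low latency) and keeping them full (high throughput) arises.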