Latency-conscious dataflow reconfiguration
Moritz Hoffmann, Frank McSherry, Andrea Lattuada
Proceedings of the 5th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond, 2018-06-15
DOI: 10.1145/3206333.3206334 (https://doi.org/10.1145/3206333.3206334)
Citations: 4
Abstract
We propose a prototype incremental data migration mechanism for stateful, distributed, data-parallel dataflow engines with latency objectives. Compared to existing scaling mechanisms, our prototype has the following differentiating characteristics: (i) the mechanism provides tunable granularity for avoiding latency spikes, (ii) reconfigurations can be prepared ahead of time to avoid runtime coordination, and (iii) the implementation relies only on existing dataflow APIs and requires no system modifications. We demonstrate our proposal on example computations with varying amounts of state that needs to be migrated; such migration is a non-trivial task for systems like Dhalion and Flink. Our implementation, prototyped on Timely Dataflow, provides a scalable stateful operator template that is compatible with existing APIs and carefully reorganizes data to minimize migration overhead. Compared to naïve approaches, we reduce service latencies by orders of magnitude.
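The idea of migrating state incrementally with tunable granularity can be illustrated with a small sketch. This is not the authors' Timely Dataflow implementation: the bin-based partitioning, the names `NUM_BINS`, `bin_of`, `plan_migration`, and `apply_step`, and the in-memory state layout are all assumptions made for illustration. The sketch groups keys into bins, precomputes a plan that moves only a few bins per step, and applies the steps one at a time so that no single step has to move all of the state.

```python
from typing import Dict, List, Tuple

NUM_BINS = 8  # hypothetical: number of key bins, fixed up front

# state[worker][bin] holds the key/value pairs of that bin (assumed layout)
State = Dict[int, Dict[int, Dict[int, int]]]
Step = List[Tuple[int, int]]  # list of (bin, target worker)


def bin_of(key: int) -> int:
    """Map a key to a bin; state is grouped and moved one bin at a time."""
    return key % NUM_BINS


def plan_migration(old: List[int], new: List[int], batch_size: int) -> List[Step]:
    """Precompute the migration as steps of at most `batch_size` bins each.

    `old` and `new` map bin index -> worker index. A smaller `batch_size`
    means more steps that each move less state.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    moves = [(b, new[b]) for b in range(len(old)) if old[b] != new[b]]
    return [moves[i:i + batch_size] for i in range(0, len(moves), batch_size)]


def apply_step(assignment: List[int], state: State, step: Step) -> None:
    """Move the bins named in one step and update the bin-to-worker map."""
    for b, target in step:
        source = assignment[b]
        state.setdefault(target, {})[b] = state[source].pop(b, {})
        assignment[b] = target
```

A caller would compute the plan once, ahead of the reconfiguration, and then call `apply_step` for each step in turn, processing regular input between steps. The choice of `batch_size` is the granularity knob in this sketch: it trades the number of steps against the amount of state moved per step.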