Continuous Dataflow Update Strategies for Mission-Critical Applications
Charith Wickramaarachchi, Yogesh L. Simmhan
2013 IEEE 9th International Conference on e-Science, 2013-10-22
DOI: 10.1109/eScience.2013.35 (https://doi.org/10.1109/eScience.2013.35)
Continuous dataflows complement scientific workflows by allowing the composition of real-time data ingest and analytics pipelines that process data streams from pervasive sensors and "always-on" scientific instruments. Such dataflows are long-running, mission-critical applications that cannot suffer downtime and must operate consistently, yet they may need to be updated to fix bugs or add features. This poses the problem: how do we update a continuous dataflow application with minimal disruption? In this paper, we formalize different types of update models for continuous dataflow applications and identify the qualitative and quantitative metrics to consider when choosing an update strategy. We propose five dataflow update strategies and analytically characterize their performance trade-offs. We validate one of these consistent, low-latency update strategies using the Floe dataflow engine for an eEngineering application from the Smart Power Grid domain, and show its performance benefits relative to a naïve update strategy.