The impact of data locality on the performance of a SaaS cloud with real-time data-intensive applications
Authors: Georgios L. Stavrinides, H. Karatza
Published in: 2017 IEEE/ACM 21st International Symposium on Distributed Simulation and Real Time Applications (DS-RT), 2017-10-18
DOI: https://doi.org/10.1109/DISTRA.2017.8167683
As cloud computing continues to gain momentum, big data analytics are now offered as Software as a Service (SaaS). Besides the heterogeneity and multi-tenancy of the underlying virtualized environment, scheduling such real-time, data-intensive, embarrassingly parallel applications in a SaaS cloud involves another serious challenge: data locality. Data-aware scheduling policies are therefore needed to exploit data locality effectively, while still taking into account the other attributes of the workload and the characteristics of the resources. To this end, we investigate via simulation the impact of data locality on the performance of a SaaS cloud in which real-time, data-intensive bags-of-tasks are scheduled dynamically under various data availability conditions. A non-data-aware baseline scheduling policy is compared with two proposed data-aware heuristics, to shed light on the effect of data locality awareness on system performance.
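To make the contrast between a non-data-aware and a data-aware policy concrete, the following is a minimal illustrative sketch. It is not the baseline or either of the two heuristics evaluated in the paper, which the abstract does not specify. All names (`VM`, `Task`, `BANDWIDTH_MB_S`, `pick_baseline`, `pick_data_aware`) and all numbers are hypothetical, and the model assumes that a task's input must be fully transferred before execution unless the chosen VM already holds it.

```python
from dataclasses import dataclass, field

BANDWIDTH_MB_S = 100.0  # hypothetical network bandwidth in MB/s


@dataclass
class VM:
    name: str
    speed: float                  # work units processed per second (assumed)
    free_at: float = 0.0          # time at which the VM becomes idle
    cached: set = field(default_factory=set)  # ids of datasets held locally


@dataclass
class Task:
    work: float       # computational demand in work units
    dataset: str      # id of the input dataset
    data_size: float  # input size in MB


def transfer_time(task, vm):
    """Time to fetch the input; zero if the VM already holds the dataset."""
    return 0.0 if task.dataset in vm.cached else task.data_size / BANDWIDTH_MB_S


def finish_time(task, vm, now=0.0):
    """Finish time on vm, including any input transfer."""
    start = max(now, vm.free_at)
    return start + transfer_time(task, vm) + task.work / vm.speed


def pick_baseline(task, vms, now=0.0):
    """Non-data-aware choice: the estimate ignores where the data resides."""
    return min(vms, key=lambda vm: max(now, vm.free_at) + task.work / vm.speed)


def pick_data_aware(task, vms, now=0.0):
    """Data-aware choice: the estimate includes the input transfer time."""
    return min(vms, key=lambda vm: finish_time(task, vm, now))


# One fast VM without the data, one slower VM that already holds it.
fast = VM("fast", speed=2.0)
local = VM("local", speed=1.0, cached={"d1"})
task = Task(work=10.0, dataset="d1", data_size=1000.0)

for policy in (pick_baseline, pick_data_aware):
    chosen = policy(task, [fast, local])
    print(policy.__name__, chosen.name, finish_time(task, chosen))
```

In this toy setting the baseline prefers the faster VM because its estimate omits the transfer, while the data-aware choice accepts a slower VM to avoid moving the input. Whether such a trade-off pays off in practice depends on the workload and data availability conditions, which is the question the paper examines via simulation.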