{"title":"在云中与多个服务提供商经济有效地存储科学数据集的算法","authors":"Dong Yuan, X. Liu, Li-zhen Cui, Tiantian Zhang, Wenhao Li, Dahai Cao, Yun Yang","doi":"10.1109/eScience.2013.34","DOIUrl":null,"url":null,"abstract":"The proliferation of cloud computing allows scientists to deploy computation and data intensive applications without infrastructure investment, where large generated datasets can be flexibly stored with multiple cloud service providers. Due to the pay-as-you-go model, the total application cost largely depends on the usage of computation, storage and bandwidth resources, and cutting the cost of cloud-based data storage becomes a big concern for deploying scientific applications in the cloud. In this paper, we propose a novel algorithm that can automatically decide whether a generated dataset should be 1) stored in the current cloud, 2) deleted and re-generated whenever reused or 3) transferred to cheaper cloud service for storage. The algorithm finds the trade-off among computation, storage and bandwidth costs in the cloud, which are three key factors for the cost of storing generated application datasets with multiple cloud service providers. Simulations conducted with popular cloud service providers' pricing models show that the proposed algorithm is highly cost-effective to be utilised in the cloud.","PeriodicalId":325272,"journal":{"name":"2013 IEEE 9th International Conference on e-Science","volume":"2015 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":"{\"title\":\"An Algorithm for Cost-Effectively Storing Scientific Datasets with Multiple Service Providers in the Cloud\",\"authors\":\"Dong Yuan, X. Liu, Li-zhen Cui, Tiantian Zhang, Wenhao Li, Dahai Cao, Yun Yang\",\"doi\":\"10.1109/eScience.2013.34\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The proliferation of cloud computing allows scientists to deploy computation and data intensive applications without infrastructure investment, where large generated datasets can be flexibly stored with multiple cloud service providers. Due to the pay-as-you-go model, the total application cost largely depends on the usage of computation, storage and bandwidth resources, and cutting the cost of cloud-based data storage becomes a big concern for deploying scientific applications in the cloud. In this paper, we propose a novel algorithm that can automatically decide whether a generated dataset should be 1) stored in the current cloud, 2) deleted and re-generated whenever reused or 3) transferred to cheaper cloud service for storage. The algorithm finds the trade-off among computation, storage and bandwidth costs in the cloud, which are three key factors for the cost of storing generated application datasets with multiple cloud service providers. Simulations conducted with popular cloud service providers' pricing models show that the proposed algorithm is highly cost-effective to be utilised in the cloud.\",\"PeriodicalId\":325272,\"journal\":{\"name\":\"2013 IEEE 9th International Conference on e-Science\",\"volume\":\"2015 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-10-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"17\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 IEEE 9th International Conference on e-Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/eScience.2013.34\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE 9th International Conference on e-Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/eScience.2013.34","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
An Algorithm for Cost-Effectively Storing Scientific Datasets with Multiple Service Providers in the Cloud
The proliferation of cloud computing allows scientists to deploy computation and data intensive applications without infrastructure investment, where large generated datasets can be flexibly stored with multiple cloud service providers. Due to the pay-as-you-go model, the total application cost largely depends on the usage of computation, storage and bandwidth resources, and cutting the cost of cloud-based data storage becomes a big concern for deploying scientific applications in the cloud. In this paper, we propose a novel algorithm that can automatically decide whether a generated dataset should be 1) stored in the current cloud, 2) deleted and re-generated whenever reused or 3) transferred to cheaper cloud service for storage. The algorithm finds the trade-off among computation, storage and bandwidth costs in the cloud, which are three key factors for the cost of storing generated application datasets with multiple cloud service providers. Simulations conducted with popular cloud service providers' pricing models show that the proposed algorithm is highly cost-effective to be utilised in the cloud.