Survey of failures and fault tolerance in cloud
S. Prathiba, S. Sowvarnica
2017 2nd International Conference on Computing and Communications Technologies (ICCCT), published 2017-02-01
DOI: 10.1109/ICCCT2.2017.7972271 (https://doi.org/10.1109/ICCCT2.2017.7972271)
Citations: 29
Abstract
Cloud computing supports the hosting of client applications. The cloud is a distributed platform that provides hardware, software, and network resources both to execute consumers' applications and to store and manage users' data. It is also used to execute scientific workflow applications, which are generally more complex than other applications. Because the cloud is a distributed platform, it is more prone to errors and failures. In such an environment, avoiding failures is difficult, and identifying the source of a failure is also complex. For this reason, fault tolerance mechanisms are implemented on the cloud platform. They ensure that, even when failures occur in the environment, the client's critical data is not lost and the user's application running on the cloud is not affected. Fault tolerance mechanisms also help improve the cloud's performance by providing services to users on demand. This paper presents a survey of existing fault tolerance mechanisms for the cloud platform. It also discusses the failures, fault-tolerant clustering methods, and fault-tolerant models that are specific to scientific workflow applications.