Enhancing Cloud Availability via Intelligent Monitoring using Time Series Database and Machine Learning
International Journal of Software Science and Computational Intelligence (IJSSCI), Journal Article, published 2022-01-01
Impact factor: 2.4; JCR: Q3 (Computer Science, Artificial Intelligence)
DOI: 10.4018/ijssci.285591 (https://doi.org/10.4018/ijssci.285591)
Citation count: 0
Abstract
Every cloud provider aims to provide 99.9999% availability for the systems it provisions and operates for its customers: whether the model is SaaS, PaaS, or IaaS, the availability of the system must be greater than 99.9999%. It therefore becomes vital for the provider to monitor the systems and take proactive measures to reduce downtime. In an ideal scenario, the support staff (24/7 technical support) should be aware of ongoing issues in the production systems before a customer raises them as incidents. Currently, however, there is no effective alert monitoring solution for this purpose. The solution proposed in this paper is a central alert monitoring tool for all cloud solutions offered by the cloud provider. The central alert monitoring tool constantly observes the time series database, which contains metric values populated by HA, and compares the incoming metric values with the defined thresholds. When a metric value exceeds its defined threshold, the monitoring tool uses machine learning techniques to decide on and take actions.
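
To make the two quantitative ideas in the abstract concrete, here is a minimal Python sketch, not the paper's implementation. It shows how much downtime a 99.9999% target leaves per year, and how incoming metric values might be compared against defined thresholds. The `Sample` type, the metric names, and the threshold values are hypothetical, and the machine-learning decision step is deliberately left out because the abstract does not describe it.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


def allowed_downtime_seconds(availability_pct: float, period_days: int = 365) -> float:
    """Downtime budget (in seconds) implied by an availability target over a period."""
    return (1 - availability_pct / 100) * period_days * SECONDS_PER_DAY


@dataclass(frozen=True)
class Sample:
    """One metric reading, as it might be read from a time series database (hypothetical shape)."""
    metric: str
    timestamp: int  # Unix time in seconds
    value: float


def find_breaches(samples, thresholds):
    """Return the samples whose value exceeds the threshold defined for their metric.

    Metrics without a defined threshold are ignored.
    """
    return [
        s for s in samples
        if s.metric in thresholds and s.value > thresholds[s.metric]
    ]


if __name__ == "__main__":
    # A 99.9999% target leaves roughly 31.5 seconds of downtime per 365-day year.
    print(f"{allowed_downtime_seconds(99.9999):.1f} s/year")

    # Hypothetical thresholds and readings, for illustration only.
    thresholds = {"cpu_percent": 90.0, "disk_used_percent": 85.0}
    samples = [
        Sample("cpu_percent", 1_640_995_200, 97.2),
        Sample("cpu_percent", 1_640_995_260, 41.0),
        Sample("disk_used_percent", 1_640_995_200, 85.0),
    ]
    for breach in find_breaches(samples, thresholds):
        # In the tool described above, a breach would be handed to the
        # machine-learning step that decides which action to take.
        print("threshold exceeded:", breach)
```

A strict `>` comparison is used because the abstract speaks of a value that "exceeds" its threshold, so a reading exactly at the threshold is not reported here.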