Harikrishna Bommala, Uma Maheswari V., Rajanikanth Aluvalu, Swapna Mudrakola
High-Confidence Computing, vol. 3, no. 4, Article 100165. Published 2023-09-27. DOI: 10.1016/j.hcc.2023.100165. https://www.sciencedirect.com/science/article/pii/S2667295223000636
Machine learning job failure analysis and prediction model for the cloud environment
Reliable and accessible cloud applications are essential for the future of ubiquitous computing, smart appliances, and electronic health. Owing to the vastness and diversity of the cloud, most cloud services, both physical and logical, have experienced failures. Using currently accessible traces, we assessed and characterized the behavior of successful and unsuccessful activities. We devised and implemented a method to forecast which jobs will fail. The proposed method lets cloud applications use resources more efficiently. We evaluated the proposed model in depth on the publicly available Google Cluster, Mustang, and Trinity traces. The traces were also fed into several different machine learning models to select the most reliable one. Our analysis shows that the model performs well in terms of accuracy, F1-score, and recall. Several measures, such as forecasting job failures, designing scheduling algorithms, modifying priority criteria, and restricting task resubmission, may increase cloud service dependability and availability.
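The comparison of candidate models by accuracy, recall, and F1-score can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration and not the authors' implementation: the function names `failure_metrics` and `pick_most_reliable`, the toy labels, the hypothetical models `model_a` and `model_b`, and the choice of F1-score as the default selection criterion. A failed job is treated as the positive class (1).

```python
from typing import Dict, Sequence


def failure_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1, with 'failed' (1) as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # failures correctly predicted
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # successes correctly predicted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed failures
    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def pick_most_reliable(
    y_true: Sequence[int],
    predictions: Dict[str, Sequence[int]],
    key: str = "f1",
) -> str:
    """Return the name of the model whose predictions score highest on `key`."""
    scores = {name: failure_metrics(y_true, pred) for name, pred in predictions.items()}
    return max(scores, key=lambda name: scores[name][key])


if __name__ == "__main__":
    # Toy ground truth: 1 = job failed, 0 = job finished (made-up data, not from the traces).
    truth = [1, 0, 0, 1, 1, 0, 0, 0]
    candidates = {
        "model_a": [1, 0, 0, 0, 1, 0, 0, 0],  # misses one failure, no false alarms
        "model_b": [1, 1, 0, 1, 1, 0, 1, 0],  # catches every failure, two false alarms
    }
    for name, pred in candidates.items():
        print(name, failure_metrics(truth, pred))
    print("best by F1:", pick_most_reliable(truth, candidates))
    print("best by recall:", pick_most_reliable(truth, candidates, key="recall"))
```

Which metric drives the selection is a design choice: ranking by recall favors models that miss few failing jobs, while ranking by F1-score also penalizes false alarms that would trigger needless rescheduling.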