Analyse resilience risks in microservice architecture systems with causality search and inference algorithms
Authors: Kanglin Yin, Qingfeng Du, Juan Qiu
Journal: International Journal of Web and Grid Services, pp. 147-171
Published: 2020-06-23
DOI: https://doi.org/10.1504/ijwgs.2020.10030149
The microservice architecture has become the mainstream architecture pattern for web service applications in recent years. Compared with traditional software architectures, however, it has a more sophisticated deployment structure, which exposes it to more potential risks and a greater diversity of fault symptoms. Microservice practitioners have started to use the word 'resilience' to describe the capability of coping with different unexpected conditions. How to judge whether a system environment disruption is a risk to microservice resilience, and how to analyse resilience risks before the system is released, are research questions in microservice development. As the practice of chaos engineering has solved the problem of resilience risk identification, this paper focuses on how to analyse identified resilience risks in microservice architecture systems and proposes a resilience risk analysis method. Based on performance monitoring data collected during chaos experiments, the method uses a causality search algorithm to build causality graphs of performance indicators, and a causality inference algorithm to generate causality chains for system operators. The effectiveness of the proposed approach is demonstrated through a case study on a microservice architecture system.
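The abstract does not say which causality search or inference algorithm the authors use, so the following is only a minimal sketch of the general shape of such a pipeline, not the paper's method. It assumes a crude stand-in for causality search (an edge from indicator A to indicator B when A at time t correlates strongly with B at time t + lag) and a simple path enumeration in place of causality inference. All names (`pearson`, `build_graph`, `causal_chains`, the metric names, the threshold) are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def build_graph(metrics, lag=1, threshold=0.8):
    """Stand-in for causality search (an assumption, not the paper's algorithm):
    add an edge a -> b when a at time t correlates strongly with b at time t + lag."""
    graph = {name: [] for name in metrics}
    for a, xs in metrics.items():
        for b, ys in metrics.items():
            if a != b and abs(pearson(xs[:-lag], ys[lag:])) >= threshold:
                graph[a].append(b)
    return graph


def causal_chains(graph, source):
    """Enumerate every maximal simple path starting at `source`.
    Nodes already on the path are skipped, so cycles terminate."""
    chains = []

    def walk(path):
        successors = [n for n in graph.get(path[-1], []) if n not in path]
        if not successors:
            chains.append(path)
        for n in successors:
            walk(path + [n])

    walk([source])
    return chains


# Hypothetical indicator readings sampled during one chaos experiment.
cpu = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1]
latency = [0] + cpu[:-1]      # latency follows CPU load one step later
errors = [0] + latency[:-1]   # errors follow latency one step later
metrics = {"cpu": cpu, "latency": latency, "errors": errors}

graph = build_graph(metrics)
chains = causal_chains(graph, "cpu")
```

Lagged correlation is far weaker than a real causal discovery procedure: it cannot separate a direct effect from a shared upstream cause, so the resulting chains are candidates for an operator to inspect, not conclusions.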
About the journal:
Web services provide declarative interfaces to services offered by systems on the Internet, including messaging protocols, standard interfaces, directory services, and security layers, for efficient and effective business application integration. Grid computing has emerged as a global platform that supports organisations in the coordinated sharing of distributed data, applications, and processes. It has also started to leverage web services to define standard interfaces for business services. IJWGS addresses web and grid service technology, emphasising issues of architecture, implementation, and standardisation.