Trustworthy AI-based Performance Diagnosis Systems for Cloud Applications: A Review
Authors: Ruyue Xin, Jingye Wang, Peng Chen, Zhiming Zhao
Journal: ACM Computing Surveys, vol. 57, no. 1 (published 2025-01-09)
DOI: 10.1145/3701740 (https://doi.org/10.1145/3701740)
Journal metrics as listed: impact factor 23.8; JCR Q1 (Computer Science, Theory & Methods)
Performance diagnosis systems detect abnormal performance phenomena and play a crucial role in cloud applications. Effective performance diagnosis systems are often built on artificial intelligence (AI) approaches, which can be summarized as a general framework that runs from data to models. However, this AI-based framework carries potential hazards that could degrade user experience and trust. For example, a lack of data privacy may compromise the security of AI models, and models with low robustness can be hard to apply in complex cloud environments. Defining the requirements for building a trustworthy AI-based performance diagnosis system has therefore become essential. This article systematically reviews trustworthiness requirements in AI-based performance diagnosis systems. We first introduce trustworthiness requirements and extract six key requirements from a technical perspective: data privacy, fairness, robustness, explainability, efficiency, and human intervention. We then unify these requirements into a general performance diagnosis framework, ranging from data collection to model development. Next, for each component of the framework, we review related work and describe concrete actions to improve trustworthiness. Finally, we identify possible research directions and challenges for the future development of trustworthy AI-based performance diagnosis systems.
About the journal:
ACM Computing Surveys (CSUR) is an academic journal that publishes surveys and tutorials on various areas of computing research and practice. The journal aims to provide comprehensive and easily understandable articles that guide readers through the literature and help them understand topics outside their specialties. CSUR has a 2022 Impact Factor of 16.6 and is ranked 3rd out of 111 journals in the field of Computer Science Theory & Methods.
ACM Computing Surveys is indexed and abstracted in various services, including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.