Pub Date : 2022-10-01 DOI: 10.1109/ISSREW55968.2022.00071
Joaquim Manuel Silva Cardoso Rodrigues, J. E. Ferreira Ribeiro, Ademar Aguiar
Although documentation is considered the primary challenge to agile methods in safety-critical software systems development [1], agile methods are of particular interest because they can improve changeability while bringing efficiency and effectiveness to all phases of software development. In this work, we created mechanisms for automating document processing and management to improve the efficiency and effectiveness of documentation activities in safety-critical software systems development, specifically in the aerospace domain. The implemented tools were co-designed and validated iteratively in the concrete industrial context of Critical Software (CSW) projects, within a wider research effort towards continuous certification [3]. We interviewed Critical Software professionals to validate our solution, collected feedback on the implemented tools, and gained insights for future work. The tools were also subjected to synthetic tests, which allowed us to conclude that document automation is possible in the safety-critical software development industry and carries several benefits. The developed tools are not yet qualified in compliance with the DO-330 standard (Tools Qualification).
Title : Improving Documentation Agility in Safety-Critical Software Systems Development For Aerospace
Venue : 2022 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)
Pub Date : 2022-10-01 DOI: 10.1109/ISSREW55968.2022.00090
J. Defranco, M. Kassab, P. Laplante
Safety and trust are two of the most important features of a critical system. A critical system is one that must be highly reliable, in that it not only completes its mission but also causes zero harm to the public. The problem lies in testing a critical system, especially one that employs artificial intelligence (AI). The challenge is that critical AI systems (CAIS) may cause unpredictable events and conditions that cannot be modeled during critical error testing. Proxy systems (non-critical prototypes) are therefore needed to test the critical system. We present a five-dimensional CAIS taxonomy and a weighting system that map system characteristics to a testing proxy in order to determine equivalent proxy systems to build and test. Ultimately, this CAIS taxonomy and weighting system offer a way forward for developing a set of proxy systems to use in critical error testing.
Title : A Taxonomy of Critical AI System Characteristics for Use in Proxy System Testing
Pub Date : 2022-10-01 DOI: 10.1109/ISSREW55968.2022.00091
Bart Kemper
The use of Artificial Intelligence and Machine Learning technology may seem to be the tools needed to combat media-inspired “lone wolf attacks” by implementing the concept of “stochastic terrorism,” targeting harmful media influences. Machine Learning is in current use to sort through social media data to assess hate speech. Artificial Intelligence is in current use to interpret the data and trends processed by Machine Learning for tasks such as finding criminal networks. The question becomes “can stochastic terrorism be proven” and “should this be implemented.” Labeling someone as a “terrorist,” regardless of any modifier for the term, tags the person or group for severe, potentially lethal, response by the government and the community. Criminal accusation cannot ethically be done casually or without sufficient cause. Due to documented problems with bias in all aspects of the issue, using these computational tools to establish legal causation between media statements by pundits, politicians, or others and the violence of “lone wolf” actors would not meet the requirements of US jurisprudence or the ethical principles for Artificial Intelligence of being explainable, transparent, and responsible.
Title : AI and Stochastic Terrorism – Should it be done?
Pub Date : 2022-10-01 DOI: 10.1109/ISSREW55968.2022.00066
Yancai Zhou, Chen Zhang, Kai Jia, Dongdong Zhao, Jianwen Xiang
Software aging refers to the phenomenon of system performance degradation and eventual failure caused by Aging-Related Bugs (ARBs). Software aging seriously affects the reliability and availability of software systems. To discover and remove ARBs, several ARB prediction approaches have been proposed, but most of them employ only static code metrics to predict buggy code. However, static code metrics do not capture the syntactic and semantic features of the code, which are important for building accurate prediction models. To address this problem, we design a deep neural network that combines bidirectional long short-term memory (BLSTM) with the attention mechanism to extract context-sensitive semantic features of the code. In addition, we apply a weakly supervised oversampling (WSO) method to alleviate class imbalance in the datasets. We name our framework ABLSTM-WSO. We conduct experiments with five classifiers on two widely used open-source projects (MySQL and Linux) and use AUC, Balance, and F1-score as the evaluation metrics. Experimental results show that ABLSTM-WSO can significantly improve ARB prediction performance.
Title : A Software Aging-Related Bug Prediction Framework Based on Deep Learning and Weakly Supervised Oversampling
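The abstract above names Balance and F1-score as evaluation metrics without defining them. As a minimal sketch, assuming the definitions commonly used in defect-prediction studies (Balance as the normalized distance from the ideal point pd = 1, pf = 0), they can be computed from a confusion matrix as follows. The function names are illustrative and are not taken from the paper.

```python
import math


def detection_rates(tp, fp, tn, fn):
    """Return (pd, pf): probability of detection and probability of false alarm."""
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    return pd, pf


def balance(tp, fp, tn, fn):
    """Balance = 1 - distance from (pf=0, pd=1), scaled to [0, 1].

    Assumed definition, common in defect-prediction literature; the paper's
    exact formula may differ.
    """
    pd, pf = detection_rates(tp, fp, tn, fn)
    return 1.0 - math.sqrt((0.0 - pf) ** 2 + (1.0 - pd) ** 2) / math.sqrt(2.0)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in count form."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Made-up counts: 10 buggy files (8 found), 100 clean files (10 flagged).
    print(round(balance(tp=8, fp=10, tn=90, fn=2), 4))
    print(round(f1_score(tp=8, fp=10, fn=2), 4))
```

Balance rewards a classifier for being close to the ideal corner of the ROC plane, which makes it less sensitive than accuracy to the class imbalance that the oversampling step is meant to address.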
Pub Date : 2022-10-01 DOI: 10.1109/ISSREW55968.2022.00098
M. Feather, Philip C. Slingerland, S. Guerrini, Max Spolaor
We are developing guidance for space domain assurance personnel on how to assure Artificial Intelligence (AI) and Machine Learning (ML) systems. Key to such guidance will be an assurance process for these personnel, who may be unfamiliar with such systems, to follow. We are investigating one such process, the “Assurance of Machine Learning in Autonomous Systems (AMLAS)” from the University of York, UK. To gauge its suitability, we are (retrospectively) applying it to a safety-critical AI/ML system in the space domain. We report here on our experience so far in applying this process.
Title : Assurance Guidance for Machine Learning in a Safety-Critical System
Pub Date : 2022-09-30 DOI: 10.1109/ISSREW55968.2022.00095
Simon Diemert, J. Weber
Modern systems are designed to operate in increasingly variable and uncertain environments. Not only are these environments complex, in the sense that they contain a tremendous number of variables, but they also change over time. Systems must be able to adjust their behaviour at run-time to manage these uncertainties. These “self-adaptive systems” have been studied extensively. This paper proposes a definition of a safety-critical self-adaptive system and then describes a taxonomy for classifying adaptations into different types based on their impact on the system's safety and the system's safety case. The taxonomy expresses criteria for classification and then describes specific criteria that the safety case for a self-adaptive system must satisfy, depending on the type of adaptations performed. Each type in the taxonomy is illustrated using the example of a safety-critical self-adaptive water heating system.
Title : Safety-Critical Adaptation in Self-Adaptive Systems
Pub Date : 2022-09-08 DOI: 10.1109/ISSREW55968.2022.00039
Advaita Datar, Amey Zare, A. Asia, R. Venkatesh, Susheel Kumar, U. Shrotri
Insurance companies rely on their Legacy Insurance Systems (LIS) to govern day-to-day operations. These LIS operate according to the company's business rules, which are formally specified in Calculation Specification (CS) sheets. To meet ever-changing business demands, insurance companies are increasingly transforming their outdated LIS into modern Policy Administration Systems (PAS). Quality Assurance (QA) of such PAS involves manually validating the implementation of calculations against the corresponding CS sheets from the LIS. This manual QA approach is effort-intensive and error-prone; it may fail to detect inconsistencies in PAS implementations and ultimately result in monetary loss. To address this challenge, we propose a novel low-code/no-code technique to automatically validate PAS implementations against CS sheets. Our technique has been evaluated on a digital transformation project of a large insurance company, on 12 real-world calculations across 254 policies. The evaluation resulted in effort savings of approximately 92 percent compared with the conventional manual validation approach.
Title : Automated Validation of Insurance Applications against Calculation Specifications
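The abstract above does not describe how the comparison against CS sheets is carried out. As a rough illustration only, the general idea of checking implemented calculations against a specification can be sketched as below. Everything here is hypothetical: the `Policy` fields, the `spec_premium` rule, and the shape of `pas_outputs` are invented for the example and do not come from the paper or from any real insurance system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Policy:
    # Hypothetical policy attributes used by the example rule below.
    policy_id: str
    sum_assured: float
    rate: float


def spec_premium(policy: Policy) -> float:
    """Hypothetical calculation rule standing in for one CS sheet entry."""
    return round(policy.sum_assured * policy.rate, 2)


def validate(
    policies: List[Policy],
    pas_outputs: Dict[str, float],
    spec: Callable[[Policy], float],
    tol: float = 0.01,
) -> List[Tuple[str, float, Optional[float]]]:
    """Return (policy_id, expected, actual) for every policy that disagrees.

    A missing PAS value is reported with actual=None.
    """
    mismatches = []
    for policy in policies:
        expected = spec(policy)
        actual = pas_outputs.get(policy.policy_id)
        if actual is None or abs(actual - expected) > tol:
            mismatches.append((policy.policy_id, expected, actual))
    return mismatches


if __name__ == "__main__":
    policies = [Policy("P1", 1000.0, 0.05), Policy("P2", 2000.0, 0.1)]
    pas_outputs = {"P1": 50.0, "P2": 210.0}  # made-up values from the new system
    for row in validate(policies, pas_outputs, spec_premium):
        print(row)
```

Running the same specification function over every policy replaces the per-policy manual check that the abstract describes as effort-intensive; the tolerance parameter is an assumption to absorb rounding differences between systems.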
Pub Date : 2022-09-05 DOI: 10.1109/ISSREW55968.2022.00058
Marco Barletta, M. Cinque, L. Simone, Raffaele Della Corte, Giorgio Farina, D. Ottaviano
Orchestration systems are becoming a key component for automatically managing distributed computing resources in many fields with criticality requirements, such as Industry 4.0 (I4.0). However, they are mainly tied to OS-level virtualization, which is known to suffer from reduced isolation. In this paper, we propose RunPHI, which aims to integrate partitioning hypervisors, as a solution for assuring strong isolation, with OS-level orchestration systems. The purpose is to enable container orchestration in mixed-criticality systems with isolation requirements through partitioned containers.
Title : RunPHI: Enabling Mixed-criticality Containers via Partitioning Hypervisors in Industry 4.0
Pub Date : 2022-09-01 DOI: 10.1109/ISSREW55968.2022.00067
Carmine Cesarano, Domenico Cotroneo, L. Simone
Partitioning hypervisor solutions are becoming increasingly popular, both to meet stringent security and safety requirements related to isolation between co-hosted applications and to make more efficient use of available hardware resources. However, assessment and certification of isolation requirements remain a challenge, and it is not trivial to understand what to test, and how, in order to validate these properties. Although the high-level requirements to be verified are mentioned in the various security- and safety-related standards, precise guidelines for the evaluator are lacking. Such guidance should be comprehensive, generalizable to different products that implement partitioning, and tied specifically to lower-level requirements. The goal of this work is to provide a systematic framework that addresses this need.
Title : Towards Assessing Isolation Properties in Partitioning Hypervisors
Pub Date : 2022-08-28 DOI: 10.1109/ISSREW55968.2022.00041
Tianyi Yang, Baitong Li, Jiacheng Shen, Yuxin Su, Yongqiang Yang, Michael R. Lyu
Interactions between cloud services result in service dependencies. Evaluating and managing the cascading impacts caused by service dependencies is critical to the reliability of cloud systems. This paper summarizes the dependency types in cloud systems and demonstrates the design of the Dependency Management System (DMS), a platform for managing the service dependencies in the production cloud system. DMS features full-lifecycle support for service reliability (i.e., initial service deployment, service upgrade, proactive architectural optimization, and reactive failure mitigation) and refined characterization of the intensity of dependencies.
Title : Managing Service Dependency for Cloud Reliability: The Industrial Practice
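To make the notion of cascading impact in the abstract above concrete, the sketch below computes which services are transitively affected when one service fails, using a breadth-first search over a dependency graph. It is an illustration under simplifying assumptions, not the DMS design: the service names are invented, and every dependency is treated as a hard one, whereas the abstract notes that DMS also characterizes the intensity of dependencies.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set


def impacted_services(dependencies: Dict[str, List[str]], failed: str) -> Set[str]:
    """Return every service that directly or transitively depends on `failed`.

    `dependencies` maps a caller to the services it calls.
    """
    # Invert the graph: for each service, who depends on it?
    dependents = defaultdict(set)
    for caller, callees in dependencies.items():
        for callee in callees:
            dependents[callee].add(caller)

    impacted: Set[str] = set()
    queue = deque([failed])
    while queue:
        service = queue.popleft()
        for upstream in dependents[service]:
            if upstream != failed and upstream not in impacted:
                impacted.add(upstream)
                queue.append(upstream)
    return impacted


if __name__ == "__main__":
    # Hypothetical topology: web -> api -> {db, cache}, batch -> db
    deps = {"web": ["api"], "api": ["db", "cache"], "batch": ["db"]}
    print(sorted(impacted_services(deps, "db")))
```

A weighted variant could attach an intensity to each edge and stop propagation across weak dependencies, which is closer in spirit to the refined characterization the abstract mentions.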