Towards a More Reliable Interpretation of Defect Models

2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion) | Pub Date: 2019-05-25 | DOI: 10.1109/ICSE-Companion.2019.00084

Jirayus Jiarpakdee


Citations: 3

Abstract

Software Quality Assurance (SQA) activities are carried out to ensure high-quality software systems. Defect models help developers identify the riskiest modules so that they can prioritise their limited SQA resources. Interpreting defect models also helps managers understand which factors affect software quality when charting quality improvement plans. Unfortunately, commonly used interpretation techniques (e.g., ANOVA for logistic regression and variable importance for random forests) explain defect models only at a high level (e.g., which factors affect software quality). Researchers and practitioners have also raised concerns that the lack of explainability of defect models hinders their adoption in practice. This thesis hypothesises that a lack of explainability poses a critical challenge when adopting defect models in practice. To validate the hypothesis, we formulate three research questions: (1) which defect modelling workflow produces the most accurate and reliable interpretation of defect models? (2) which technique best explains the predictions of defect models? and (3) how do practitioners perceive explainable defect models when adopting them? Through case studies of publicly available open-source and industrial software systems, the results show that (i) correlated variables affect the interpretation of defect models and must be mitigated; (ii) our proposed feature selection technique, AutoSpearman, is the only studied feature selection technique that can automatically mitigate correlated variables with little impact on model performance; and (iii) instance-level interpretation of defect models is needed to derive actionable insights that guide operational and technical decisions in SQA efforts.
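The abstract does not describe how AutoSpearman works internally, so the sketch below should not be read as the author's algorithm. It is only a simplified illustration of the general idea of mitigating correlated variables: compute Spearman rank correlations between software metrics and greedily drop one metric from any pair that is too strongly correlated. The 0.7 threshold, the metric names (`loc`, `statements`, `churn`), and all helper functions are assumptions made for this example.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(xs, ys):
    """Spearman's rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


def drop_correlated(metrics, threshold=0.7):
    """Greedily drop metrics until no pair has |rho| >= threshold (assumed value)."""
    kept = dict(metrics)

    def rho(a, b):
        return abs(spearman(kept[a], kept[b]))

    def mean_corr(name):
        others = [o for o in kept if o != name]
        return sum(rho(name, o) for o in others) / len(others)

    while len(kept) > 1:
        # Most strongly correlated pair among the remaining metrics.
        a, b = max(combinations(sorted(kept), 2), key=lambda p: rho(*p))
        if rho(a, b) < threshold:
            break
        # Drop whichever member is, on average, more correlated with the rest.
        del kept[max((a, b), key=mean_corr)]
    return sorted(kept)


# Hypothetical module-level metrics: "statements" rises monotonically with "loc".
metrics = {
    "loc": [10, 20, 30, 40, 50, 60],
    "statements": [8, 18, 25, 39, 41, 70],
    "churn": [5, 1, 4, 2, 6, 3],
}

if __name__ == "__main__":
    print(drop_correlated(metrics))
```

On this toy data the two size metrics are perfectly rank-correlated, so only one of them survives alongside `churn`. A model trained on the reduced set no longer splits importance arbitrarily between two interchangeable metrics, which is the interpretation problem the abstract attributes to correlated variables.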


