Human-level Ordinal Maintainability Prediction Based on Static Code Metrics

Markus Schnappinger, Arnaud Fietzke, A. Pretschner
{"title":"基于静态代码度量的人类级别有序可维护性预测","authors":"Markus Schnappinger, Arnaud Fietzke, A. Pretschner","doi":"10.1145/3463274.3463315","DOIUrl":null,"url":null,"abstract":"One of the greatest challenges in software quality control is the efficient and effective measurement of maintainability. Thorough expert assessments are precise yet slow and expensive, whereas automated static analysis yields imprecise yet rapid feedback. Several machine learning approaches aim to integrate the advantages of both concepts. However, most prior studies did not adhere to expert judgment and predicted the number of changed lines as a proxy for maintainability, or were biased towards a small group of experts. In contrast, the present study builds on a manually labeled and validated dataset. Prediction is done using static code metrics where we found simple structural metrics such as the size of a class and its methods to yield the highest predictive power towards maintainability. Using just a small set of these metrics, our models can distinguish easy from hard to maintain code with an F-score of 91.3% and AUC of 82.3%. In addition, we perform a more detailed ordinal classification and compare the quality of the classification with the performance of experts. Here, we use the deviations between the individual expert’s ratings and the eventually determined consensus of all experts. In sum, our models achieve the same level of performance as an average human expert. In fact, the obtained accuracy and mean squared error outperform human performance. We hence argue that our models provide an automated and trustworthy prediction of software maintainability.","PeriodicalId":328024,"journal":{"name":"Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Human-level Ordinal Maintainability Prediction Based on Static Code Metrics\",\"authors\":\"Markus Schnappinger, Arnaud Fietzke, A. Pretschner\",\"doi\":\"10.1145/3463274.3463315\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"One of the greatest challenges in software quality control is the efficient and effective measurement of maintainability. Thorough expert assessments are precise yet slow and expensive, whereas automated static analysis yields imprecise yet rapid feedback. Several machine learning approaches aim to integrate the advantages of both concepts. However, most prior studies did not adhere to expert judgment and predicted the number of changed lines as a proxy for maintainability, or were biased towards a small group of experts. In contrast, the present study builds on a manually labeled and validated dataset. Prediction is done using static code metrics where we found simple structural metrics such as the size of a class and its methods to yield the highest predictive power towards maintainability. Using just a small set of these metrics, our models can distinguish easy from hard to maintain code with an F-score of 91.3% and AUC of 82.3%. In addition, we perform a more detailed ordinal classification and compare the quality of the classification with the performance of experts. Here, we use the deviations between the individual expert’s ratings and the eventually determined consensus of all experts. In sum, our models achieve the same level of performance as an average human expert. 
In fact, the obtained accuracy and mean squared error outperform human performance. We hence argue that our models provide an automated and trustworthy prediction of software maintainability.\",\"PeriodicalId\":328024,\"journal\":{\"name\":\"Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering\",\"volume\":\"23 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-06-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3463274.3463315\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 25th International Conference on Evaluation and Assessment in Software Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3463274.3463315","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

One of the greatest challenges in software quality control is the efficient and effective measurement of maintainability. Thorough expert assessments are precise yet slow and expensive, whereas automated static analysis yields imprecise yet rapid feedback. Several machine learning approaches aim to integrate the advantages of both concepts. However, most prior studies did not adhere to expert judgment and predicted the number of changed lines as a proxy for maintainability, or were biased towards a small group of experts. In contrast, the present study builds on a manually labeled and validated dataset. Prediction is done using static code metrics, where we found simple structural metrics such as the size of a class and its methods to yield the highest predictive power towards maintainability. Using just a small set of these metrics, our models can distinguish easy from hard to maintain code with an F-score of 91.3% and an AUC of 82.3%. In addition, we perform a more detailed ordinal classification and compare the quality of the classification with the performance of experts. Here, we use the deviations between the individual expert's ratings and the eventually determined consensus of all experts. In sum, our models achieve the same level of performance as an average human expert. In fact, the obtained accuracy and mean squared error outperform human performance. We hence argue that our models provide an automated and trustworthy prediction of software maintainability.
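To make the reported setup concrete, the following is a minimal sketch, not the authors' implementation: it assumes a table of per-class structural metrics (the column names, file name, and choice of classifier are illustrative assumptions) with an expert label, trains a simple classifier, and reports the F-score and AUC measures quoted in the abstract.

```python
# Illustrative sketch only (not the paper's code): classify "easy" vs. "hard to
# maintain" classes from simple structural metrics and report F-score and AUC.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical dataset: one row per class; label 1 means "hard to maintain".
data = pd.read_csv("class_metrics.csv")               # assumed file layout
features = ["loc", "num_methods", "avg_method_loc"]   # assumed metric columns
X, y = data[features], data["hard_to_maintain"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

pred = clf.predict(X_test)                  # hard class labels
proba = clf.predict_proba(X_test)[:, 1]     # probability of "hard to maintain"
print("F-score:", f1_score(y_test, pred))
print("AUC:    ", roc_auc_score(y_test, proba))
```

The same kind of pipeline extends to the ordinal setting described in the abstract by predicting a graded maintainability rating and scoring it with accuracy and mean squared error against the expert consensus.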