Impact Analysis of Missing Values on the Prediction Accuracy of Analogy-based Software Effort Estimation Method AQUA

First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007). Pub Date: 2007-09-20. DOI: 10.1109/ESEM.2007.10

Jingzhou Li, Ahmed Al-Emran, G. Ruhe


Citations: 103

Abstract



Effort estimation by analogy (EBA) is often confronted with missing values. Our former analogy-based method AQUA is able to tolerate missing values in the data set, but it is unclear how the percentage of missing values impacts the prediction accuracy, and whether there is an upper bound on this percentage beyond which AQUA is no longer applicable. This paper investigates these questions through an impact analysis. The impact analysis is conducted for seven data sets of different sizes and with different initial percentages of missing values. The major results are that (i) we confirm the intuition that the more missing values there are, the poorer the prediction accuracy of AQUA; (ii) there is a quadratic dependency between the prediction accuracy and the percentage of missing values; and (iii) the upper limit of missing values for the applicability of AQUA is determined as 40%. These results are obtained in the context of AQUA. Further analysis is necessary for other ways of applying EBA, such as using similarity measures or analogy adaptation methods different from those used in AQUA. For that purpose, the experimental design in this study can be adapted.
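To make results (ii) and the general shape of such an impact analysis more concrete, the following Python sketch shows one way the two ingredients could look: raising a data set's share of missing values to a target percentage, and fitting a quadratic curve to accuracy measurements taken at several percentages. It is not the authors' procedure. The function names (`missing_fraction`, `inject_missing`, `fit_quadratic`), the choice to remove values completely at random, and the numbers in the usage lines are all assumptions made for illustration; the abstract does not say how values were removed or which accuracy measure was fitted.

```python
import random


def missing_fraction(rows):
    """Share of cells that are None across a list of rows."""
    total = sum(len(r) for r in rows)
    missing = sum(1 for r in rows for v in r if v is None)
    return missing / total if total else 0.0


def inject_missing(rows, target_fraction, seed=0):
    """Return a copy of rows with extra cells set to None so that the
    overall missing share reaches target_fraction.

    Assumption: cells are removed completely at random; the source does
    not state the removal mechanism.
    """
    rng = random.Random(seed)
    out = [list(r) for r in rows]
    total = sum(len(r) for r in out)
    present = [(i, j) for i, r in enumerate(out)
               for j, v in enumerate(r) if v is not None]
    already_missing = total - len(present)
    need = max(0, round(target_fraction * total) - already_missing)
    for i, j in rng.sample(present, min(need, len(present))):
        out[i][j] = None
    return out


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations.

    Returns the tuple (a, b, c).
    """
    s = [sum(x ** k for x in xs) for k in range(5)]          # sums of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(3):
            if r != col:
                f = m[r][col]
                m[r] = [rv - f * cv for rv, cv in zip(m[r], m[col])]
    return m[0][3], m[1][3], m[2][3]


if __name__ == "__main__":
    # Synthetic placeholder data, NOT results from the paper.
    projects = [[float(i + j) for j in range(5)] for i in range(10)]
    degraded = inject_missing(projects, 0.40)
    print("missing share:", missing_fraction(degraded))

    percentages = [0.0, 0.1, 0.2, 0.3, 0.4]
    accuracy = [0.5 - 1.0 * p + 2.0 * p * p for p in percentages]
    print("fitted (a, b, c):", fit_quadratic(percentages, accuracy))
```

In a real replication, the accuracy values would come from rerunning the estimator on each degraded copy of the data set, and the fitted curve would then be inspected to see where accuracy becomes unacceptable.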
