缺失值对基于类比的软件工作量估算方法AQUA预测精度的影响分析

Jingzhou Li, Ahmed Al-Emran, G. Ruhe
{"title":"缺失值对基于类比的软件工作量估算方法AQUA预测精度的影响分析","authors":"Jingzhou Li, Ahmed Al-Emran, G. Ruhe","doi":"10.1109/ESEM.2007.10","DOIUrl":null,"url":null,"abstract":"Effort estimation by analogy (EBA) is often confronted with missing values. Our former analogy- based method AUQA is able to tolerate missing values in the data set, but it is unclear how the percentage of missing values impacts the prediction accuracy and if there is an upper bound for how big this percentage might become in order to guarantee the applicability of AQUA. This paper investigates these questions through an impact analysis. The impact analysis is conducted for seven data sets being of different size and having different initial percentages of missing values. The major results are that (i) we confirm the intuition that the more missing values, the poorer the prediction accuracy of AQUA; (ii) there is a quadratic dependency between the prediction accuracy and the percentage of missing values; and (Hi) the upper limit of missing values for the applicability of AQUA is determined as 40%. These results are obtained in the context of AQUA. Further analysis is necessary for other ways of applying EBA, such as using different similarity measures or analogy adaptation methods from those used in AQUA. For that purpose, the experimental design in this study can be adapted.","PeriodicalId":124420,"journal":{"name":"First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007)","volume":"118 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"103","resultStr":"{\"title\":\"Impact Analysis of Missing Values on the Prediction Accuracy of Analogy-based Software Effort Estimation Method AQUA\",\"authors\":\"Jingzhou Li, Ahmed Al-Emran, G. Ruhe\",\"doi\":\"10.1109/ESEM.2007.10\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Effort estimation by analogy (EBA) is often confronted with missing values. Our former analogy- based method AUQA is able to tolerate missing values in the data set, but it is unclear how the percentage of missing values impacts the prediction accuracy and if there is an upper bound for how big this percentage might become in order to guarantee the applicability of AQUA. This paper investigates these questions through an impact analysis. The impact analysis is conducted for seven data sets being of different size and having different initial percentages of missing values. The major results are that (i) we confirm the intuition that the more missing values, the poorer the prediction accuracy of AQUA; (ii) there is a quadratic dependency between the prediction accuracy and the percentage of missing values; and (Hi) the upper limit of missing values for the applicability of AQUA is determined as 40%. These results are obtained in the context of AQUA. Further analysis is necessary for other ways of applying EBA, such as using different similarity measures or analogy adaptation methods from those used in AQUA. For that purpose, the experimental design in this study can be adapted.\",\"PeriodicalId\":124420,\"journal\":{\"name\":\"First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007)\",\"volume\":\"118 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-09-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"103\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ESEM.2007.10\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ESEM.2007.10","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 103

摘要

通过类比进行的工作量估计(EBA)经常面临缺失值的问题。我们以前基于类比的方法AUQA能够容忍数据集中的缺失值,但不清楚缺失值的百分比如何影响预测精度,以及为了保证AQUA的适用性,是否存在该百分比可能变得多大的上限。本文通过影响分析来探讨这些问题。对七个不同大小、初始缺失值百分比不同的数据集进行影响分析。主要结果是:(i)我们证实了AQUA预测精度越差的直觉,即缺失值越多;(ii)预测精度与缺失值百分比之间存在二次依赖关系;(1)确定AQUA适用性的缺失值上限为40%。这些结果是在AQUA环境下获得的。对于其他应用EBA的方法,如使用与AQUA不同的相似性度量或类比适应方法,需要进一步分析。为此,本研究的实验设计可以进行调整。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Impact Analysis of Missing Values on the Prediction Accuracy of Analogy-based Software Effort Estimation Method AQUA
Effort estimation by analogy (EBA) is often confronted with missing values. Our former analogy- based method AUQA is able to tolerate missing values in the data set, but it is unclear how the percentage of missing values impacts the prediction accuracy and if there is an upper bound for how big this percentage might become in order to guarantee the applicability of AQUA. This paper investigates these questions through an impact analysis. The impact analysis is conducted for seven data sets being of different size and having different initial percentages of missing values. The major results are that (i) we confirm the intuition that the more missing values, the poorer the prediction accuracy of AQUA; (ii) there is a quadratic dependency between the prediction accuracy and the percentage of missing values; and (Hi) the upper limit of missing values for the applicability of AQUA is determined as 40%. These results are obtained in the context of AQUA. Further analysis is necessary for other ways of applying EBA, such as using different similarity measures or analogy adaptation methods from those used in AQUA. For that purpose, the experimental design in this study can be adapted.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Comparing Model Generated with Expert Generated IV&V Activity Plans Decision Support with EMPEROR A cost effectiveness indicator for software development Fine-Grained Software Metrics in Practice Automated Information Extraction from Empirical Software Engineering Literature: Is that possible?
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1