On the Relationship Between Story Points and Development Effort in Agile Open-Source Software

Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement
Published: 2022-09-19
DOI: 10.1145/3544902.3546238

Vali Tawosi, Federica Sarro



Abstract

Background: Previous work has provided initial evidence that Story Points (SP) estimated by human experts may not accurately reflect the effort needed to realise Agile software projects. Aims: We aim to shed further light on the relationship between SP and Agile software development effort, to understand the extent to which human-estimated SP is a good indicator of user story development effort, expressed as the time needed to realise the story. Method: We carry out an empirical study of 37,440 unique user stories from 37 open-source projects publicly available in the TAWOS dataset. For these user stories, we investigate the correlation between issue development time (or its approximation when the actual time is not available) and the SP estimated by human experts, using three widely used correlation statistics (Pearson, Kendall and Spearman). We also investigate whether the human experts' SP estimates are consistent throughout a project, i.e., whether the development time of issues is proportionate to the SP assigned to them. Results: Averaged across the three correlation measures, the correlation between human-estimated SP and approximated development time is strong for only 7% of the projects investigated, medium for 58%, and low for 35%. Similar results are obtained when the actual development time is considered. The study also reveals that estimation is often not consistent throughout a project and that the human estimator tends to misestimate in 78% of the cases. Conclusions: Our empirical results suggest that SP might not be an accurate indicator of open-source Agile software development effort expressed in terms of development time. The impact of its use as an indicator of effort should be explored in future work, for example as a cost driver in automated effort estimation models or as the prediction target.
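The correlation analysis named in the Method can be illustrated with a minimal sketch. It assumes each issue is reduced to a pair (story points, development time in hours) and computes the three statistics the abstract lists, then averages them. The function names, the sample values, and the choice of the tau-b variant of Kendall's coefficient are assumptions made for illustration only; they are not taken from the paper or from the TAWOS dataset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects for ties (common in SP values)."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both factors
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + only_x_tied) * (untied + only_y_tied))
    return (concordant - discordant) / denom


if __name__ == "__main__":
    # Made-up values for one hypothetical project (not real data).
    story_points = [1, 2, 2, 3, 5, 8, 3, 5]
    dev_hours = [3.0, 2.5, 9.0, 4.0, 20.0, 11.0, 6.5, 7.0]

    scores = {
        "pearson": pearson(story_points, dev_hours),
        "spearman": spearman(story_points, dev_hours),
        "kendall": kendall_tau_b(story_points, dev_hours),
    }
    for name, value in scores.items():
        print(f"{name}: {value:.3f}")
    print(f"average: {sum(scores.values()) / len(scores):.3f}")
```

Pearson measures linear association, while Spearman and Kendall depend only on the ordering of the values, so the three can disagree when time grows monotonically but not linearly with SP. The cut-offs the authors use to label a correlation strong, medium, or low are not stated in the abstract, so the sketch does not classify its output.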
