Statistical Errors in Software Engineering Experiments: A Preliminary Literature Review

Rolando P. Reyes Ch., Ó. Dieste, E. Fonseca C., N. Juristo
{"title":"Statistical Errors in Software Engineering Experiments: A Preliminary Literature Review","authors":"Rolando P. Reyes Ch., Ó. Dieste, E. Fonseca C., N. Juristo","doi":"10.1145/3180155.3180161","DOIUrl":null,"url":null,"abstract":"Background: Statistical concepts and techniques are often applied incorrectly, even in mature disciplines such as medicine or psychology. Surprisingly, there are very few works that study statistical problems in software engineering (SE). Aim: Assess the existence of statistical errors in SE experiments. Method: Compile the most common statistical errors in experimental disciplines. Survey experiments published in ICSE to assess whether errors occur in high quality SE publications. Results: The same errors as identified in others disciplines were found in ICSE experiments, where 30 of the reviewed papers included several error types such as: a) missing statistical hypotheses, b) missing sample size calculation, c) failure to assess statistical test assumptions, and d) uncorrected multiple testing. This rather large error rate is greater for research papers where experiments are confined to the validation section. The origin of the errors can be traced back to: a) researchers not having sufficient statistical training, and b) a profusion of exploratory research. Conclusions: This paper provides preliminary evidence that SE research suffers from the same statistical problems as other experimental disciplines. However, the SE community appears to be unaware of any shortcomings in its experiments, whereas other disciplines work hard to avoid these threats. Further research is necessary to find the underlying causes and set up corrective measures, but there are some potentially effective actions and are a priori easy to implement: a) improve the statistical training of SE researchers, and b) enforce quality assessment and reporting guidelines in SE publications.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"13 1","pages":"1195-1206"},"PeriodicalIF":0.0000,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3180155.3180161","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16

Abstract

Background: Statistical concepts and techniques are often applied incorrectly, even in mature disciplines such as medicine or psychology. Surprisingly, there are very few works that study statistical problems in software engineering (SE). Aim: Assess the existence of statistical errors in SE experiments. Method: Compile the most common statistical errors in experimental disciplines. Survey experiments published in ICSE to assess whether errors occur in high quality SE publications. Results: The same errors as identified in others disciplines were found in ICSE experiments, where 30 of the reviewed papers included several error types such as: a) missing statistical hypotheses, b) missing sample size calculation, c) failure to assess statistical test assumptions, and d) uncorrected multiple testing. This rather large error rate is greater for research papers where experiments are confined to the validation section. The origin of the errors can be traced back to: a) researchers not having sufficient statistical training, and b) a profusion of exploratory research. Conclusions: This paper provides preliminary evidence that SE research suffers from the same statistical problems as other experimental disciplines. However, the SE community appears to be unaware of any shortcomings in its experiments, whereas other disciplines work hard to avoid these threats. Further research is necessary to find the underlying causes and set up corrective measures, but there are some potentially effective actions and are a priori easy to implement: a) improve the statistical training of SE researchers, and b) enforce quality assessment and reporting guidelines in SE publications.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
软件工程实验中的统计误差:初步文献综述
背景:即使在医学或心理学等成熟学科中,统计概念和技术也经常被错误地应用。令人惊讶的是,很少有研究软件工程(SE)中的统计问题的著作。目的:评价SE实验中统计误差的存在性。方法:编制实验学科中最常见的统计误差。调查在ICSE上发表的实验,以评估高质量的SE出版物中是否发生错误。结果:在ICSE实验中发现了与其他学科相同的错误,其中30篇被审查的论文包括几种错误类型,如:a)遗漏统计假设,b)遗漏样本量计算,c)未能评估统计检验假设,d)未纠正的多重检验。对于实验仅限于验证部分的研究论文,这种相当大的错误率更大。错误的根源可以追溯到:a)研究人员没有足够的统计训练,b)大量的探索性研究。结论:本文提供了初步证据,表明SE研究存在与其他实验学科相同的统计问题。然而,SE社区似乎没有意识到其实验中的任何缺陷,而其他学科则努力避免这些威胁。进一步的研究是必要的,以找到潜在的原因和建立纠正措施,但有一些潜在的有效的行动,是先天容易实施的:a)提高SE研究人员的统计培训,b)在SE出版物中执行质量评估和报告指南。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Launch-Mode-Aware Context-Sensitive Activity Transition Analysis A Combinatorial Approach for Exposing Off-Nominal Behaviors Perses: Syntax-Guided Program Reduction Fine-Grained Test Minimization From UI Design Image to GUI Skeleton: A Neural Machine Translator to Bootstrap Mobile GUI Implementation
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1