Mining Developer Mailing List to Predict Software Defects

Yu Zhang, Beijun Shen, Yuting Chen
2014 21st Asia-Pacific Software Engineering Conference, published 2014-12-01
DOI: 10.1109/APSEC.2014.63 (https://doi.org/10.1109/APSEC.2014.63)
Citations: 10

Abstract

Prior studies have shown that communication among software stakeholders can be used to predict potential software defects. Yet researchers have rarely studied the relationship between software and its developers' mailing lists. In this paper, we investigate how to predict software defects by mining developer mailing lists. First, we extract both structural and unstructured information from the mailing lists as metrics. The structural information is computed by analyzing the social network hidden in the mailing lists, and the unstructured information is obtained through topical and textual analysis of the lists. Second, we design a mailing-list-based approach to predicting software defects. We also analyze the software repositories of several open source projects by linking their bug tracking databases to the mailing list archives. The experimental results provide empirical evidence that the mailing list metrics are related to software quality and can be used as predictors of defect-proneness. Furthermore, we found that (1) messages having certain structures may indicate some defect-related files, and (2) the sentiment and some topic-specific mailing models correlate strongly with software defects.
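To make the idea of structural metrics concrete, the following is a minimal sketch of how a reply network could be derived from a mailing list and summarized per developer. The abstract does not specify which network metrics the authors use, so the choice of in-degree and out-degree here is an assumption, as are the toy `MESSAGES` data and the helper names `build_reply_graph` and `degree_metrics`.

```python
from collections import defaultdict

# Hypothetical toy data: (message_id, sender, in_reply_to).
# A real archive would supply these from the Message-ID, From,
# and In-Reply-To headers of each mail.
MESSAGES = [
    ("m1", "alice", None),
    ("m2", "bob", "m1"),
    ("m3", "carol", "m1"),
    ("m4", "alice", "m2"),
    ("m5", "bob", "m4"),
]


def build_reply_graph(messages):
    """Return {replier: {original_author: reply_count}}."""
    author_of = {mid: sender for mid, sender, _ in messages}
    graph = defaultdict(lambda: defaultdict(int))
    for _, sender, parent in messages:
        parent_author = author_of.get(parent)
        # Skip thread starters, unknown parents, and self-replies.
        if parent_author is not None and parent_author != sender:
            graph[sender][parent_author] += 1
    return graph


def degree_metrics(graph):
    """Return {developer: (out_degree, in_degree)} over distinct neighbours."""
    out_deg = {dev: len(targets) for dev, targets in graph.items()}
    in_deg = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_deg[target] += 1
    developers = set(out_deg) | set(in_deg)
    return {d: (out_deg.get(d, 0), in_deg.get(d, 0)) for d in sorted(developers)}


graph = build_reply_graph(MESSAGES)
metrics = degree_metrics(graph)
for developer, (out_degree, in_degree) in metrics.items():
    print(developer, out_degree, in_degree)
```

Per-developer values like these could then be aggregated over the files each developer discusses or changes and fed to a defect predictor alongside textual features; that aggregation step is not shown because the abstract does not describe it.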