Cold-Start Software Analytics

Jin Guo, Mona Rahimi, J. Cleland-Huang, A. Rasin, J. Hayes, Michael Vierhauser
Published in: 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR), pp. 142-153
Publication date: 2016-05-14
DOI: 10.1145/2901739.2901740 (https://doi.org/10.1145/2901739.2901740)
Citations: 25

Abstract

Software project artifacts such as source code, requirements, and change logs represent a gold mine of actionable information. As a result, software analytic solutions have been developed to mine repositories and answer questions such as "who is the expert?", "which classes are fault prone?", or even "who are the domain experts for these fault-prone classes?" Analytics often require training and configuration in order to maximize performance within the context of each project. A cold-start problem exists when an analytic function is applied within a project context without first being configured on project-specific data. This scenario arises because of the non-trivial effort needed to instrument a project environment with candidate tools and algorithms and to empirically evaluate alternative configurations. We address the cold-start problem by comparatively evaluating 'best-of-breed' and 'profile-driven' solutions, both of which reuse known configurations in new project contexts. We describe and evaluate our approach against 20 project datasets for the three analytic areas of artifact connectivity, fault prediction, and finding the expert, and show that the best-of-breed approach outperformed the profile-driven approach in all three areas; however, while it delivered acceptable results for artifact connectivity and finding the expert, both techniques underperformed for cold-start fault prediction.
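
The abstract does not spell out how either strategy selects a configuration, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes best-of-breed means reusing the configuration with the highest average score across all previously studied projects, and profile-driven means reusing the configuration that scored best on the known projects most similar to the new one. All names (HISTORY, PROFILES, best_of_breed, profile_driven), the single-number project profile, and the scores are hypothetical and chosen for illustration.

```python
from statistics import mean

# Hypothetical history: configuration -> {known project -> observed score}.
HISTORY = {
    "config_a": {"p1": 0.60, "p2": 0.70, "p3": 0.50},
    "config_b": {"p1": 0.80, "p2": 0.40, "p3": 0.45},
}

# Hypothetical one-number profile per known project (e.g. a scaled size metric).
PROFILES = {"p1": 0.9, "p2": 0.2, "p3": 0.3}


def best_of_breed(history):
    """Return the configuration with the highest mean score over all known projects."""
    return max(history, key=lambda cfg: mean(history[cfg].values()))


def profile_driven(history, profiles, new_profile, k=1):
    """Return the configuration that scored best on the k known projects
    whose profile value is closest to the new project's profile."""
    nearest = sorted(profiles, key=lambda p: abs(profiles[p] - new_profile))[:k]
    return max(history, key=lambda cfg: mean(history[cfg][p] for p in nearest))


if __name__ == "__main__":
    print(best_of_breed(HISTORY))
    print(profile_driven(HISTORY, PROFILES, new_profile=0.85))
```

In this toy data the two strategies can disagree: a configuration that is best on average need not be the best for the projects that most resemble the new one. Neither function uses any data from the new project beyond its profile, which is the defining constraint of the cold-start setting.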