Towards Exploring the Code Reuse from Stack Overflow during Software Development

Yuan Huang, Furen Xu, Hao-Jie Zhou, Xiangping Chen, Xiaocong Zhou, Tongjie Wang
{"title":"Towards Exploring the Code Reuse from Stack Overflow during Software Development","authors":"Yuan Huang, Furen Xu, Hao-Jie Zhou, Xiangping Chen, Xiaocong Zhou, Tongjie Wang","doi":"10.1145/3524610.3527923","DOIUrl":null,"url":null,"abstract":"As one of the most well-known programmer Q&A websites, Stack Overflow (i.e., SO) is serving tens of thousands of developers ev-ery day. Previous work has shown that many developers reuse the code snippets on SO when they find an answer (from SO) that functionally matches the programming problem they encounter in their development activities. To study how programmers reuse code on SO during project development, we conduct a comprehensive empirical study. First, to capture the development activities of pro-grammers, we collect 342,148 modified code snippets in commits from 793 open-source Java projects, and these modified code can reflect the programming problems encountered during development. We also collect the code snippets from 1,355,617 posts on SO. Then, we employ CCFinder to detect the code clone between the modified code from commits and the code from SO, and further analyze the code reuse when programmer solves a programming problem during development. We count the code reuse ratios of the modified code snippets in the commits of each project in different years, the results show that the average code reuse ratio is 6.32%, and the maximum is 8.38%. The code reuse ratio in project commits has increased year by year, and the proportion of code reuse in the newly established project is higher than that of old projects. We also find that some projects reuse the code snippets from many years ago. Additionally, we find that experienced developers seem to be more likely to reuse the knowledge on SO. Moreover, we find that the code reuse ratio in bug-related commits (6.67%) is slightly higher than that of in non-bug-related commits (6.59%). Furthermore, we also find that the code reuse ratio (14.44%) in Java class files that have undergone multiple modifications is more than double the overall code reuse ratio (6.32%).","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3524610.3527923","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

As one of the most well-known programmer Q&A websites, Stack Overflow (i.e., SO) is serving tens of thousands of developers ev-ery day. Previous work has shown that many developers reuse the code snippets on SO when they find an answer (from SO) that functionally matches the programming problem they encounter in their development activities. To study how programmers reuse code on SO during project development, we conduct a comprehensive empirical study. First, to capture the development activities of pro-grammers, we collect 342,148 modified code snippets in commits from 793 open-source Java projects, and these modified code can reflect the programming problems encountered during development. We also collect the code snippets from 1,355,617 posts on SO. Then, we employ CCFinder to detect the code clone between the modified code from commits and the code from SO, and further analyze the code reuse when programmer solves a programming problem during development. We count the code reuse ratios of the modified code snippets in the commits of each project in different years, the results show that the average code reuse ratio is 6.32%, and the maximum is 8.38%. The code reuse ratio in project commits has increased year by year, and the proportion of code reuse in the newly established project is higher than that of old projects. We also find that some projects reuse the code snippets from many years ago. Additionally, we find that experienced developers seem to be more likely to reuse the knowledge on SO. Moreover, we find that the code reuse ratio in bug-related commits (6.67%) is slightly higher than that of in non-bug-related commits (6.59%). Furthermore, we also find that the code reuse ratio (14.44%) in Java class files that have undergone multiple modifications is more than double the overall code reuse ratio (6.32%).
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
从软件开发中的堆栈溢出探讨代码重用
作为最知名的程序员问答网站之一,Stack Overflow(即SO)每天为成千上万的开发人员提供服务。以前的工作表明,许多开发人员在找到与他们在开发活动中遇到的编程问题在功能上匹配的答案(来自SO)时重用SO上的代码片段。为了研究程序员在项目开发过程中如何在SO上重用代码,我们进行了全面的实证研究。首先,为了捕获程序员的开发活动,我们从793个开源Java项目的提交中收集了342148个修改过的代码片段,这些修改过的代码可以反映开发过程中遇到的编程问题。我们还从SO上的1,355,617篇文章中收集了代码片段。然后,我们使用CCFinder检测提交修改后的代码与SO修改后的代码之间的代码克隆,并进一步分析程序员在开发过程中解决编程问题时的代码重用情况。我们统计了各个项目在不同年份提交的修改代码片段的代码重用率,结果表明,平均代码重用率为6.32%,最大值为8.38%。项目提交中的代码重用率逐年提高,新建项目的代码重用率高于老项目。我们还发现一些项目重用了许多年前的代码片段。此外,我们发现经验丰富的开发人员似乎更有可能重用SO方面的知识。此外,我们发现bug相关提交的代码重用率(6.67%)略高于非bug相关提交的代码重用率(6.59%)。此外,我们还发现,经过多次修改的Java类文件中的代码重用率(14.44%)是总体代码重用率(6.32%)的两倍多。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Context-based Cluster Fault Localization Fine-Grained Code-Comment Semantic Interaction Analysis Find Bugs in Static Bug Finders Self-Supervised Learning of Smart Contract Representations An Exploratory Study of Analyzing JavaScript Online Code Clones
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1