Are Comments on Stack Overflow Well Organized for Easy Retrieval by Developers?

Haoxiang Zhang, Shaowei Wang, T. Chen, Ahmed E. Hassan
{"title":"Are Comments on Stack Overflow Well Organized for Easy Retrieval by Developers?","authors":"Haoxiang Zhang, Shaowei Wang, T. Chen, Ahmed E. Hassan","doi":"10.1145/3434279","DOIUrl":null,"url":null,"abstract":"Many Stack Overflow answers have associated informative comments that can strengthen them and assist developers. A prior study found that comments can provide additional information to point out issues in their associated answer, such as the obsolescence of an answer. By showing more informative comments (e.g., the ones with higher scores) and hiding less informative ones, developers can more effectively retrieve information from the comments that are associated with an answer. Currently, Stack Overflow prioritizes the display of comments, and, as a result, 4.4 million comments (possibly including informative comments) are hidden by default from developers. In this study, we investigate whether this mechanism effectively organizes informative comments. We find that (1) the current comment organization mechanism does not work well due to the large amount of tie-scored comments (e.g., 87% of the comments have 0-score) and (2) in 97.3% of answers with hidden comments, at least one comment that is possibly informative is hidden while another comment with the same score is shown (i.e., unfairly hidden comments). The longest unfairly hidden comment is more likely to be informative than the shortest one. Our findings highlight that Stack Overflow should consider adjusting the comment organization mechanism to help developers effectively retrieve informative comments. Furthermore, we build a classifier that can effectively distinguish informative comments from uninformative comments. We also evaluate two alternative comment organization mechanisms (i.e., the Length mechanism and the Random mechanism) based on text similarity and the prediction of our classifier.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"11 1","pages":"1 - 31"},"PeriodicalIF":0.0000,"publicationDate":"2021-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Software Engineering and Methodology (TOSEM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3434279","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

Abstract

Many Stack Overflow answers have associated informative comments that can strengthen them and assist developers. A prior study found that comments can provide additional information to point out issues in their associated answer, such as the obsolescence of an answer. By showing more informative comments (e.g., the ones with higher scores) and hiding less informative ones, developers can more effectively retrieve information from the comments that are associated with an answer. Currently, Stack Overflow prioritizes the display of comments, and, as a result, 4.4 million comments (possibly including informative comments) are hidden by default from developers. In this study, we investigate whether this mechanism effectively organizes informative comments. We find that (1) the current comment organization mechanism does not work well due to the large amount of tie-scored comments (e.g., 87% of the comments have 0-score) and (2) in 97.3% of answers with hidden comments, at least one comment that is possibly informative is hidden while another comment with the same score is shown (i.e., unfairly hidden comments). The longest unfairly hidden comment is more likely to be informative than the shortest one. Our findings highlight that Stack Overflow should consider adjusting the comment organization mechanism to help developers effectively retrieve informative comments. Furthermore, we build a classifier that can effectively distinguish informative comments from uninformative comments. We also evaluate two alternative comment organization mechanisms (i.e., the Length mechanism and the Random mechanism) based on text similarity and the prediction of our classifier.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
关于堆栈溢出的评论是否组织得很好,便于开发人员检索?
许多Stack Overflow答案都有相关的信息注释,这些注释可以加强它们并帮助开发人员。先前的一项研究发现,评论可以提供额外的信息,以指出与其相关的答案中的问题,例如答案的过时。通过显示更多信息的评论(例如,得分较高的评论)和隐藏较少信息的评论,开发人员可以更有效地从与答案相关的评论中检索信息。目前,Stack Overflow优先显示评论,因此,440万条评论(可能包括信息评论)默认情况下对开发人员隐藏。在本研究中,我们探讨了这一机制是否有效地组织了信息性评论。我们发现:(1)目前的评论组织机制没有很好地发挥作用,因为有大量的平均分评论(例如87%的评论为0分);(2)在97.3%的隐藏评论的回答中,至少有一条可能是有用的评论被隐藏,而另一条相同分数的评论被显示出来(即不公平隐藏评论)。最长的不公平隐藏评论比最短的更有可能提供信息。我们的研究结果强调Stack Overflow应该考虑调整评论组织机制,以帮助开发人员有效地检索信息评论。此外,我们建立了一个分类器,可以有效地区分信息评论和非信息评论。我们还基于文本相似度和分类器的预测评估了两种可选的评论组织机制(即长度机制和随机机制)。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Turnover of Companies in OpenStack: Prevalence and Rationale Super-optimization of Smart Contracts Verification of Programs Sensitive to Heap Layout Assessing and Improving an Evaluation Dataset for Detecting Semantic Code Clones via Deep Learning Guaranteeing Timed Opacity using Parametric Timed Model Checking
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1