Python Coding Style Compliance on Stack Overflow

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) Pub Date : 2019-05-26 DOI:10.1109/MSR.2019.00042

Nikolaos Bafatakis, Niels Boecker, Wenjie Boon, Martin Cabello Salazar, J. Krinke, Gazi Oznacar, Robert White

Pages: 210-214

Citations: 15

Abstract

Software developers all over the world use Stack Overflow (SO) to interact and exchange code snippets. Research also uses SO to harvest code snippets for use with recommendation systems. However, previous work has shown that code on SO may have quality issues, such as security or license problems. We analyse Python code on SO to determine its coding style compliance. From 1,962,535 code snippets tagged with 'python', we extracted 407,097 snippets of at least 6 statements of Python code. Surprisingly, 93.87% of the extracted snippets contain style violations, with an average of 0.7 violations per statement and a huge number of snippets with a considerably higher ratio. Researchers and developers should, therefore, be aware that code snippets on SO may not be representative of good coding style. Furthermore, while user reputation seems to be unrelated to coding style compliance, for posts with vote scores in the range between -10 and 20, we found a strong correlation (r = -0.87, p < 10^-7) between the vote score a post received and the average number of violations per statement for snippets in such posts.
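As a rough illustration of the measurements the abstract mentions, the sketch below keeps only snippets that parse and contain at least 6 statements, computes a violations-per-statement ratio, and calculates a Pearson correlation coefficient. It is not the authors' pipeline: the function names are hypothetical, the way statements are counted (every `ast.stmt` node, including nested ones) is an assumption, and `count_violations` is a deliberately tiny stand-in that flags only over-long lines and trailing whitespace, not a full style checker.

```python
import ast
import math

MIN_STATEMENTS = 6  # threshold quoted in the abstract


def count_statements(source):
    """Return the number of statement nodes, or None if the snippet does not parse.

    Assumption: every ast.stmt node counts, including nested statements.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return None
    return sum(isinstance(node, ast.stmt) for node in ast.walk(tree))


def count_violations(source, max_len=79):
    """Stand-in checker (assumption): flags long lines and trailing whitespace only."""
    violations = 0
    for line in source.splitlines():
        if len(line) > max_len:
            violations += 1
        if line != line.rstrip():
            violations += 1
    return violations


def violations_per_statement(source):
    """Return the violation ratio, or None if the snippet is invalid or too short."""
    n = count_statements(source)
    if n is None or n < MIN_STATEMENTS:
        return None
    return count_violations(source) / n


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

In a study of this shape, one would presumably group snippets by the vote score of their post, average the ratio within each group, and pass the scores and averages to `pearson_r`; a negative coefficient would mean higher-voted posts tend to have fewer violations per statement.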
