Oleksandra Klymenko, O. Kosenkov, Stephen Meisenbacher, Parisa Elahidoost, D. Méndez, F. Matthes
Background: Modern privacy regulations, such as the General Data Protection Regulation (GDPR), address privacy in software systems in a technologically agnostic way by mentioning general “technical measures” for data privacy compliance rather than dictating how these should be implemented. An understanding of the concept of technical measures and of how exactly they can be handled in practice, however, is not trivial due to their interdisciplinary nature and the necessary technical-legal interactions. Aims: We aim to investigate how the concept of technical measures for data privacy compliance is understood in practice, as well as the technical-legal interaction intrinsic to the process of implementing those measures. Methods: We follow a research design that is 1) exploratory in nature, 2) qualitative, and 3) interview-based, with 16 selected privacy professionals from the technical and legal domains. Results: Our results suggest that there is no clear mutual understanding of, and no commonly accepted approach to, handling technical measures. Both technical and legal roles are involved in implementing such measures. While they still often operate in separate spheres, the predominant opinion amongst the interviewees is to promote more interdisciplinary collaboration. Conclusions: Our empirical findings confirm the need for better interaction between legal and engineering teams when implementing technical measures for data privacy. We posit that interdisciplinary collaboration is paramount to a more complete understanding of technical measures, a concept that currently lacks a mutually accepted notion. Yet, as strongly suggested by our results, there is still a lack of systematic approaches to such interaction. Therefore, the results strengthen our confidence in the need for further investigations into the technical-legal dynamics of data privacy compliance.
{"title":"Understanding the Implementation of Technical Measures in the Process of Data Privacy Compliance: A Qualitative Study","authors":"Oleksandra Klymenko, O. Kosenkov, Stephen Meisenbacher, Parisa Elahidoost, D. Méndez, F. Matthes","doi":"10.1145/3544902.3546234","DOIUrl":"https://doi.org/10.1145/3544902.3546234","url":null,"abstract":"Background: Modern privacy regulations, such as the General Data Protection Regulation (GDPR), address privacy in software systems in a technologically agnostic way by mentioning general ”technical measures” for data privacy compliance rather than dictating how these should be implemented. An understanding of the concept of technical measures and how exactly these can be handled in practice, however, is not trivial due to its interdisciplinary nature and the necessary technical-legal interactions. Aims: We aim to investigate how the concept of technical measures for data privacy compliance is understood in practice as well as the technical-legal interaction intrinsic to the process of implementing those technical measures. Methods: We follow a research design that is 1) exploratory in nature, 2) qualitative, and 3) interview-based, with 16 selected privacy professionals in the technical and legal domains. Results: Our results suggest that there is no clear mutual understanding and commonly accepted approach to handling technical measures. Both technical and legal roles are involved in the implementation of such measures. While they still often operate in separate spheres, a predominant opinion amongst the interviewees is to promote more interdisciplinary collaboration. Conclusions: Our empirical findings confirm the need for better interaction between legal and engineering teams when implementing technical measures for data privacy. We posit that interdisciplinary collaboration is paramount to a more complete understanding of technical measures, which currently lacks a mutually accepted notion. Yet, as strongly suggested by our results, there is still a lack of systematic approaches to such interaction. Therefore, the results strengthen our confidence in the need for further investigations into the technical-legal dynamic of data privacy compliance.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130343440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Otávio Cury da Costa Castro, G. Avelino, P. Neto, Ricardo Britto, M. T. Valente
Background: In software development, the identification of source code file experts is an important task. Identifying these experts helps to improve software maintenance and evolution activities, such as developing new features, code reviews, and bug fixes. Although some studies have proposed repository-mining techniques to automatically identify source code experts, there are still gaps in this area that can be explored, for example, investigating new variables related to source code knowledge and applying machine learning with the aim of improving the performance of techniques that identify source code experts. Aim: The goal of this study is to investigate opportunities to improve the performance of existing techniques to recommend source code file experts. Method: We built an oracle by collecting data from the development history and surveying developers of 113 software projects. Then, we used this oracle to: (i) analyze the correlation between measures extracted from the development history and the developers’ source code knowledge and (ii) investigate the use of machine learning classifiers by evaluating their performance in identifying source code file experts. Results: First Authorship and Recency of Modification are the variables with the highest positive and negative correlations with source code knowledge, respectively. Machine learning classifiers outperformed the linear techniques (F-measure = 71% to 73%) in the public dataset, but this advantage is not clear in the private dataset, with F-measure ranging from 55% to 68% for the linear techniques and 58% to 67% for the ML techniques. Conclusion: Overall, the linear techniques and the machine learning classifiers achieved similar performance, particularly in terms of F-measure. However, the machine learning classifiers usually achieved higher precision, while the linear techniques obtained higher recall. Therefore, the choice of the best technique depends on the user’s tolerance for false positives and false negatives.
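To make the two strongest predictors above concrete, here is a minimal sketch, assuming pydriller and scikit-learn, of how First Authorship and Recency of Modification could be derived from a repository's history and fed to a classifier. The feature definitions, placeholder labels, and choice of a random forest are illustrative assumptions, not the authors' implementation (their oracle comes from a developer survey).

```python
from collections import defaultdict
from datetime import datetime, timezone

from pydriller import Repository                # pip install pydriller (2.x API)
from sklearn.ensemble import RandomForestClassifier

first_author = {}                               # file path -> first committer
last_modified = defaultdict(dict)               # file path -> {developer: last commit date}

# Walk the history once, tracking who created each file and when each
# developer last touched it.
for commit in Repository("path/to/repo").traverse_commits():
    for mf in commit.modified_files:
        path = mf.new_path or mf.old_path
        first_author.setdefault(path, commit.author.name)
        last_modified[path][commit.author.name] = commit.committer_date

def features(path, dev):
    """Feature row for one (file, developer) pair: the two measures above."""
    is_first_author = float(first_author.get(path) == dev)
    last = last_modified[path].get(dev)
    days_since_touch = (datetime.now(timezone.utc) - last).days if last else 10_000
    return [is_first_author, days_since_touch]

pairs = [(p, d) for p in last_modified for d in last_modified[p]]
X = [features(p, d) for p, d in pairs]
y = [0] * len(X)                                # hypothetical expert labels from an oracle
clf = RandomForestClassifier().fit(X, y)
```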
{"title":"Identifying Source Code File Experts","authors":"Otávio Cury da Costa Castro, G. Avelino, P. Neto, Ricardo Britto, M. T. Valente","doi":"10.1145/3544902.3546243","DOIUrl":"https://doi.org/10.1145/3544902.3546243","url":null,"abstract":"Background: In software development, the identification of source code file experts is an important task. Identifying these experts helps to improve software maintenance and evolution activities, such as developing new features, code reviews, and bug fixes. Although some studies have proposed repository-mining techniques to automatically identify source code experts, there are still gaps in this area that can be explored. For example, investigating new variables related to source code knowledge and applying machine learning aiming to improve the performance of techniques to identify source code experts. Aim: The goal of this study is to investigate opportunities to improve the performance of existing techniques to recommend source code files experts. Method: We built an oracle by collecting data from the development history and surveying developers of 113 software projects. Then, we use this oracle to: (i) analyze the correlation between measures extracted from the development history and the developers’ source code knowledge and (ii) investigate the use of machine learning classifiers by evaluating their performance in identifying source code files experts. Results:First Authorship and Recency of Modification are the variables with the highest positive and negative correlations with source code knowledge, respectively. Machine learning classifiers outperformed the linear techniques (F-Measure = 71% to 73%) in the public dataset, but this advantage is not clear in the private dataset, with F-Measure ranging from 55% to 68% for the linear techniques and 58% to 67% for ML techniques. Conclusion: Overall, the linear techniques and the machine learning classifiers achieved similar performance, particularly if we analyze F-Measure. However, machine learning classifiers usually get higher precision while linear techniques obtained the highest recall values. Therefore, the choice of the best technique depends on the user’s tolerance to false positives and false negatives.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"203 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122434276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: Much research has been conducted to investigate the impact of Continuous Integration (CI) on the productivity and quality of open-source projects. Most studies have analyzed the impact of adopting a CI server service (e.g., Travis-CI) but did not analyze CI sub-practices. Aims: We aim to evaluate the impact of five CI sub-practices with respect to the productivity and quality of GitHub open-source projects. Method: We collect CI sub-practices of 90 relevant open-source projects over a period of 2 years. We use regression models to analyze whether projects upholding the CI sub-practices are more productive and/or generate fewer bugs. We also perform a qualitative document analysis to understand whether CI best practices are related to higher project quality. Results: Our findings reveal a correlation between the Build Activity and Commit Activity sub-practices and the number of merged pull requests. We also observe a correlation between the Build Activity, Build Health, and Time to Fix Broken Builds sub-practices and the number of bug-related issues. The qualitative analysis reveals that projects with the best values for the CI sub-practices face fewer CI-related problems compared to projects that exhibit the worst values. Conclusions: We recommend that projects strive to uphold these CI sub-practices, as they can impact the productivity and quality of projects.
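As a rough sketch of the regression setup described above (not the authors' scripts), one could regress the two outcome variables on per-project sub-practice metrics with statsmodels; the column names below are invented for illustration.

```python
import pandas as pd
import statsmodels.formula.api as smf           # pip install pandas statsmodels

# Hypothetical table: one row per project, with sub-practice metrics and
# the two outcomes analyzed in the study.
df = pd.read_csv("ci_subpractices.csv")

# Productivity: number of merged pull requests vs. CI sub-practices.
productivity = smf.ols(
    "merged_pull_requests ~ build_activity + commit_activity"
    " + build_health + time_to_fix_broken_builds",
    data=df,
).fit()
print(productivity.summary())

# Quality: number of bug-related issues vs. CI sub-practices.
quality = smf.ols(
    "bug_related_issues ~ build_activity + build_health"
    " + time_to_fix_broken_builds",
    data=df,
).fit()
print(quality.summary())
```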
{"title":"Investigating the Impact of Continuous Integration Practices on the Productivity and Quality of Open-Source Projects","authors":"Jadson Santos, D. A. D. Costa, U. Kulesza","doi":"10.1145/3544902.3546244","DOIUrl":"https://doi.org/10.1145/3544902.3546244","url":null,"abstract":"Background: Much research has been conducted to investigate the impact of Continuous Integration (CI) on the productivity and quality of open-source projects. Most of studies have analyzed the impact of adopting a CI server service (e.g, Travis-CI) but did not analyze CI sub-practices. Aims: We aim to evaluate the impact of five CI sub-practices with respect to the productivity and quality of GitHub open-source projects. Method: We collect CI sub-practices of 90 relevant open-source projects for a period of 2 years. We use regression models to analyze whether projects upholding the CI sub-practices are more productive and/or generate fewer bugs. We also perform a qualitative document analysis to understand whether CI best practices are related to a higher quality of projects. Results: Our findings reveal a correlation between the Build Activity and Commit Activity sub-practices and the number of merged pull requests. We also observe a correlation between the Build Activity, Build Health and Time to Fix Broken Builds sub-practices and number of bug-related issues. The qualitative analysis reveals that projects with the best values for CI sub-practices face fewer CI-related problems compared to projects that exhibit the worst values for CI sub-practices. Conclusions: We recommend that projects should strive to uphold the several CI sub-practices as they can impact in the productivity and quality of projects.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115887429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: Code reviewing is an essential part of software development to ensure software quality. However, the abundance of review tasks and the intensity of the workload for reviewers negatively impact the quality of the reviews. Short review texts are often unactionable. Aims: We propose the Example Driven Review Explanation (EDRE) method to facilitate the code review process by adding explanations through examples. EDRE recommends similar code reviews as examples to further explain a review and help developers understand the received reviews with less communication overhead. Method: Through an empirical study in an industrial setting and by analyzing 3,722 code reviews across three open-source projects, we compared five methods of data retrieval, text classification, and text recommendation. Results: EDRE using TF-IDF word embeddings along with an SVM classifier can provide practical examples for each code review with an F-score of 92% and an accuracy of 90%. Conclusions: Example-based explanation is an established method for assisting experts in explaining decisions. EDRE can accurately provide a set of context-specific examples to facilitate the code review process in software teams.
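The abstract names the winning combination (TF-IDF features with an SVM classifier, plus retrieval of similar past reviews as examples); below is a minimal sketch of that combination, assuming scikit-learn. The toy corpus, the category labels, and the cosine-similarity retrieval step are illustrative assumptions rather than the authors' code; LinearSVC stands in for the SVM mentioned above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.svm import LinearSVC

# Historical review comments with hypothetical category labels.
past_reviews = [
    "Extract this logic into a helper method.",
    "Please add a unit test covering the null case.",
]
labels = ["refactoring", "testing"]

vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X = vectorizer.fit_transform(past_reviews)
classifier = LinearSVC().fit(X, labels)

def explain(new_review: str, k: int = 3):
    """Classify a new review and retrieve the k most similar past reviews
    to serve as concrete examples for the developer."""
    q = vectorizer.transform([new_review])
    category = classifier.predict(q)[0]
    similarity = cosine_similarity(q, X).ravel()
    examples = [past_reviews[i] for i in similarity.argsort()[::-1][:k]]
    return category, examples

print(explain("Move this into its own function."))
```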
{"title":"Example Driven Code Review Explanation","authors":"Shadikur Rahman, Umme Ayman Koana, Maleknaz Nayebi","doi":"10.1145/3544902.3546639","DOIUrl":"https://doi.org/10.1145/3544902.3546639","url":null,"abstract":"Background: Code reviewing is an essential part of software development to ensure software quality. However, the abundance of review tasks and the intensity of the workload for reviewers negatively impact the quality of the reviews. The short review text is often unactionable. Aims: We propose the Example Driven Review Explanation (EDRE) method to facilitate the code review process by adding additional explanations through examples. EDRE recommends similar code reviews as examples to further explain a review and help a developer to understand the received reviews with less communication overhead. Method: Through an empirical study in an industrial setting and by analyzing 3,722 Code reviews across three open-source projects, we compared five methods of data retrieval, text classification, and text recommendation. Results: EDRE using TF-IDF word embedding along with an SVM classifier can provide practical examples for each code review with 92% F-score and 90% Accuracy. Conclusions: The example-based explanation is an established method for assisting experts in explaining decisions. EDRE can accurately provide a set of context-specific examples to facilitate the code review process in software teams.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127132948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: Software development results in the production of various types of artifacts: source code, version control system metadata, bug reports, mailing list conversations, test data, etc. Empirical software engineering (ESE) has thrived on mining those artifacts to uncover the inner workings of software development and improve its practices. But which artifacts are studied in the field is a moving target, which we study empirically in this paper. Aims: We quantitatively characterize the most frequently mined and co-mined software artifacts in ESE research and the research purposes they support. Method: We conduct a meta-analysis of artifact mining studies published in 11 top conferences in ESE, for a total of 9,621 papers. We use natural language processing (NLP) techniques to characterize the types of software artifacts that are most often mined and their evolution over a 16-year period (2004–2020). We analyze the combinations of artifact types that are most often mined together, as well as the relationship between study purposes and mined artifacts. Results: We find that: (1) mining happens in the vast majority of analyzed papers, (2) source code and test data are the most mined artifacts, (3) there is an increasing interest in mining novel artifacts, together with source code, and (4) researchers are most interested in the evaluation of software systems and use all possible empirical signals to support that goal. Conclusions: Our study presents a meta-analysis of the usage of software artifacts in the field over a period of 16 years using NLP techniques.
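One NLP step such a meta-analysis plausibly involves is tagging which artifact types a paper mentions. The sketch below uses naive keyword matching over abstracts; the keyword lists and taxonomy are assumptions, and the study's actual techniques are more involved.

```python
from collections import Counter
from itertools import combinations

# Assumed keyword lists; the paper's artifact taxonomy is richer.
ARTIFACT_KEYWORDS = {
    "source code": ["source code", "codebase", "source files"],
    "VCS metadata": ["commit", "version control", "git history"],
    "bug reports": ["bug report", "issue tracker"],
    "test data": ["test suite", "test case", "test data"],
    "mailing lists": ["mailing list", "email archive"],
}

def tag_artifacts(abstract: str) -> set:
    """Return the artifact types whose keywords appear in the abstract."""
    text = abstract.lower()
    return {artifact for artifact, keywords in ARTIFACT_KEYWORDS.items()
            if any(kw in text for kw in keywords)}

# Aggregate over a corpus to see which artifacts are mined and co-mined.
corpus = ["We mine commit histories and bug reports to study defects."]  # placeholder
mined, co_mined = Counter(), Counter()
for doc in corpus:
    tags = tag_artifacts(doc)
    mined.update(tags)
    co_mined.update(combinations(sorted(tags), 2))
print(mined.most_common(), co_mined.most_common())
```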
{"title":"Software Artifact Mining in Software Engineering Conferences: A Meta-Analysis","authors":"Zeinab Abou Khalil, Stefano Zacchiroli","doi":"10.1145/3544902.3546239","DOIUrl":"https://doi.org/10.1145/3544902.3546239","url":null,"abstract":"Background: Software development results in the production of various types of artifacts: source code, version control system metadata, bug reports, mailing list conversations, test data, etc. Empirical software engineering (ESE) has thrived mining those artifacts to uncover the inner workings of software development and improve its practices. But which artifacts are studied in the field is a moving target, which we study empirically in this paper. Aims: We quantitatively characterize the most frequently mined and co-mined software artifacts in ESE research and the research purposes they support. Method: We conduct a meta-analysis of artifact mining studies published in 11 top conferences in ESE, for a total of 9621 papers. We use natural language processing (NLP) techniques to characterize the types of software artifacts that are most often mined and their evolution over a 16-year period (2004–2020). We analyze the combinations of artifact types that are most often mined together, as well as the relationship between study purposes and mined artifacts. Results: We find that: (1) mining happens in the vast majority of analyzed papers, (2) source code and test data are the most mined artifacts, (3) there is an increasing interest in mining novel artifacts, together with source code, (4) researchers are most interested in the evaluation of software systems and use all possible empirical signals to support that goal. Conclusions: Our study presents a meta analysis of the usage of software artifacts in the field over a period of 16 years using NLP techniques.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123859463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Larissa Braz, Enrico Fregnan, Vivek Arora, Alberto Bacchelli
Background: Security regressions are vulnerabilities introduced in a previously unaffected software system. They often happen as a result of code changes (e.g., a bug fix) and can have severe effects. Aims: We aim to increase the understanding of security regressions. Method: To this aim, we perform an exploratory, mixed-method case study of Mozilla. First, we analyze 78 regression vulnerabilities and 72 bug reports where a bug fix introduced a regression vulnerability at Mozilla. We investigate how developers interact in these bug reports, how they perform the changes, and under what conditions they introduce these regressions. Second, we conduct five semi-structured interviews with Mozilla developers involved in the vulnerability-inducing fixes. Results: Security is not discussed during bug fixes. Developers’ main concerns are the complexity of the bug at hand and the community pressure to fix it. Developers do not worry about regression vulnerabilities and assume that tools will detect them. Indeed, dynamic analysis tools helped find around 30% of these regressions. Conclusions: Although tool support helps identify regression vulnerabilities, it may not be enough to ensure security during bug fixes. Furthermore, our results call for further work on security tooling support and its integration into bug fixes. Preprint: https://arxiv.org/abs/2207.01942 Data and materials: https://doi.org/10.5281/zenodo.6792317
{"title":"An Exploratory Study on Regression Vulnerabilities","authors":"Larissa Braz, Enrico Fregnan, Vivek Arora, Alberto Bacchelli","doi":"10.1145/3544902.3546250","DOIUrl":"https://doi.org/10.1145/3544902.3546250","url":null,"abstract":"Background: Security regressions are vulnerabilities introduced in a previously unaffected software system. They often happen as a result of code changes (e.g., a bug fix) and can have severe effects. Aims: We aim to increase the understanding of security regressions. Method: To this aim, we perform an exploratory, mixed-method case study of Mozilla. First, we analyze 78 regression vulnerabilities and 72 bug reports where a bug fix introduced a regression vulnerability at Mozilla. We investigate how developers interact in these bug reports, how they perform the changes, and under what conditions they introduce these regressions. Second, we conduct five semi-structured interviews with as many Mozilla developers involved in the vulnerability-inducing fixes. Results: Security is not discussed during bug fixes. Developers’ main concerns are the complexity of the bug at hand and the community pressure to fix it. Developers do not to worry about regression vulnerabilities and assume tools will detect them. Indeed, dynamic analysis tools helped finding around 30% of these regressions. Conclusions: Although tool support helps identify regression vulnerabilities, it may not be enough to ensure security during bug fixes. Furthermore, our results call for further work on the security tooling support and their integration during bug fixes. Preprint: https://arxiv.org/abs/2207.01942 Data and materials: https://doi.org/10.5281/zenodo.6792317","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132619905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
[Background:] Teamwork, coordination, and communication are prerequisites for the timely completion of a software project. Meetings, as facilitators of coordination and communication, are an established medium for information exchange. Analyses of meetings in software projects have shown that certain interactions in these meetings, such as proactive statements followed by supportive ones, influence the mood and motivation of a team, which in turn affects its productivity. So far, however, research has focused only on certain interactions at a detailed level, requiring a complex and fine-grained analysis of the meeting itself. [Aim:] In this paper, we investigate meetings from a more abstract perspective, focusing on the polarity of the statements, i.e., whether they appear to be positive, negative, or neutral. [Method:] We analyze the relationship between the polarity of statements in meetings and different social aspects, including conflicts as well as the mood before and after a meeting. [Results:] Our results emerge from 21 student software project meetings and provide some interesting insights: (1) Positive mood before a meeting is related to the amount of positive statements both at the beginning and throughout the whole meeting, (2) negative mood before the meeting only influences the amount of negative statements in the first quarter of the meeting, but not the whole meeting, and (3) the amount of positive and negative statements during the meeting has no influence on the mood afterwards. [Conclusions:] We conclude that behaviour in meetings is more likely to influence short-term emotional states (feelings) than long-term emotional states (mood), which are more important for the project.
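As a sketch of statement-level polarity coding, the snippet below uses the off-the-shelf VADER sentiment analyzer from NLTK as a stand-in; the paper's actual coding scheme for meeting statements may well differ, and the thresholds here are VADER's conventional defaults.

```python
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
analyzer = SentimentIntensityAnalyzer()

def polarity(statement: str) -> str:
    """Map a meeting statement to positive / negative / neutral."""
    compound = analyzer.polarity_scores(statement)["compound"]
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"

meeting = [
    "Great job on the login feature!",
    "The build is broken again and nobody noticed.",
    "Next item: scheduling the sprint review.",
]
labels = [polarity(s) for s in meeting]
# Compare the first quarter of the meeting against the whole, as in result (2).
first_quarter = labels[: max(1, len(labels) // 4)]
print(labels, "first quarter:", first_quarter)
```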
{"title":"Meetings and Mood – Related or Not? Insights from Student Software Projects","authors":"Jil Klunder, Oliver Karras","doi":"10.1145/3544902.3546252","DOIUrl":"https://doi.org/10.1145/3544902.3546252","url":null,"abstract":"[Background:] Teamwork, coordination, and communication are a prerequisite for the timely completion of a software project. Meetings as a facilitator for coordination and communication are an established medium for information exchange. Analyses of meetings in software projects have shown that certain interactions in these meetings, such as proactive statements followed by supportive ones, influence the mood and motivation of a team, which in turn affects its productivity. So far, however, research has focused only on certain interactions at a detailed level, requiring a complex and fine-grained analysis of a meeting itself. [Aim:] In this paper, we investigate meetings from a more abstract perspective, focusing on the polarity of the statements, i.e., whether they appear to be positive, negative, or neutral. [Method:] We analyze the relationship between the polarity of statements in meetings and different social aspects, including conflicts as well as the mood before and after a meeting. [Results:] Our results emerge from 21 student software project meetings and show some interesting insights: (1) Positive mood before a meeting is both related to the amount of positive statements in the beginning, as well as throughout the whole meeting, (2) negative mood before the meeting only influences the amount of negative statements in the first quarter of the meeting, but not the whole meeting, and (3) the amount of positive and negative statements during the meeting has no influence on the mood afterwards. [Conclusions:] We conclude that the behaviour in meetings might rather influence short-term emotional states (feelings) than long-term emotional states (mood), which are more important for the project.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129801985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Liming Fu, Peng Liang, Z. Rasheed, Zengyang Li, Amjed Tahir, Xiaofeng Han
Background: Technical Debt (TD) refers to the situation where developers make trade-offs to achieve short-term goals at the expense of long-term code quality, which can have a negative impact on the quality of software systems. In the context of code review, such sub-optimal implementations have a chance to be resolved in a timely manner during the review process, before the code is merged. Therefore, we consider them Potential Technical Debt (PTD), since PTD evolves into TD when it is injected into software systems without being resolved. Aim: To date, little is known about the extent to which PTD is identified in code reviews. Many tools have been provided to detect TD, but these tools lack consensus, and a large amount of PTD is undetectable by tools, whereas code review can help verify the quality of committed code by identifying issues such as PTD. To this end, we conducted an exploratory study in an attempt to understand the nature of PTD in code reviews and to track the resolution of PTD after it is identified. Method: We randomly collected 2,030 review comments from the Nova project of OpenStack and the Qt Base project of Qt. We then manually checked these review comments and obtained 163 PTD-related review comments for further analysis. Results: Our results show that: (1) PTD can be identified in code reviews but is not prevalent. (2) Design, defect, documentation, requirement, test, and code PTD are identified in code reviews, of which code and documentation PTD are dominant. (3) 81.0% of the PTD identified in code reviews was resolved by developers, and 78.0% of the resolved PTD was resolved within a week. (4) Code refactoring is the main practice used by developers to resolve the PTD identified in code reviews. Conclusions: Our findings indicate that: (1) review-based detection of PTD is seen as one of the trustworthy mechanisms in development, and (2) a significant proportion of PTD (19.0%) remains unresolved when injected into the software systems. Practitioners and researchers should establish effective strategies to manage and resolve PTD in development.
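Note that the authors identified PTD by manual inspection. Purely for illustration, a keyword pre-filter like the following sketch could at best narrow down candidate comments before such a manual pass; the hint list is an assumption, not part of the study.

```python
# Debt-flavored phrases that might hint at PTD in a review comment.
TD_HINTS = [
    "hack", "workaround", "todo", "fixme", "temporary", "for now",
    "refactor", "clean up", "technical debt",
]

def looks_like_ptd(comment: str) -> bool:
    """True if the review comment contains any debt-flavored hint."""
    text = comment.lower()
    return any(hint in text for hint in TD_HINTS)

review_comments: list = []  # would hold the collected review comments
candidates = [c for c in review_comments if looks_like_ptd(c)]
```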
{"title":"Potential Technical Debt and Its Resolution in Code Reviews: An Exploratory Study of the OpenStack and Qt Communities","authors":"Liming Fu, Peng Liang, Z. Rasheed, Zengyang Li, Amjed Tahir, Xiaofeng Han","doi":"10.1145/3544902.3546253","DOIUrl":"https://doi.org/10.1145/3544902.3546253","url":null,"abstract":"Background: Technical Debt (TD) refers to the situation where developers make trade-offs to achieve short-term goals at the expense of long-term code quality, which can have a negative impact on the quality of software systems. In the context of code review, such sub-optimal implementations have chances to be timely resolved during the review process before the code is merged. Therefore, we could consider them as Potential Technical Debt (PTD) since PTD will evolve into TD when it is injected into software systems without being resolved. Aim: To date, little is known about the extent to which PTD is identified in code reviews. Many tools have been provided to detect TD, but these tools lack consensus and a large amount of PTD are undetectable by tools while code review could help verify the quality of code that has been committed by identifying issues, such as PTD. To this end, we conducted an exploratory study in an attempt to understand the nature of PTD in code reviews and track down the resolution of PTD after being identified. Method: We randomly collected 2,030 review comments from the Nova project of OpenStack and the Qt Base project of Qt. We then manually checked these review comments, and obtained 163 PTD-related review comments for further analysis. Results: Our results show that: (1) PTD can be identified in code reviews but is not prevalent. (2) Design, defect, documentation, requirement, test, and code PTD are identified in code reviews, in which code and documentation PTD are the dominant. (3) 81.0% of the PTD identified in code reviews has been resolved by developers, and 78.0% of the resolved TD was resolved by developers within a week. (4) Code refactoring is the main practice used by developers to resolve the PTD identified in code reviews. Conclusions: Our findings indicate that: (1) review-based detection of PTD is seen as one of the trustworthy mechanisms in development, and (2) there is still a significant proportion of PTD (19.0%) remaining unresolved when injected into the software systems. Practitioners and researchers should establish effective strategies to manage and resolve PTD in development.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134172449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background. Software effort can be measured in story points [35]. Story point estimation is important in software project planning. Current approaches for automatically estimating story points focus on applying pre-trained embedding models and deep learning for text regression. These approaches require expensive embedding models and face the challenge that a plain sequence of text may not be an efficient representation for software issues, which can be a combination of text and code. Aims. We propose HeteroSP, a tool for estimating story points from the textual input of Agile software project issues. We select GPT2SP [12] and Deep-SE [8] as the baselines for comparison. Method. First, from an analysis of the story point dataset [8], we conclude that software issues are actually a mixture of natural language sentences and quoted code snippets, and suffer from problems related to a large vocabulary. Second, we provide a module to normalize the input text, including the words and code tokens of the software issues. Third, we design an algorithm to convert an input software issue into a graph with different types of nodes and edges. Fourth, we construct a heterogeneous graph neural network model, using fastText [6] to construct the initial node embeddings, to learn and predict the story points of new issues. Results. We compared against the baselines over three estimation scenarios: within-project, cross-project within-repository, and cross-project cross-repository. We achieve average Mean Absolute Errors (MAE) of 2.38, 2.61, and 2.63 for the three scenarios, respectively. We outperform GPT2SP in two of the three scenarios, and outperform Deep-SE in the most challenging scenario with significantly less running time. We also compare our approach with different homogeneous graph neural network models, and the results show that the heterogeneous model outperforms the homogeneous ones in story point estimation. For time performance, the three steps of node embedding initialization, model construction, and story point estimation together take about 570 seconds. HeteroSP’s artifacts are available at [22]. Conclusion. HeteroSP, a heterogeneous graph neural network model for story point estimation, achieves good accuracy and running time.
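Below is a much-simplified sketch, assuming PyTorch Geometric, of a heterogeneous graph neural network for story-point regression in the spirit of the description above. The node/edge types, dimensions, random stand-in features (in place of fastText vectors), and pooling are all illustrative assumptions; the actual HeteroSP architecture is described in the paper and its artifacts [22].

```python
import torch
from torch_geometric.data import HeteroData            # pip install torch_geometric
from torch_geometric.nn import HeteroConv, SAGEConv

# Toy issue graph: two node types (text tokens, code tokens), two edge types.
data = HeteroData()
data["word"].x = torch.randn(6, 100)                    # stand-in for fastText vectors
data["code"].x = torch.randn(4, 100)
data["word", "next", "word"].edge_index = torch.tensor([[0, 1, 2, 3, 4],
                                                        [1, 2, 3, 4, 5]])
data["word", "mentions", "code"].edge_index = torch.tensor([[0, 2, 4],
                                                            [0, 1, 3]])

class StoryPointGNN(torch.nn.Module):
    def __init__(self, in_dim: int = 100, hidden: int = 64):
        super().__init__()
        # One convolution per edge type; messages aggregated per node type.
        self.conv = HeteroConv({
            ("word", "next", "word"): SAGEConv(in_dim, hidden),
            ("word", "mentions", "code"): SAGEConv((in_dim, in_dim), hidden),
        }, aggr="sum")
        self.head = torch.nn.Linear(hidden, 1)          # regression head

    def forward(self, data: HeteroData) -> torch.Tensor:
        h = self.conv(data.x_dict, data.edge_index_dict)
        # Mean-pool each node type, then average types into one issue vector.
        issue_vec = torch.stack([x.mean(dim=0) for x in h.values()]).mean(dim=0)
        return self.head(issue_vec)

model = StoryPointGNN()
predicted = model(data)
loss = torch.nn.functional.mse_loss(predicted.squeeze(),
                                    torch.tensor(3.0))  # hypothetical story-point label
```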
{"title":"Heterogeneous Graph Neural Networks for Software Effort Estimation","authors":"H. Phan, A. Jannesari","doi":"10.1145/3544902.3546248","DOIUrl":"https://doi.org/10.1145/3544902.3546248","url":null,"abstract":"Background. Software effort can be measured by story point [35]. Story point estimation is important in software projects’ planning. Current approaches for automatically estimating story points focus on applying pre-trained embedding models and deep learning for text regression to solve this problem. These approaches require expensive embedding models and confront challenges that the sequence of text might not be an efficient representation for software issues which can be the combination of text and code. Aims. We propose HeteroSP, a tool for estimating story points from textual input of Agile software project issues. We select GPT2SP [12] and Deep-SE [8] as the baselines for comparison. Method. First, from the analysis of the story point dataset [8], we conclude that software issues are actually a mixture of natural language sentences with quoted code snippets and have problems related to large-size vocabulary. Second, we provide a module to normalize the input text including words and code tokens of the software issues. Third, we design an algorithm to convert an input software issue to a graph with different types of nodes and edges. Fourth, we construct a heterogeneous graph neural networks model with the support of fastText [6] for constructing initial node embedding to learn and predict the story points of new issues. Results. We did the comparison over three scenarios of estimation, including within project, cross-project within the repository, and cross-project cross repository with our baseline approaches. We achieve the average Mean Absolute Error (MAE) as 2.38, 2.61, and 2.63 for three scenarios. We outperform GPT2SP in 2/3 of the scenarios while outperforming Deep-SE in the most challenging scenario with significantly less amount of running time. We also compare our approaches with different homogeneous graph neural network models and the results show that the heterogeneous graph neural networks model outperforms the homogeneous models in story point estimation. For time performance, we achieve about 570 seconds as the time performance in both three processes: node embedding initialization, model construction, and story point estimation. HeterpSP’s artifacts are available at [22]. Conclusion. HeteroSP, a heterogeneous graph neural networks model for story point estimation, achieved good accuracy and running time.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133307144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leonardo Barbosa, V. H. S. C. Pinto, A. Souza, G. Pinto
Background: Cognitive-Driven Development (CDD) is a coding design technique that aims to reduce developers’ cognitive effort in understanding a given code unit (e.g., a class). By following CDD design practices, code units are expected to be smaller and, thus, easier to maintain and evolve. However, it is still unknown whether these smaller code units coded using CDD standards are easier to understand. Aims: This work aims to assess how much CDD improves code readability. Method: To achieve this goal, we conducted a two-phase study. We started by inviting professional software developers to vote (and justify their rationale) on the more readable snippet in each of 10 pairs of code snippets; one snippet in each pair was coded using CDD practices. We received 133 answers. In the second phase, we applied a state-of-the-art readability model to the 10 pairs of CDD-driven refactorings. Results: We observed some conflicting results. On the one hand, developers perceived seven (out of 10) CDD-driven refactorings as more readable than their counterparts; for two other CDD-driven refactorings, developers were undecided, and for only one did they prefer the original code snippet. On the other hand, we noticed that only one CDD-driven refactoring scored as more readable according to the state-of-the-art readability model. Conclusions: Our results provide initial evidence that CDD could be an exciting approach for software design.
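For intuition on what a feature-based readability model computes, here is a toy scorer over surface features in the spirit of classic readability models (e.g., Buse and Weimer's); the features and weights are made up for illustration and this is not the state-of-the-art model the authors applied.

```python
import re

def readability_score(snippet: str) -> float:
    """Toy readability score: higher means 'easier to read'. Weights are
    invented, not calibrated on any dataset."""
    lines = snippet.splitlines() or [""]
    identifiers = re.findall(r"[A-Za-z_]\w*", snippet)
    avg_line_len = sum(len(l) for l in lines) / len(lines)
    avg_ident_len = (sum(len(i) for i in identifiers) / len(identifiers)
                     if identifiers else 0.0)
    max_indent = max(len(l) - len(l.lstrip(" ")) for l in lines)
    comment_ratio = sum(l.lstrip().startswith(("//", "#")) for l in lines) / len(lines)
    # Shorter lines, shallower nesting, longer-but-not-cryptic identifiers,
    # and more comments read as "easier" under this toy model.
    return (10.0 - 0.05 * avg_line_len - 0.2 * max_indent
            - 0.1 * avg_ident_len + 2.0 * comment_ratio)

cdd_version = "int totalPrice(List<Integer> prices) { return sum(prices); }"
original_version = "int t(List<Integer> a) { return s(a); }"
print(readability_score(cdd_version), readability_score(original_version))
```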
{"title":"To What Extent Cognitive-Driven Development Improves Code Readability?","authors":"Leonardo Barbosa, V. H. S. C. Pinto, A. Souza, G. Pinto","doi":"10.1145/3544902.3546241","DOIUrl":"https://doi.org/10.1145/3544902.3546241","url":null,"abstract":"Background: Cognitive-Driven Development (CDD) is a coding design technique that aims to reduce developers’ cognitive effort in understanding a given code unit (e.g., a class). By following CDD design practices, it is expected that the coding units to be smaller and, thus, easier to maintain and evolve. However, it is so unknown whether these smaller code units coded using CDD standards are easier to understand. Aims: This work aims to assess how much CDD improves code readability. Method: To achieve this goal, we conducted a two-phase study. We start by inviting professional software developers to vote (and justify their rationale) on the most readable pair of code snippets (from a set of 10 pairs); one of the pairs was coded using CDD practices. We received 133 answers. In the second phase, we applied the state-of-the-art readability model to the 10-pairs of CDD-driven refactorings. Results: We observed some conflicting results. On the one hand, developers perceived that seven (out of 10) CDD-driven refactorings were more readable than their counterparts; for two other CDD-driven refactorings, developers were undecided, while only in one of the CDD-driven refactorings, developers preferred the original code snippet. On the other hand, we noticed that only one CDD-driven refactorings has better performance readability, assessed by state-of-the-art readability models. Conclusions: Our results provide initial evidence that CDD could be an exciting approach for software design.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124659836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}