
Empirical Software Engineering: Latest Publications

A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs
IF 4.1 · CAS Tier 2, Computer Science · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-14 · DOI: 10.1007/s10664-024-10491-3
Muhammad Ilyas Azeem, Sallam Abualhaija

Specifying legal requirements for software systems to ensure their compliance with the applicable regulations is a major concern of requirements engineering. Personal data collected by an organization is often shared with other organizations to perform certain processing activities. In such cases, the General Data Protection Regulation (GDPR) requires issuing a data processing agreement (DPA), which regulates the processing and further ensures that personal data remains protected. Violating GDPR can lead to huge fines reaching billions of Euros. Software systems involving personal data processing must adhere both to the legal obligations stipulated at a general level in GDPR and to the obligations outlined in DPAs for their specific business context. In other words, a DPA is yet another source from which requirements engineers can elicit legal requirements. However, the DPA must be complete according to GDPR to ensure that the elicited requirements cover the complete set of obligations. Therefore, checking the completeness of DPAs is a prerequisite step towards developing a compliant system. Analyzing DPAs against GDPR entirely manually is time-consuming and requires adequate legal expertise. In this paper, we propose an automation strategy that addresses the completeness checking of DPAs against GDPR provisions as a text classification problem. Specifically, we pursue ten alternative solutions enabled by different technologies, namely traditional machine learning, deep learning, language modeling, and few-shot learning. The goal of our work is to empirically examine how these different technologies fare in the legal domain. We computed the F₂ score on a set of 30 real DPAs. Our evaluation shows that the best-performing solutions, yielding F₂ scores of 86.7% and 89.7%, are based on the pre-trained BERT and RoBERTa language models.
Our analysis further shows that other alternative solutions based on deep learning (e.g., BiLSTM) and few-shot learning (e.g., SetFit) can achieve comparable accuracy, yet are more efficient to develop.
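The F₂ metric reported above weights recall twice as heavily as precision, which fits completeness checking: missing a violated obligation is costlier than a false alarm. As a small illustration (the generic F-beta formula only, not the authors' pipeline; the counts below are invented):

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """General F-beta score; beta=2 weights recall twice as much as precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)

def precision_recall(tp: int, fp: int, fn: int) -> tuple:
    """Precision and recall from raw true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical checker: finds 9 of 10 missing provisions, with 3 false alarms.
p, r = precision_recall(tp=9, fp=3, fn=1)
print(round(f_beta(p, r), 3))  # → 0.865
```

Note how the same precision/recall pair would yield a lower F₁ score; the beta=2 weighting rewards the high recall.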

Citations: 0
Common challenges of deep reinforcement learning applications development: an empirical study
IF 4.1 · CAS Tier 2, Computer Science · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-14 · DOI: 10.1007/s10664-024-10500-5
Mohammad Mehdi Morovati, Florian Tambon, Mina Taraghi, Amin Nikanjam, Foutse Khomh

Machine Learning (ML) is increasingly being adopted in different industries. Deep Reinforcement Learning (DRL) is a subdomain of ML used to produce intelligent agents. Despite recent developments in DRL technology, the main challenges that developers face in the development of DRL applications are still unknown. To fill this gap, in this paper, we conduct a large-scale empirical study of 927 DRL-related posts extracted from Stack Overflow, the most popular Q&A platform in the software community. Through the process of labeling and categorizing extracted posts, we created a taxonomy of common challenges encountered in the development of DRL applications, along with their corresponding popularity levels. This taxonomy has been validated through a survey involving 65 DRL developers. Results show that at least 45% of developers experienced 18 of the 21 challenges identified in the taxonomy. The most frequent sources of difficulty during the development of DRL applications are Comprehension, API usage, and Design problems, while Parallel processing and DRL libraries/frameworks are classified as the most difficult challenges to address, with respect to the time required to receive an accepted answer. We hope that the research community will leverage this taxonomy to develop efficient strategies to address the identified challenges and improve the quality of DRL applications.

Citations: 0
Studying the explanations for the automated prediction of bug and non-bug issues using LIME and SHAP
IF 4.1 · CAS Tier 2, Computer Science · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-13 · DOI: 10.1007/s10664-024-10469-1
Lukas Schulte, Benjamin Ledel, Steffen Herbold

Context

The identification of bugs within issues reported to an issue tracking system is crucial for triage. Machine learning models have shown promising results for this task. However, we have only limited knowledge of how such models identify bugs. Explainable AI methods like LIME and SHAP can be used to increase this knowledge.

Objective

We want to understand if explainable AI provides explanations that are reasonable to us as humans and align with our assumptions about the model’s decision-making. We also want to know if the quality of predictions is correlated with the quality of explanations.

Methods

We conduct a study in which we rate LIME and SHAP explanations based on how well they explain the outcome of an issue type prediction model, i.e., whether they align with our expectations and help us understand the underlying machine learning model.

Results

We found that both LIME and SHAP give reasonable explanations and that correct predictions are well explained. Further, we found that SHAP outperforms LIME due to lower ambiguity and higher contextuality, which can be attributed to the ability of the deep SHAP variant to capture sentence fragments.

Conclusion

We conclude that the model finds explainable signals for both bugs and non-bugs. Also, we recommend that research dealing with the quality of explanations for classification tasks reports and investigates rater agreement, since the rating of explanations is highly subjective.
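As a much-simplified illustration of the feature-attribution idea behind LIME and SHAP, the sketch below attributes a toy issue classifier's score to individual tokens by occluding each one and measuring the change in output. Real LIME fits a local linear surrogate over many random perturbations; the keyword classifier and its weights here are invented for the example:

```python
# Toy "bug classifier": sums weights of bug-indicative keywords (invented weights).
KEYWORD_WEIGHTS = {"crash": 0.9, "exception": 0.7, "error": 0.5, "feature": -0.6}

def bug_score(tokens):
    """Higher score = more bug-like issue text."""
    return sum(KEYWORD_WEIGHTS.get(t, 0.0) for t in tokens)

def occlusion_attribution(tokens):
    """Attribute the prediction to each token by removing it and re-scoring."""
    base = bug_score(tokens)
    return {t: base - bug_score(tokens[:i] + tokens[i + 1:])
            for i, t in enumerate(tokens)}

tokens = "app crash when saving file".split()
print(occlusion_attribution(tokens))
```

Tokens whose removal changes the score the most receive the highest attribution, which is the intuition the rated explanations build on.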

Citations: 0
How far are we with automated machine learning? characterization and challenges of AutoML toolkits
IF 4.1 · CAS Tier 2, Computer Science · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-13 · DOI: 10.1007/s10664-024-10450-y
Md Abdullah Al Alamin, Gias Uddin

Automated Machine Learning (AutoML) toolkits are low/no-code software that aim to democratize ML system application development by ensuring rapid prototyping of ML models and by enabling collaboration across different stakeholders in ML system design (e.g., domain experts, data scientists, etc.). It is thus important to know the state of current AutoML toolkits and the challenges ML practitioners face while using those toolkits. In this paper, we first offer a characterization of currently available AutoML toolkits by analyzing 37 top AutoML tools and platforms. We find that the top AutoML platforms are mostly cloud-based. Most of the tools are optimized for the adoption of shallow ML models. Second, we present an empirical study of 14.3K AutoML-related posts from Stack Overflow (SO), which we analyzed using the topic modelling algorithm LDA (Latent Dirichlet Allocation) to understand the challenges of ML practitioners while using AutoML toolkits. We find 13 topics in the AutoML-related discussions on SO. The 13 topics are grouped into four categories: MLOps (43% of all questions), Model (28% of questions), Data (27% of questions), and Documentation (2% of questions). Most questions are asked during the Model training (29%) and Data preparation (25%) phases. AutoML practitioners find the MLOps topic category most challenging. Topics related to the MLOps category are the most prevalent and popular for cloud-based AutoML toolkits. Based on our study findings, we provide 15 recommendations to improve the adoption and development of AutoML toolkits.
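LDA, the topic model applied to the Stack Overflow posts, can be approximated in a few dozen lines of collapsed Gibbs sampling. The toy version below is a sketch only; the hyperparameters and the two-document corpus are made up and are not the authors' setup:

```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents."""
    rng = random.Random(seed)
    V = len({w for d in docs for w in d})                 # vocabulary size
    ndk = [[0] * n_topics for _ in docs]                  # doc-topic counts
    nkw = [defaultdict(int) for _ in range(n_topics)]     # topic-word counts
    nk = [0] * n_topics                                   # tokens per topic
    z = []                                                # topic of each token
    for di, doc in enumerate(docs):                       # random initialization
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            ndk[di][k] += 1; nkw[k][w] += 1; nk[k] += 1
        z.append(zs)
    for _ in range(n_iter):                               # resample each token's topic
        for di, doc in enumerate(docs):
            for wi, w in enumerate(doc):
                k = z[di][wi]
                ndk[di][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                weights = [(ndk[di][j] + alpha) * (nkw[j][w] + beta) / (nk[j] + V * beta)
                           for j in range(n_topics)]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[di][wi] = k
                ndk[di][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return ndk, nkw

docs = [["pipeline", "deploy", "docker", "deploy"],
        ["dataframe", "missing", "values", "dataframe"]]
ndk, nkw = lda_gibbs(docs, n_topics=2)
print(ndk)  # per-document topic counts
```

On real SO data one would of course use a library implementation with proper preprocessing; the point here is only the mechanics of assigning each token a topic proportional to document-topic and topic-word counts.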

Citations: 0
An empirical study of fault localization in Python programs
IF 4.1 · CAS Tier 2, Computer Science · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-13 · DOI: 10.1007/s10664-024-10475-3
Mohammad Rezaalipour, Carlo A. Furia

Despite Python's massive popularity as a programming language, especially in novel domains like data science, there is comparatively little fault localization research that targets Python. Even though it is plausible that several findings about programming languages like C/C++ and Java (the most common choices for fault localization research) carry over to other languages, whether the dynamic nature of Python and how the language is used in practice affect the capabilities of classic fault localization approaches remain open questions. This paper is the first multi-family large-scale empirical study of fault localization on real-world Python programs and faults. Using Zou et al.'s recent large-scale empirical study of fault localization in Java (Zou et al. 2021) as the basis of our study, we investigated the effectiveness (i.e., localization accuracy), efficiency (i.e., runtime performance), and other features (e.g., different entity granularities) of seven well-known fault-localization techniques from four families (spectrum-based, mutation-based, predicate switching, and stack-trace based) on 135 faults from 13 open-source Python projects in the BugsInPy curated collection (Widyasari et al. 2020). The results replicate for Python several results known about Java, and shed light on whether Python's peculiarities affect the capabilities of fault localization. The replication package that accompanies this paper includes detailed data about our experiments, as well as the tool FauxPy that we implemented to conduct the study.
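Of the four families studied, spectrum-based fault localization is the simplest to sketch: it ranks program entities by how strongly their coverage correlates with failing tests. Below is a toy implementation of the classic Ochiai formula, one common spectrum-based metric (the coverage data is invented; this is not the paper's FauxPy implementation):

```python
import math

def ochiai(coverage, results):
    """coverage: entity -> set of test ids executing it; results: test id -> passed?"""
    failed = {t for t, passed in results.items() if not passed}
    scores = {}
    for entity, tests in coverage.items():
        ef = len(tests & failed)            # failing tests that execute the entity
        ep = len(tests) - ef                # passing tests that execute it
        denom = math.sqrt(len(failed) * (ef + ep))
        scores[entity] = ef / denom if denom else 0.0
    return scores

# Invented spectrum: line_7 is executed by every failing test and no passing one.
coverage = {"line_3": {"t1", "t2", "t3"}, "line_7": {"t2", "t3"}, "line_9": {"t1"}}
results = {"t1": True, "t2": False, "t3": False}
ranking = sorted(ochiai(coverage, results).items(), key=lambda kv: -kv[1])
print(ranking[0][0])  # → line_7, the most suspicious entity
```

Entities covered only by failing tests get the maximum score of 1.0, while entities covered only by passing tests get 0.0, which is exactly the ranking intuition the technique relies on.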

Citations: 0
Utilization of pre-trained language models for adapter-based knowledge transfer in software engineering
IF 4.1 · CAS Tier 2, Computer Science · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-13 · DOI: 10.1007/s10664-024-10457-5
Iman Saberi, Fatemeh Fard, Fuxiang Chen

Software Engineering (SE) Pre-trained Language Models (PLMs), such as CodeBERT, are pre-trained on large code corpora, and their learned knowledge has shown success in transferring into downstream tasks (e.g., code clone detection) through the fine-tuning of PLMs. In Natural Language Processing (NLP), an alternative way of transferring the knowledge of PLMs is explored through the use of adapters, compact and parameter-efficient modules inserted into a PLM. Although the use of adapters has shown promising results in many NLP-based downstream tasks, their application and exploration in SE-based downstream tasks are limited. Here, we study knowledge transfer using adapters on multiple downstream tasks, including cloze test, code clone detection, and code summarization. These adapters are trained on code corpora and are inserted into a PLM that is pre-trained on English corpora or code corpora. We call these PLMs NL-PLM and C-PLM, respectively. We observed an improvement in results using NL-PLM over a PLM that does not have adapters, which suggests that adapters can transfer and utilize useful knowledge from NL-PLM to SE tasks. The results are sometimes on par with or exceed those of C-PLM, while being more efficient in terms of the number of parameters and training time. Interestingly, adapters inserted into a C-PLM generally yield better results than a traditional fine-tuned C-PLM. Our results open new directions to build more compact models for SE tasks.
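An adapter, as used in this line of work, is a small bottleneck layer inserted into a frozen PLM: a down-projection, a nonlinearity, an up-projection, and a residual connection, so only the adapter's few parameters are trained. A dependency-free numeric sketch of the forward pass (dimensions and weights are illustrative, not from any real model):

```python
def adapter_forward(x, w_down, b_down, w_up, b_up):
    """Bottleneck adapter: h = relu(x @ W_down + b_down); return x + h @ W_up + b_up."""
    # down-project the hidden vector into the small bottleneck, then ReLU
    h = [max(0.0, sum(xi * w for xi, w in zip(x, row)) + b)
         for row, b in zip(w_down, b_down)]
    # up-project back to the hidden size and add the residual connection
    return [xj + sum(hi * w for hi, w in zip(h, row)) + b
            for xj, row, b in zip(x, w_up, b_up)]

# Hidden size 4, bottleneck size 2 (toy numbers).
x = [1.0, -2.0, 0.5, 3.0]
w_down = [[0.1, 0.0, 0.2, 0.0], [0.0, 0.3, 0.0, 0.1]]    # 2 bottleneck rows of length 4
b_down = [0.0, 0.0]
w_up = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.5], [0.0, 0.0]]  # 4 output rows of length 2
b_up = [0.0, 0.0, 0.0, 0.0]
print(adapter_forward(x, w_down, b_down, w_up, b_up))
```

Because of the residual connection, a zero-initialized adapter is an identity function, which is part of why inserting adapters into a frozen PLM is a safe starting point for training.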

Citations: 0
Adoption of automated software engineering tools and techniques in Thailand
IF 4.1 · CAS Tier 2, Computer Science · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-10 · DOI: 10.1007/s10664-024-10472-6
Chaiyong Ragkhitwetsagul, Jens Krinke, Morakot Choetkiertikul, Thanwadee Sunetnanta, Federica Sarro

Readiness for the adoption of Automated Software Engineering (ASE) tools and techniques can vary according to the size and maturity of software companies. ASE tools and techniques have been adopted by large or ultra-large software companies. However, little is known about the adoption of ASE tools and techniques in small and medium-sized software enterprises (SSMEs) in emerging countries, and the challenges faced by such companies. We study the adoption of ASE tools and techniques for software measurement, static code analysis, continuous integration, and software testing, and the respective challenges faced by software developers in Thailand, a developing country with a growing software economy that mainly consists of SSMEs (similar to other developing countries). Based on the answers of 103 Thai participants in an online survey, we found that Thai software developers are somewhat familiar with ASE tools and agree that adopting such tools would be beneficial. Most of the developers do not use software measurement or static code analysis tools due to a lack of knowledge or experience, but agree that such tools would be useful. Continuous integration tools have been used, though with some difficulties. Lastly, although automated testing tools are adopted despite several serious challenges, many developers still test their software manually. We call for ASE tools to be made easier to use, in order to lower the barrier to adoption in SSMEs in developing countries.

软件公司的规模和成熟度不同,采用自动化软件工程(ASE)工具和技术的准备程度也不同。大型或超大型软件公司已经采用了 ASE 工具和技术。然而,人们对新兴国家的中小型软件企业(SSMEs)采用 ASE 工具和技术的情况以及这些企业面临的挑战知之甚少。泰国是一个软件经济不断发展的发展中国家,主要由中小型软件企业组成(与其他发展中国家类似),我们研究了泰国软件开发人员在软件测量、静态代码分析、持续集成和软件测试方面采用 ASE 工具和技术的情况,以及他们各自面临的挑战。根据在线调查中 103 位泰国参与者的回答,我们发现泰国软件开发人员对 ASE 工具有一定程度的了解,并同意采用此类工具会带来益处。由于缺乏相关知识或经验,大多数开发人员并不使用软件测量或静态代码分析工具,但他们都认为使用这些工具是有益的。持续集成工具的使用遇到了一些困难。最后,尽管自动化测试工具的使用面临着一些严峻挑战,但许多开发人员仍在手动测试软件。我们呼吁改进自动测试工具,使其更易于使用,从而降低发展中国家中小型软件企业(SSMEs)采用自动测试工具的门槛。
Adoption of automated software engineering tools and techniques in Thailand
Authors: Chaiyong Ragkhitwetsagul, Jens Krinke, Morakot Choetkiertikul, Thanwadee Sunetnanta, Federica Sarro
DOI: 10.1007/s10664-024-10472-6 | Pub Date: 2024-06-10 | Empirical Software Engineering
Citations: 0
Understanding the characteristics and the role of visual issue reports
IF 4.1 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-06-10 | DOI: 10.1007/s10664-024-10459-3
Hiroki Kuramoto, Dong Wang, Masanari Kondo, Yutaro Kashiwa, Yasutaka Kamei, Naoyasu Ubayashi

Issue reports are a pivotal interface between developers and users for receiving information about bugs in their products. In practice, reproducing those bugs is challenging, since issue reports often contain incorrect information or lack sufficient information. Furthermore, the poor quality of issue reports would have the effect of delaying the entire bug-fixing process. To enhance bug comprehension and facilitate bug reproduction, GitHub Issue allows users to embed visuals such as images and videos to complement the textual description. Hence, we conduct an empirical study on 34 active GitHub repositories to quantitatively analyze the difference between visual issue reports and non-visual ones, and qualitatively analyze the characteristics of visuals and the usage of visuals in bug types. Our results show that visual issue reports have a significantly higher probability of reporting bugs. Visual reports also tend to receive the first comment and complete the conversation in a relatively shorter time. Visuals are frequently used to present the program behavior and the user interface, with the major purpose of introducing problems in reports. Additionally, we observe that visuals are commonly used to report GUI-related bugs, but they are rarely used to report configuration bugs in comparison to non-visual issue reports. To summarize, our work highlights the role visuals play in the bug-fixing process and lays the foundation for future research to support bug comprehension by exploiting visuals.
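The study's visual/non-visual distinction can be illustrated with a simple heuristic classifier over issue bodies. The markup patterns below are an assumption on my part (standard GitHub-rendered image/video syntax), not the authors' actual detection method:

```python
import re

# Markup patterns that GitHub renders as an embedded visual. These
# heuristics are illustrative, not the paper's classification procedure.
IMAGE_MD = re.compile(r"!\[[^\]]*\]\([^)]+\)")        # ![alt](url)
IMAGE_HTML = re.compile(r"<img\s[^>]*src=", re.I)     # <img src="...">
VIDEO_URL = re.compile(r"https?://\S+\.(?:mp4|mov|gif)\b", re.I)

def is_visual_report(body: str) -> bool:
    """True if the issue body embeds an image or video."""
    return any(p.search(body) for p in (IMAGE_MD, IMAGE_HTML, VIDEO_URL))

visual = "The button overlaps the menu: ![screenshot](https://example.org/shot.png)"
textual = "Steps to reproduce: run `make test` and observe the failure."
print(is_visual_report(visual), is_visual_report(textual))  # True False
```

A classifier like this is how one would partition a large issue corpus into the two groups the paper compares before measuring response and resolution times.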

Citations: 0
Toward effective secure code reviews: an empirical study of security-related coding weaknesses
IF 4.1 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-06-08 | DOI: 10.1007/s10664-024-10496-y
Wachiraphan Charoenwet, Patanamon Thongtanunam, Van-Thuan Pham, Christoph Treude

Identifying security issues early is encouraged to reduce the latent negative impacts on the software systems. Code review is a widely-used method that allows developers to manually inspect modified code, catching security issues during a software development cycle. However, existing code review studies often focus on known vulnerabilities, neglecting coding weaknesses, which can introduce real-world security issues that are more visible through code review. The practices of code reviews in identifying such coding weaknesses are not yet fully investigated. To better understand this, we conducted an empirical case study in two large open-source projects, OpenSSL and PHP. Based on 135,560 code review comments, we found that reviewers raised security concerns in 35 out of 40 coding weakness categories. Surprisingly, some coding weaknesses related to past vulnerabilities, such as memory errors and resource management, were discussed less often than the vulnerabilities. Developers attempted to address raised security concerns in many cases (39%-41%), but a substantial portion was merely acknowledged (30%-36%), and some went unfixed due to disagreements about solutions (18%-20%). This highlights that coding weaknesses can slip through code review even when identified. Our findings suggest that reviewers can identify various coding weaknesses leading to security issues during code reviews. However, these results also reveal shortcomings in current code review practices, indicating the need for more effective mechanisms or support for increasing awareness of security issue management in code reviews.
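To make concrete what mapping review comments to coding-weakness categories could look like, here is a minimal keyword-based sketch. The category names and keywords are hypothetical and far cruder than the CWE-based coding the study applies to its 135,560 comments:

```python
# Hypothetical keyword map from review-comment text to coding-weakness
# categories; invented for illustration, not the paper's taxonomy.
WEAKNESS_KEYWORDS = {
    "memory error": ("buffer overflow", "use after free", "out of bounds"),
    "resource management": ("memory leak", "not freed", "double free"),
    "input validation": ("unsanitized", "injection", "unvalidated"),
}

def tag_comment(comment: str) -> list:
    """Return every weakness category whose keywords appear in the comment."""
    text = comment.lower()
    return [category for category, keywords in WEAKNESS_KEYWORDS.items()
            if any(keyword in text for keyword in keywords)]

comment = "This memcpy can write out of bounds if len is attacker-controlled."
print(tag_comment(comment))  # ['memory error']
```

Even a crude tagger like this shows why the analysis is hard to fully automate: a comment can acknowledge a weakness without the resulting change ever fixing it, which is exactly the gap the study measures.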

Citations: 0
The untold impact of learning approaches on software fault-proneness predictions: an analysis of temporal aspects
IF 4.1 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-06-08 | DOI: 10.1007/s10664-024-10454-8
Mohammad Jamil Ahmad, Katerina Goseva-Popstojanova, Robyn R. Lutz

This paper aims to improve software fault-proneness prediction by investigating the unexplored effects on classification performance of the temporal decisions made by practitioners and researchers regarding (i) the interval for which they will collect longitudinal features (software metrics data), and (ii) the interval for which they will predict software bugs (the target variable). We call these specifics of the data used for training and of the target variable being predicted the learning approach, and explore the impact of the two most common learning approaches on the performance of software fault-proneness prediction, both within a single release of a software product and across releases. The paper presents empirical results from a study based on data extracted from 64 releases of twelve open-source projects. Results show that the learning approach has a substantial, and typically unacknowledged, impact on classification performance. Specifically, we show that one learning approach leads to significantly better performance than the other, both within-release and across-releases. Furthermore, this paper uncovers that, for within-release predictions, the difference in classification performance is due to different levels of class imbalance in the two learning approaches. Our findings show that improved specification of the learning approach is essential to understanding and explaining the performance of fault-proneness prediction models, as well as to avoiding misleading comparisons among them. The paper concludes with some practical recommendations and research directions based on our findings toward improved software fault-proneness prediction.
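Fault-proneness classifiers like those compared here are commonly scored with recall-weighted F-beta measures (e.g. F2), precisely because of the class imbalance the paper identifies: faulty modules are the rare class, and missing one costs more than a false alarm. A minimal sketch with illustrative data; the metric choice and labels are not taken from the paper:

```python
def f_beta(y_true, y_pred, beta=2.0):
    """F-beta over binary labels; beta=2 weights recall over precision,
    a common choice when the faulty class is rare (class imbalance)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical per-module predictions for one release (1 = fault-prone);
# the imbalance (3 faulty of 8) is typical of such datasets.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
print(round(f_beta(y_true, y_pred), 3))  # → 0.667
```

Because the two learning approaches yield different class imbalance, the same raw error counts can translate into different F-beta scores — one reason the paper urges authors to report the learning approach explicitly.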

Citations: 0
Journal: Empirical Software Engineering