
Information and Software Technology: Latest Articles

Machine learning for requirements engineering (ML4RE): A systematic literature review complemented by practitioners’ voices from Stack Overflow
IF 3.9 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-04-27 | DOI: 10.1016/j.infsof.2024.107477
Tong Li, Xinran Zhang, Yunduo Wang, Qixiang Zhou, Yiting Wang, Fangqi Dong

Context:

Research on machine learning for requirements engineering (ML4RE) has attracted growing attention from researchers and practitioners. Although pioneering research has shown the potential of using ML techniques to improve RE practices, academia still lacks a systematic and comprehensive literature review that integrates an industrial perspective. Specifically, none of the available ML4RE reviews has considered the grey literature, which originates primarily from practitioners and better reflects the real issues and challenges faced in practice.

Objective:

In this paper, we conduct a systematic survey of academic publications in ML4RE and complement it with the practitioners’ voices from Stack Overflow to complete a comprehensive literature review. Our research objective is to provide a comprehensive view of the current research progress in ML4RE, present the main questions and challenges faced in RE practice, understand the gap between research and practice, and provide our insights into how the RE academic domain can pragmatically develop in the future.

Method:

We systematically investigated 207 academic papers on ML4RE from 2010 to 2022, along with 375 questions related to RE practices on Stack Overflow and their corresponding answers. Our analysis covered their trends, the RE activities and tasks they focus on, the solutions they employ, and the associated data. Finally, we conducted a joint analysis, contrasting the outcomes of both parts.

Results:

Based on the statistical results from collected literature, we summarize an academic roadmap and analyse the disparities, offering research recommendations. Our suggestions include the development of intelligent question-answering assistants employing large language models, the integration of machine learning into industrial tools, and the promotion of collaboration between academia and industry.

Conclusion:

This study contributes by providing a holistic view of ML4RE, delineating disparities between research and practice, and proposing pragmatic suggestions to bridge the academia-industry gap.

Information and Software Technology, Volume 172, Article 107477. PDF: https://www.sciencedirect.com/science/article/pii/S095058492400082X/pdfft?md5=63ea9a0df5bff96f324d63b42a81b4cb&pid=1-s2.0-S095058492400082X-main.pdf
Citations: 0
Studying and recommending information highlighting in Stack Overflow answers
IF 3.9 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-04-27 | DOI: 10.1016/j.infsof.2024.107478
Shahla Shaan Ahmed, Shaowei Wang, Yuan Tian, Tse-Hsun (Peter) Chen, Haoxiang Zhang

Context:

Navigating the knowledge of Stack Overflow (SO) remains challenging. To make the posts vivid to users, SO allows users to write and edit posts with Markdown or HTML so that users can leverage various formatting styles (e.g., bold, italic, and code) to highlight the important information. Nonetheless, there have been limited studies on the highlighted information.

Objective:

In a recent study, we carried out the first large-scale exploratory analysis of the information highlighted in SO answers. To extend that work, we develop approaches that automatically recommend content to highlight, together with its formatting style, using neural network architectures initially designed for the Named Entity Recognition task.

Method:

In this paper, we studied 31,169,429 Stack Overflow answers. To train recommendation models, we chose CNN-based and BERT-based models for each formatting type (i.e., Bold, Italic, Code, and Heading), using the information highlighting dataset we collected from SO answers.
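The NER-style framing can be pictured as token-level sequence labelling. The sketch below is only an illustration and not the authors' pipeline: it assumes a simplified Markdown subset (`**bold**`, `*italic*`, and backtick code), whitespace tokenization, and a hypothetical helper `markdown_to_bio` that turns highlighted spans into BIO tags of the kind an NER model is trained on.

```python
import re

# Simplified Markdown subset (an assumption); "**" is tried before "*".
PATTERN = re.compile(
    r"\*\*(?P<Bold>.+?)\*\*|\*(?P<Italic>.+?)\*|`(?P<Code>.+?)`"
)

def markdown_to_bio(text):
    """Return (token, tag) pairs, tagging highlighted spans with BIO labels."""
    pairs = []
    pos = 0
    for match in PATTERN.finditer(text):
        # Plain text before the highlighted span is outside any entity.
        pairs += [(tok, "O") for tok in text[pos:match.start()].split()]
        label = match.lastgroup  # "Bold", "Italic", or "Code"
        for i, tok in enumerate(match.group(label).split()):
            pairs.append((tok, ("B-" if i == 0 else "I-") + label))
        pos = match.end()
    # Trailing plain text after the last highlighted span.
    pairs += [(tok, "O") for tok in text[pos:].split()]
    return pairs

example = markdown_to_bio("Use `git rebase` and **never force push** here")
```

In this toy encoding, a recommendation model would receive the plain tokens as input and predict the tag column; Heading is left out because it is a line-level rather than span-level style.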

Results:

Our models achieve a precision ranging from 0.50 to 0.72 across formatting types. It is easier to build a model to recommend Code than the other types. Models for text formatting types (i.e., Heading, Bold, and Italic) suffer from low recall. Our analysis of failure cases indicates that the majority are due to missed identifications. One explanation is that the models readily learn frequently highlighted words while struggling to learn less frequent ones (i.e., long-tail knowledge).
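To see why missed identifications depress recall while leaving precision intact, consider a small worked example. The function and the toy labels below are assumptions for illustration only; they are not the paper's evaluation script or data.

```python
def precision_recall(gold, predicted):
    """Token-level precision and recall for boolean 'highlighted' labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # correctly highlighted
    fp = sum(1 for g, p in pairs if not g and p)    # highlighted by mistake
    fn = sum(1 for g, p in pairs if g and not p)    # missed identification
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Four tokens should be highlighted; the model finds only one of them.
gold = [True, True, True, True, False, False]
predicted = [True, False, False, False, False, False]
precision, recall = precision_recall(gold, predicted)  # (1.0, 0.25)
```

Every prediction here is correct, so precision is 1.0, yet three of the four highlighted tokens are missed, so recall is 0.25.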

Conclusion:

Our findings suggest that it is possible to develop recommendation models for highlighting information for answers with different formatting styles on Stack Overflow.

Information and Software Technology, Volume 172, Article 107478.
Citations: 0
CriticalFuzz: A critical neuron coverage-guided fuzz testing framework for deep neural networks
IF 3.9 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-04-24 | DOI: 10.1016/j.infsof.2024.107476
Tongtong Bai, Song Huang, Yifan Huang, Xingya Wang, Chunyan Xia, Yubin Qu, Zhen Yang

Context:

Deep neural networks (DNNs) have been widely deployed in safety-critical domains, such as autonomous cars and healthcare, where erroneous behaviors can lead to serious accidents, so testing DNNs is extremely important. Neuron coverage-guided fuzz testing (NCFT) has become an effective whitebox approach for testing DNNs: it iteratively generates new test cases, guided by neuron coverage, to explore different logic of the DNN, and it has found numerous defects. However, existing NCFT approaches ignore that neurons contribute differently to the final output of a DNN. Given an input, only a fraction of the neurons determines the final output. These neurons hold the essential logic of the DNN.

Objective:

To ensure the quality of DNN and improve testing efficiency, NCFT should first cover neurons containing major logic of DNN.

Method:

In this paper, we propose the notion of critical neurons, which hold the essential logic of a DNN. To prioritize the detection of potential defects in critical neurons, we propose a fuzz testing framework named CriticalFuzz, which mainly comprises energy-based test case generation and critical neuron coverage criteria. Energy-based test case generation produces test cases that are more likely to cover critical neurons and involves energy-based seed selection, a power schedule, and seed mutation. Critical neuron coverage serves as a feedback mechanism that guides CriticalFuzz to prioritize the coverage of critical neurons. To evaluate the significance of critical neurons and the performance of CriticalFuzz, we conducted experiments on popular DNNs and datasets.
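The general shape of a coverage-guided loop with energy-based seed selection can be sketched as follows. This is a toy under stated assumptions and not CriticalFuzz itself: the abstract does not give the authors' definition of critical neurons, their power schedule, or their mutation operators, so the 8-bit "network", the `covered_by` and `flip_bit` helpers, and the energy rule below are all hypothetical stand-ins.

```python
import random

def fuzz(seeds, covered_by, mutate, critical, rounds=200, rng=None):
    """Toy loop: seeds that uncovered more critical neurons get more energy."""
    rng = rng or random.Random(0)
    covered = set()
    queue = []  # each entry is [energy, seed]
    for seed in seeds:
        new = (covered_by(seed) & critical) - covered
        covered |= new
        queue.append([1 + len(new), seed])
    for _ in range(rounds):
        if covered == critical:
            break  # every critical neuron has been covered
        # Energy-proportional seed selection (an assumed power schedule).
        entry = rng.choices(queue, weights=[e for e, _ in queue])[0]
        child = mutate(entry[1], rng)
        new = (covered_by(child) & critical) - covered
        if new:
            # Coverage feedback: keep inputs that reach new critical neurons.
            covered |= new
            queue.append([1 + len(new), child])
        else:
            entry[0] = max(1, entry[0] - 1)  # decay unproductive seeds
    return covered, [seed for _, seed in queue]

# Hypothetical stand-in for a DNN: "neuron" i fires when bit i of the input is set.
def covered_by(x):
    return {i for i in range(8) if (x >> i) & 1}

def flip_bit(x, rng):
    return x ^ (1 << rng.randrange(8))

critical = {0, 2, 5, 7}  # assumed set of critical neurons
covered, corpus = fuzz([0], covered_by, flip_bit, critical,
                       rounds=500, rng=random.Random(1))
```

The design point the sketch tries to capture is that feedback is computed only over the critical set, so mutants that merely activate non-critical neurons are not retained.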

Results:

The experimental results show that (1) the critical neurons have a 100% impact on the output of models, while the non-critical neurons have a lesser effect; (2) CriticalFuzz is effective in achieving 100% coverage of critical neurons and covering 10 classes of critical neurons, outperforming both DeepHunter and TensorFuzz; and (3) CriticalFuzz exhibits exceptional error detection capabilities, successfully identifying thousands of errors across 10 diverse error classes within DNNs.

Conclusion:

The critical neurons defined in this paper hold more of the significant logic of a DNN than non-critical neurons do. CriticalFuzz can preferentially cover critical neurons, thereby improving the efficiency of the NCFT process. Additionally, CriticalFuzz is capable of identifying a greater number of errors, thus enhancing the reliability and effectiveness of NCFT.

Information and Software Technology, Volume 172, Article 107476.
Citations: 0
A Systematic Literature Review on Software Maintenance Offshoring Decisions
IF 3.9 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-04-21 | DOI: 10.1016/j.infsof.2024.107475
Hanif Ur Rahman, Alberto Rodrigues da Silva, Asaad Alzayed, Mushtaq Raza

Context

Over the last decades, the rapid expansion of the internet has prompted an increasing number of organizations to take their work global and outsource their information technology (IT) activities to specialized suppliers. Software maintenance is the longest part of the software life cycle and consumes 60-70% of the total IT budget. Therefore, organizations have adopted offshoring strategies to reduce maintenance costs and free up resources to focus on their core competencies. Offshore outsourcing decision-making involves technical, social, and other influencing factors; however, there is a limited understanding of the key factors associated with offshoring software maintenance within the global software development context.

Objective

This work presents the factors that have influenced the decision-making process of offshoring software maintenance. Further, this research sheds light on decision-making by identifying the models, frameworks, and software tools used within this context.

Method

A systematic literature review is conducted, delving into the factors related to the decision-making and analyzing the models, frameworks and tools supporting offshoring software maintenance.

Results

This study identifies the top 10 key factors concerning the decision-making process, namely human communication, cost reduction, organizational and employee maturity, project management practices, IT infrastructure support, language constraints, knowledge-based support, changes in requirements, legal issues and cultural diversity. In addition, the models, frameworks, and tools used in the decision-making process of software maintenance are analyzed, and research gaps are identified.

Conclusion

The findings reveal that the software industry lacks effective and efficient models tailored explicitly for software offshoring within the global software development landscape. Overall, this study provides valuable insights into the decision-making dynamics of software maintenance offshoring by identifying key factors and research gaps that can pave the way for developing more effective decision support systems.

Information and Software Technology, Volume 172, Article 107475.
Citations: 0
Automatic build repair for test cases using incompatible Java versions
IF 3.9 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-04-18 | DOI: 10.1016/j.infsof.2024.107473
Ching Hang Mak, Shing-Chi Cheung

Context:

Bug bisection is a common technique for identifying the revision that introduces a bug or indirectly fixes one; it often involves executing multiple revisions of a project to determine whether the bug is present in each revision. However, many legacy revisions cannot be compiled successfully because of changes in the programming language or in the tools used during compilation, which adds complexity and prevents automation of the bisection process.
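As a reminder of what bisection does and why an uncompilable revision stalls it, here is a minimal sketch. It assumes a linear history whose first revision is good and whose last is bad, and a hypothetical `is_bad` callback that returns `True`, `False`, or `None` when the revision cannot be built; none of these names come from the paper.

```python
def first_bad(revisions, is_bad):
    """Binary-search a linear history for the first revision showing the bug.

    Assumes revisions[0] is good, revisions[-1] is bad, and that the bug,
    once introduced, stays present in all later revisions.
    """
    lo, hi = 0, len(revisions) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        verdict = is_bad(revisions[mid])
        if verdict is None:
            # The revision does not compile, so the test gives no signal.
            raise RuntimeError(f"revision {revisions[mid]!r} does not build")
        if verdict:
            hi = mid
        else:
            lo = mid
    return hi

history = list("abcdefgh")
index = first_bad(history, lambda rev: rev >= "e")  # index 4, revision "e"
```

With eight revisions only three are executed, which is the appeal of bisection; a single unbuildable midpoint, however, forces manual intervention unless the build can be repaired.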

Objective:

In this paper, we introduce an approach to repair test cases of Java projects by performing dependency minimization. Our approach aims to remove classes and methods that are not required for the execution of one or more test cases. Unlike existing state-of-the-art techniques, our approach performs minimization at source-level, which allows compile-time errors to be fixed.
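The core idea of dependency minimization can be viewed as reachability over a dependency graph: keep whatever the selected tests transitively need and drop the rest. The sketch below is a simplification under assumptions, not the authors' tool, which operates on Java source; the hand-written graph and all member names are hypothetical.

```python
from collections import deque

def required_members(deps, tests):
    """Return every class or method transitively reachable from the tests."""
    keep = set(tests)
    work = deque(tests)
    while work:
        for dep in deps.get(work.popleft(), ()):
            if dep not in keep:
                keep.add(dep)
                work.append(dep)
    return keep

# Hypothetical dependency graph: member -> members it uses.
deps = {
    "FooTest.testParse": ["Parser.parse"],
    "Parser.parse": ["Lexer.next"],
    "LegacyUi.render": ["RemovedApi.call"],  # unrelated to the test
}
keep = required_members(deps, ["FooTest.testParse"])
all_members = set(deps) | {d for targets in deps.values() for d in targets}
removable = all_members - keep  # candidates for deletion before compiling
```

In this toy graph, `LegacyUi.render` and the API it depends on are not reachable from the test, so removing them would eliminate a compile-time error without affecting the test's behavior.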

Methods:

A standalone Java tool implementing our technique was developed, and we evaluated our technique using subjects from Defects4J retargeted against Java 8 and 17.

Results:

Our evaluation showed that a majority of subjects can be repaired solely by performing minimization, including replicating the test results of the original version. Furthermore, our technique is also shown to achieve accurate minimized results, while only adding a small overhead to the bisection process.

Conclusion:

Our proposed technique is shown to be effective for repairing build failures with minimal overhead, making it suitable for use in automated bug bisection. Our tool can also be adapted for use cases such as bug corpus creation and refactoring.

Information and Software Technology, Volume 172, Article 107473.
Citations: 0
Acceptance behavior theories and models in software engineering — A mapping study
IF 3.9 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-04-16 | DOI: 10.1016/j.infsof.2024.107469
Jürgen Börstler, Nauman bin Ali, Kai Petersen, Emelie Engström

Context:

The adoption or acceptance of new technologies or ways of working in software development activities is a recurrent topic in the software engineering literature. The topic has, therefore, been empirically investigated extensively. It is, however, unclear which theoretical frames of reference are used in this research to explain acceptance behaviors.

Objective:

In this study, we explore how major theories and models of acceptance behavior have been used in the software engineering literature to empirically investigate acceptance behavior.

Method:

We conduct a systematic mapping study of empirical studies using acceptance behavior theories in software engineering.

Results:

We identified 47 primary studies covering 56 theory uses. The theories were categorized into six groups. Technology acceptance models (TAM and its extensions) were used in 29 of the 47 primary studies, innovation theories in 10, and the theories of planned behavior/reasoned action (TPB/TRA) in six. All other theories were used in at most two of the primary studies. The usage and operationalization of the theories were, in many cases, inconsistent with the underlying theories. Furthermore, we identified 77 constructs used by these studies, many of which lack clear definitions.

Conclusions:

Our results show that software engineering researchers are aware of some of the leading theories and models of acceptance behavior, which indicates an attempt to have more theoretical foundations. However, we identified issues related to theory usage that make it difficult to aggregate and synthesize results across studies. We propose mitigation actions that encourage the consistent use of theories and emphasize the measurement of key constructs.

{"title":"Acceptance behavior theories and models in software engineering — A mapping study","authors":"Jürgen Börstler ,&nbsp;Nauman bin Ali ,&nbsp;Kai Petersen ,&nbsp;Emelie Engström","doi":"10.1016/j.infsof.2024.107469","DOIUrl":"https://doi.org/10.1016/j.infsof.2024.107469","url":null,"abstract":"<div><h3>Context:</h3><p>The adoption or acceptance of new technologies or ways of working in software development activities is a recurrent topic in the software engineering literature. The topic has, therefore, been empirically investigated extensively. It is, however, unclear which theoretical frames of reference are used in this research to explain acceptance behaviors.</p></div><div><h3>Objective:</h3><p>In this study, we explore how major theories and models of acceptance behavior have been used in the software engineering literature to empirically investigate acceptance behavior.</p></div><div><h3>Method:</h3><p>We conduct a systematic mapping study of empirical studies using acceptance behavior theories in software engineering.</p></div><div><h3>Results:</h3><p>We identified 47 primary studies covering 56 theory uses. The theories were categorized into six groups. Technology acceptance models (TAM and its extensions) were used in 29 of the 47 primary studies, innovation theories in 10, and the theories of planned behavior/ reasoned action (TPB/TRA) in six. All other theories were used in at most two of the primary studies. The usage and operationalization of the theories were, in many cases, inconsistent with the underlying theories. Furthermore, we identified 77 constructs used by these studies of which many lack clear definitions.</p></div><div><h3>Conclusions:</h3><p>Our results show that software engineering researchers are aware of some of the leading theories and models of acceptance behavior, which indicates an attempt to have more theoretical foundations. 
However, we identified issues related to theory usage that make it difficult to aggregate and synthesize results across studies. We propose mitigation actions that encourage the consistent use of theories and emphasize the measurement of key constructs.</p></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"172 ","pages":"Article 107469"},"PeriodicalIF":3.9,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0950584924000740/pdfft?md5=2b39458871e60592c3bd5ed7e83cf658&pid=1-s2.0-S0950584924000740-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140644403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Automatic test cases generation from formal contracts
IF 3.9 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-04-16 DOI: 10.1016/j.infsof.2024.107467
Samuel Jiménez Gil, Manuel I. Capel, Gabriel Olea Olea

Context:

Software verification for critical systems is facing an unprecedented cost increase due to the large amount of software generally packed into multicore platforms. A substantial share of the verification effort is dedicated to testing. Spark/Ada is a language often employed in safety-critical systems due to its high reliability. Formal contracts are often inserted in Spark’s program specification to be used by a static theorem prover that checks whether the implementation conforms to the specification. However, this static analysis has its limitations, as certain bugs can only be spotted through software testing.

Objective:

The main goal of our work is to use these formal contracts in Spark as input for a test oracle – whose method we describe – to generate test cases. Subsequent objectives consist of a) arguing for the traceability needed to comply with safety-critical software standards such as DO-178C for civil avionics and b) embracing the best-established software testing methods for these systems.

Method:

Our test generation method reads Spark formal contracts and applies Equivalence Class Partitioning with Boundary Analysis to generate traceable test cases.
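To make the testing technique concrete, the following is a minimal Python sketch of textbook Equivalence Class Partitioning with Boundary Analysis for a single integer range precondition. It is not the authors' tool and does not parse Spark contracts: the `RangeContract` class, the `generate_inputs` function, and the example range 1..100 are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeContract:
    """Hypothetical stand-in for a precondition of the form Lo <= X <= Hi."""
    lo: int
    hi: int

    def accepts(self, x: int) -> bool:
        return self.lo <= x <= self.hi


def generate_inputs(contract: RangeContract) -> list[tuple[int, bool]]:
    """Return sorted (input, expected_valid) pairs: one representative per
    equivalence class plus the values on and next to each boundary."""
    lo, hi = contract.lo, contract.hi
    candidates = {
        lo - 1, lo, lo + 1,   # around the lower boundary
        hi - 1, hi, hi + 1,   # around the upper boundary
        (lo + hi) // 2,       # representative of the valid class
        lo - 10, hi + 10,     # representatives of the two invalid classes
    }
    return sorted((x, contract.accepts(x)) for x in candidates)


# Each generated case can be traced back to the contract bound it exercises.
cases = generate_inputs(RangeContract(lo=1, hi=100))
for value, expected_valid in cases:
    print(value, expected_valid)
```

Each pair records both the input and the verdict the contract implies, so the contract itself plays the role of the oracle in this sketch.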

Results:

The evaluation, which uses an array of open-source examples of Spark contracts, shows a high level of passed test cases and statement coverage. The results are also compared against a random test generator.

Conclusion:

The proposed method is very effective at achieving a high number of passed test cases and coverage. We make the case that the effort to create formal specifications for Spark can be used both for proof and (automatic) testing. Lastly, we noticed that some formal contracts are more suitable than others for our test generation.

Citations: 0
Forward-Oriented Programming: A meta-DSL for fast development of component libraries
IF 3.9 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-04-16 DOI: 10.1016/j.infsof.2024.107474
Emmanouil Krasanakis, Andreas Symeonidis

Libraries that implement Domain-Specific Language (DSL) components keep gaining traction when it comes to developing software for specific application domains. However, creating components that can be organically woven into use cases is an extremely complex task. In this work, we introduce a meta-DSL to assist library development, called Forward-Oriented Programming (FOP). FOP combines lazy evaluation and aspect-oriented programming principles to align crosscutting component configurations and alter their execution outcomes depending on usage in subsequent code. Theoretical analysis shows that FOP simplifies component development and makes their combination logic learnable by library users. We realize the paradigm with a Python package, called pyfop, and conduct a case study that compares it with purely functional and object-oriented library implementations. In the study, source code quality metrics demonstrate reduced time and effort to write library components, and increased comprehensibility. Configurations are shared without modifying distant code segments.
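The idea of deferring execution so that a configuration supplied later reaches every component that declares it can be sketched in plain Python. This is an illustration of lazy evaluation with shared keyword configuration only; it is not the pyfop API, and the names `Deferred`, `component`, `scale`, and `clip` are hypothetical.

```python
import inspect


class Deferred:
    """A call that is recorded now and executed only when run() is invoked."""

    def __init__(self, func, args, kwargs):
        self.func, self.args, self.kwargs = func, args, kwargs

    def run(self, **config):
        # Evaluate nested deferred arguments first, forwarding the same config.
        args = [a.run(**config) if isinstance(a, Deferred) else a
                for a in self.args]
        # Share a config value with this component only if it declares a
        # parameter of that name and the caller did not set it explicitly.
        accepted = inspect.signature(self.func).parameters
        shared = {k: v for k, v in config.items()
                  if k in accepted and k not in self.kwargs}
        return self.func(*args, **self.kwargs, **shared)


def component(func):
    """Decorator: calling the component builds a Deferred instead of running."""
    def wrapper(*args, **kwargs):
        return Deferred(func, args, kwargs)
    return wrapper


@component
def scale(values, factor=1.0):
    return [v * factor for v in values]


@component
def clip(values, limit=10.0, factor=1.0):
    return [min(v, limit * factor) for v in values]


pipeline = clip(scale([1, 2, 3]), limit=2.0)  # nothing is computed yet
result = pipeline.run(factor=3.0)             # factor reaches both components
print(result)  # [3.0, 6.0, 6.0]
```

In this sketch the `factor` setting is given once, at the point of use, and neither component call had to be edited to receive it.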

Citations: 0
Knowledge and research mapping of the data and database forensics domains: A bibliometric analysis
IF 3.9 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-04-12 DOI: 10.1016/j.infsof.2024.107472
Georgios Chorozidis, Konstantinos Georgiou, Nikolaos Mittas, Lefteris Angelis

The field of digital forensics has undergone rapid development alongside the technological advancements of the latest century. This study focuses on two of its subdomains, namely database forensics and data forensics. Though the concept of a database is relatively old, database forensics has received little academic attention compared to other domains in digital forensics. Data forensics has a myriad of applications; however, the field itself appears to lack standardization across the different disciplines of forensics. Our main objectives with this study were to identify the prominent trends, uncover research gaps or needs for further research, and provide a high-level outline of the selected domains. To fulfill the objectives, we designed and executed a protocol with predefined phases, steps, and activities that all stem from the principles of bibliometric analysis. The findings of the methodological procedure are presented and the research questions are answered in a concise manner. The two domains have grown considerably, given how recently they emerged in the literature. However, there are issues in the current literature that might hinder future research and might deter not only aspiring but also current professionals of the forensic field. These issues must be resolved in order to make the selected domains less elusive for cross-domain applications and for new practitioners.

Citations: 0
Guiding the way: A systematic literature review on mentoring practices in open source software projects
IF 3.9 Zone 2 Computer Science Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2024-04-09 DOI: 10.1016/j.infsof.2024.107470
Zixuan Feng, Katie Kimura, Bianca Trinkenreich, Anita Sarma, Igor Steinmacher

Context:

Mentoring in Open Source Software (OSS) is important to a project’s growth and sustainability. Mentoring allows contributors to improve their technical skills and learn about the protocols and cultural norms of the project. However, mentoring has its challenges: mentors sometimes feel unappreciated, and mentees may have mismatched interests or lack interpersonal skills. Existing research has investigated the different challenges of mentoring in different OSS contexts, but we lack a holistic understanding.

Objective:

A comprehensive understanding of the current practices and challenges of mentoring in OSS is needed to implement appropriate strategies to facilitate mentoring.

Method:

This study presents a systematic literature review investigating how the literature has characterized mentoring practices in OSS, including their challenges and the strategies to mitigate them. We retrieved 232 studies from four digital libraries. Out of these, 21 were primary studies. Starting from these, we performed backward and author snowballing, adding another 27 studies. We conducted a completeness check by reviewing the references of the 4 most relevant primary studies, which added 1 more study. We then conducted a full-text review and evaluated the studies using a set of criteria; as a result, 10 papers were excluded. We then employed an open-coding approach to analyze, aggregate, and synthesize the selected studies.
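The selection counts in this paragraph can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in the abstract; the variable names are illustrative, and the final count agrees with the 39 studies reported in the Results.

```python
# Counts taken directly from the Method paragraph.
retrieved = 232                 # studies returned by the four digital libraries
primary_from_search = 21        # primary studies among those retrieved
from_snowballing = 27           # added by backward and author snowballing
from_completeness_check = 1     # added by the completeness check
excluded_after_full_text = 10   # removed during full-text review

candidates = primary_from_search + from_snowballing + from_completeness_check
included = candidates - excluded_after_full_text
print(candidates, included)  # 49 39
```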

Results:

We reviewed 39 studies to investigate the different facets of mentoring in OSS, encompassing motivations, goals, channels, and contributor dynamics. We then identified 13 challenges associated with mentoring in OSS, which fall into three categories: social, process, and technical. We also present a quick-reference catalog that maps mitigation strategies to these challenges.

Conclusions:

Our study serves as a guideline for researchers and practitioners on mentoring challenges and potential strategies to mitigate them.

Citations: 0