
Journal of Software-Evolution and Process: Latest Publications

Multilanguage Detection of Design Pattern Instances
IF 1.7 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-02-18 | DOI: 10.1002/smr.2738
Hugo Andrade, João Bispo, Filipe F. Correia

Code comprehension is often supported by source code analysis tools that provide more abstract views over software systems, such as those detecting design patterns. These tools encompass analysis of source code and the ensuing extraction of relevant information. However, the analysis of the source code is often specific to the target programming language. We propose DP-LARA, a multilanguage pattern detection tool that uses the multilanguage capability of the LARA framework to support finding pattern instances in a code base. LARA provides a virtual AST, which is common to multiple OOP programming languages, and DP-LARA then detects pattern instances on this abstract representation. We evaluate the detection performance and consistency of DP-LARA on a few software projects. Results show that a multilanguage approach does not compromise detection performance, and DP-LARA is consistent across the languages we tested it on (i.e., Java and C/C++). Moreover, by providing a virtual AST as the abstract representation, we believe we have decreased the effort of extending the tool to new programming languages and of maintaining existing ones.
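The virtual-AST idea can be illustrated with a short sketch. The Python snippet below is an illustration only, not the LARA API: it defines a tiny language-agnostic class model (the `ClassNode`, `Method`, and `Attribute` names are assumptions) and a heuristic Singleton check that runs identically regardless of whether the model was extracted from Java or C/C++ source.

    # Illustrative sketch (not the LARA API): a language-agnostic class model and a
    # heuristic Singleton check, in the spirit of detection on a shared virtual AST.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Method:
        name: str
        is_static: bool
        return_type: str
        visibility: str  # "public", "private", ...

    @dataclass
    class Attribute:
        name: str
        is_static: bool
        type: str

    @dataclass
    class ClassNode:
        name: str
        constructors_private: bool = False
        attributes: List[Attribute] = field(default_factory=list)
        methods: List[Method] = field(default_factory=list)

    def looks_like_singleton(node: ClassNode) -> bool:
        """Heuristic: private constructors, a static attribute of the class's own
        type, and a public static accessor returning that type."""
        has_self_attr = any(a.is_static and a.type == node.name for a in node.attributes)
        has_accessor = any(m.is_static and m.visibility == "public" and
                           m.return_type == node.name for m in node.methods)
        return node.constructors_private and has_self_attr and has_accessor

    # The same check runs unchanged whether ClassNode was extracted from Java or
    # from C++ source, which is the point of a shared abstract representation.
    cfg = ClassNode(
        name="Config",
        constructors_private=True,
        attributes=[Attribute("instance", True, "Config")],
        methods=[Method("getInstance", True, "Config", "public")],
    )
    print(looks_like_singleton(cfg))  # True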

{"title":"Multilanguage Detection of Design Pattern Instances","authors":"Hugo Andrade,&nbsp;João Bispo,&nbsp;Filipe F. Correia","doi":"10.1002/smr.2738","DOIUrl":"https://doi.org/10.1002/smr.2738","url":null,"abstract":"<div>\u0000 \u0000 <p>Code comprehension is often supported by source code analysis tools that provide more abstract views over software systems, such as those detecting design patterns. These tools encompass analysis of source code and ensuing extraction of relevant information. However, the analysis of the source code is often specific to the target programming language. We propose DP-LARA, a multilanguage pattern detection tool that uses the multilanguage capability of the LARA framework to support finding pattern instances in a code base. LARA provides a virtual AST, which is common to multiple OOP programming languages, and DP-LARA then performs code analysis of detecting pattern instances on this abstract representation. We evaluate the detection performance and consistency of DP-LARA with a few software projects. Results show that a multilanguage approach does not compromise detection performance, and DP-LARA is consistent across the languages we tested it for (i.e., Java and C/C++). Moreover, by providing a virtual AST as the abstract representation, we believe to have decreased the effort of extending the tool to new programming languages and maintaining existing ones.</p>\u0000 </div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143438831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Why and How We Combine Multiple Deep Learning Models With Functional Overlaps
IF 1.7 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-02-16 | DOI: 10.1002/smr.70003
Mingliang Ma, Yanhui Li, Yingxin Chen, Lin Chen, Yuming Zhou

The evolution (e.g., development and maintenance) of deep learning (DL) models has attracted much attention. One of the main challenges during the development and maintenance of DL models is model training, which often requires substantial human resources and computing power (such as labeling costs and parameter training). In recent years, to alleviate this problem, researchers have introduced ideas from software engineering (SE) into DL. They treat the DL model as a new type of software and borrow the practice of traditional software reuse, that is, they focus on reusing DL models to improve the quality of DL model development and maintenance. This paper focuses on more complex model reuse scenarios, where developers need to combine multiple models with functional overlaps. We explore whether model combination techniques can meet the requirements of such scenarios. We conducted an empirical study of the research scenario and found that a model composition approach was needed to meet the requirements. Furthermore, we propose a concatenation-parallel model combination method called MCCP. First, the hidden-layer features of the multiple models are concatenated; then the models are connected in parallel to construct a joint model covering all output categories. The joint model is trained to meet the unified requirements under a limited labeling cost. Through experiments on data sets from nine domains and five model structures, we draw two conclusions: (1) we observe noticeable differences (38% at most) in the performance of multiple models on overlapping category data, which calls for effective model combination techniques; (2) MCCP is more effective than the baseline, performing best in eight of the nine domains. Our research shows that the joint model generated by combining models with overlapping functions can meet the requirements of complex model reuse scenarios.
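A minimal sketch of the concatenation-parallel idea is shown below, written in PyTorch with made-up backbone shapes and class counts, so it is illustrative rather than the authors' implementation: the hidden features of two models with overlapping label sets are concatenated and fed to a joint head over the union of output categories.

    # Minimal PyTorch sketch of a concatenation-parallel joint model (illustrative
    # only; backbone architectures, feature sizes, and class counts are assumed).
    import torch
    import torch.nn as nn

    class JointModel(nn.Module):
        def __init__(self, backbone_a: nn.Module, backbone_b: nn.Module,
                     feat_a: int, feat_b: int, num_joint_classes: int):
            super().__init__()
            self.backbone_a = backbone_a      # e.g., trained on domain A labels
            self.backbone_b = backbone_b      # e.g., trained on domain B labels
            self.head = nn.Linear(feat_a + feat_b, num_joint_classes)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            ha = self.backbone_a(x)           # hidden-layer features of model A
            hb = self.backbone_b(x)           # hidden-layer features of model B
            return self.head(torch.cat([ha, hb], dim=1))

    # Toy usage with stand-in backbones; ideally only the joint head needs
    # (re)training, which keeps the extra labeling cost limited.
    a = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
    b = nn.Sequential(nn.Linear(32, 48), nn.ReLU())
    joint = JointModel(a, b, feat_a=64, feat_b=48, num_joint_classes=10)
    logits = joint(torch.randn(4, 32))
    print(logits.shape)  # torch.Size([4, 10])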

{"title":"Why and How We Combine Multiple Deep Learning Models With Functional Overlaps","authors":"Mingliang Ma,&nbsp;Yanhui Li,&nbsp;Yingxin Chen,&nbsp;Lin Chen,&nbsp;Yuming Zhou","doi":"10.1002/smr.70003","DOIUrl":"https://doi.org/10.1002/smr.70003","url":null,"abstract":"<div>\u0000 \u0000 <p>The evolution (e.g., development and maintenance) of deep learning (DL) models has attracted much attention. One of the main challenges during the development and maintenance of DL models is model training, which often requires a lot of human resources and computing power (such as labeling costs and parameter training). In recent years, to alleviate this problem, researchers have introduced the idea of software engineering (SE) into DL. Researchers consider the DL model a new type of software, borrowing the practice of traditional software reuse, that is, focusing on the reuse of DL models to improve the quality of DL model development and maintenance. This paper focuses on more complex model reuse scenarios, where developers need to combine multiple models with functional overlaps. We explore whether the model combination technique can meet the requirements for such scenarios. We have conducted an empirical study of the research scenario and found that a model composition approach was needed to meet the requirements. Furthermore, we propose a model combination method based on concatenation-parallel called MCCP. First, the multiple models' hidden layer features are connected, and then the multiple models are connected in parallel to construct a joint model with all output categories. The joint model is trained to achieve unified requirements under the limited marking cost. Through experiments on data sets in nine domains and five model structures, the following two conclusions are drawn: (1) we observe noticeable differences (38% at most) in the performance of multiple models within overlapping category data, which calls for effective model combination techniques. (2) MCCP is more effective than the baseline, which performs the best in eight of the nine domains. Our research shows that the joint model generated by combining models with overlapping functions can meet the requirements of complex model reuse scenarios.</p>\u0000 </div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143424164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Approach to Develop Correct-by-Construction Business Process Models Using a Formal Domain Specific Language
IF 1.7 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-02-11 | DOI: 10.1002/smr.2762
Yousra Bendaly Hlaoui, Salma Ayari

As the size and complexity of business process models are an important driver of error probability, it is recommended to split large models into smaller ones. Hence, we propose, in this paper, to develop business process models by refinement. A refinement is a transformation of a source model into a target model expressed in the same modeling language. This transformation should preserve the semantics of the source model so as to provide a semantically correct target model. Thus, we propose a domain-specific language based on the Business Process Model and Notation (BPMN) language for developing, by refinement, business process models that are correct by construction. Concretely, we propose (i) a formal BPMN_R syntax defined by a context-free grammar G_{BPMN_R}, (ii) axiomatic semantics to ensure the correctness of each refinement when building business process models, and (iii) operational semantics in terms of a Kripke structure permitting formal verification of the resulting BPMN_R models to check their reliability. The Kripke structure supports the verification of behavioral requirements expressed in Computational Tree Logic (CTL) and verified with the NuSMV model checker. Based on these semantics, we prove the validity of the BPMN_R compiler we developed, which helps developers build correct-by-construction BPMN_R models and translates them into NuSMV code to establish their reliability.
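To make the verification step concrete, the sketch below checks a simple CTL reachability property, EF "done", over a toy Kripke structure derived from a refined process model. It is an illustration under assumed state names, not the authors' toolchain, which performs verification with NuSMV.

    # Toy sketch: EF reachability over a Kripke structure (illustrative states;
    # the paper's actual verification goes through NuSMV).
    from typing import Dict, List, Set

    def check_EF(transitions: Dict[str, List[str]],
                 labels: Dict[str, Set[str]],
                 state: str, prop: str) -> bool:
        """EF prop holds in `state` iff some path reaches a state labeled `prop`."""
        seen, stack = set(), [state]
        while stack:
            s = stack.pop()
            if s in seen:
                continue
            seen.add(s)
            if prop in labels.get(s, set()):
                return True
            stack.extend(transitions.get(s, []))
        return False

    # Toy process: start -> review -> (approve | reject), approve -> done.
    transitions = {"start": ["review"], "review": ["approve", "reject"],
                   "approve": ["done"], "reject": [], "done": []}
    labels = {"done": {"done"}}
    print(check_EF(transitions, labels, "start", "done"))  # True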

{"title":"An Approach to Develop Correct-by-Construction Business Process Models Using a Formal Domain Specific Language","authors":"Yousra Bendaly Hlaoui,&nbsp;Salma Ayari","doi":"10.1002/smr.2762","DOIUrl":"https://doi.org/10.1002/smr.2762","url":null,"abstract":"<div>\u0000 \u0000 <p>As the size and the complexity of business process models are an important driver of error probability, it is recommended to split large models into smaller models. Hence, we propose, in this paper, to develop business process models by refinement. A refinement is a transformation of a source model to a target model expressed in the same modeling language. This transformation should preserve the semantics of the source model to provide semantically correct target model. Thus, we propose, in this paper, a domain specific language based on Business Process Model and Notation (BPMN) language for developing by refinement business process models correct-by-construction. Hence, we propose (i) a <span></span><math>\u0000 <semantics>\u0000 <mrow>\u0000 <mi>B</mi>\u0000 <mi>P</mi>\u0000 <mi>M</mi>\u0000 <msub>\u0000 <mrow>\u0000 <mi>N</mi>\u0000 </mrow>\u0000 <mrow>\u0000 <mi>R</mi>\u0000 </mrow>\u0000 </msub>\u0000 </mrow>\u0000 <annotation>$$ BPM{N}_R $$</annotation>\u0000 </semantics></math> formal syntax throughout a context-free grammar <span></span><math>\u0000 <semantics>\u0000 <mrow>\u0000 <msub>\u0000 <mrow>\u0000 <mi>G</mi>\u0000 </mrow>\u0000 <mrow>\u0000 <mi>B</mi>\u0000 <mi>P</mi>\u0000 <mi>M</mi>\u0000 <msub>\u0000 <mrow>\u0000 <mi>N</mi>\u0000 </mrow>\u0000 <mrow>\u0000 <mi>R</mi>\u0000 </mrow>\u0000 </msub>\u0000 </mrow>\u0000 </msub>\u0000 </mrow>\u0000 <annotation>$$ {G}_{BPM{N}_R} $$</annotation>\u0000 </semantics></math>, (ii) axiomatic semantics to ensure the refinement correction when building business process models, (iii) operational semantics in terms of Kripke structure permitting formal verification of provided <span></span><math>\u0000 <semantics>\u0000 <mrow>\u0000 <mi>B</mi>\u0000 <mi>P</mi>\u0000 <mi>M</mi>\u0000 <msub>\u0000 <mrow>\u0000 <mi>N</mi>\u0000 </mrow>\u0000 <mrow>\u0000 <mi>R</mi>\u0000 </mrow>\u0000 </msub>\u0000 </mrow>\u0000 <annotation>$$ BPM{N}_R $$</annotation>\u0000 </semantics></math> models to check their reliability. The Kripke structure supports the verification of behavioral requirements represented by the Computational Tree Logic (CTL) temporal logic and verified by NuSMV model checker. Based on these semantics, we prove the validity ","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143389002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The Indispensable Role of Software Ecosystem Services
IF 1.7 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-02-06 | DOI: 10.1002/smr.70002
Casper van Schothorst, Slinger Jansen, Liza Lausberg

Software ecosystem services are essential for the sustainability and functionality of software ecosystems, but they lack comprehensive categorization, hindering further study. This study explores the concept of software ecosystem services through a systematic literature review and brief survey. Drawing analogies from natural ecosystems, we define software ecosystem services as the conditions and processes through which software ecosystems create, provide, and sustain innovation and value creation via software. Software ecosystem services are categorized into four primary types: provisioning, regulating, cultural, and supporting services.

Our findings highlight the crucial role of services that do not directly add customer value but are essential for the software ecosystem's functionality, such as authentication and authorization services, collaboration and communication platforms, and app stores. By highlighting these vital yet often overlooked services, the research identifies potential sustainability threats for software ecosystems, such as the dominance of a few major players, which mirrors the risks of monocultures in natural ecosystems. This study lays the groundwork for further research aimed at ensuring the long-term sustainability and resilience of software ecosystems.

{"title":"The Indispensable Role of Software Ecosystem Services","authors":"Casper van Schothorst,&nbsp;Slinger Jansen,&nbsp;Liza Lausberg","doi":"10.1002/smr.70002","DOIUrl":"https://doi.org/10.1002/smr.70002","url":null,"abstract":"<p>Software ecosystem services are essential for the sustainability and functionality of software ecosystems, but they lack comprehensive categorization, hindering further study. This study explores the concept of software ecosystem services through a systematic literature review and brief survey. Drawing analogies from natural ecosystems, we define software ecosystem services as the conditions and processes through which software ecosystems create, provide, and sustain innovation and value creation via software. Software ecosystem services are categorized into four primary types: provisioning, regulating, cultural, and supporting services.</p><p>Our findings highlight the crucial role of services that do not directly add customer value but are essential for the software ecosystem's functionality, such as authentication and authorization services, collaboration and communication platforms, and app stores. By highlighting these vital yet often overlooked services, the research identifies potential sustainability threats for software ecosystems, such as the dominance of a few major players, which mirrors the risks of monocultures in natural ecosystems. This study lays the groundwork for further research aimed at ensuring the long-term sustainability and resilience of software ecosystems.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/smr.70002","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143362482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Influencing Factors' Analysis for the Performance of Parallel Evolutionary Test Case Generation for Web Applications
IF 1.7 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-02-05 | DOI: 10.1002/smr.2751
Weiwei Wang, Shukai Zhang, Kepeng Qiu, Xuejun Liu, Xiaodan Li, Ruilian Zhao

Evolutionary test case generation plays a vital role in ensuring software quality and reliability. Since Web applications involve a large number of interactions between client and server, dynamic evolutionary test case generation is very time-consuming, which makes it difficult to apply in actual projects. Parallelization provides a feasible way to improve the efficiency and effectiveness of evolutionary test generation. In our previous research, the idea of parallelism was introduced into evolutionary test generation for Web applications. However, its performance is affected by many factors, such as migration scale, migration frequency, and the number of browser processes and subpopulations. Analyzing these influencing factors can guide improvements to the performance of evolutionary test generation. For this reason, this paper analyzes the factors that influence parallel evolutionary algorithms and how they affect the performance of test generation for Web applications. At the same time, different parallel evolutionary test generation methods are designed and implemented. Experiments are conducted on open-source Web applications to generate test cases that meet the server-side sensitive-path coverage criterion, providing guidance and suggestions for the parameter settings of parallel evolutionary test case generation for Web applications. The experimental results show that (1) compared with the global parallelization model, the evolutionary algorithm based on the parallel island model achieves a greater improvement in test case generation performance; in more detail, when generating test cases with the same server-side sensitive-path coverage, the number of iterations required is reduced by 49.6% and the time cost by 58.7%; (2) for test case generation based on the parallel island model, if the migration scale is large, appropriately increasing the migration frequency can reduce the time cost; (3) if the number of subpopulations is fixed, appropriately increasing the number of browser processes can reduce the time cost of Web application test case evolution, but the number of browser processes should not be too large; otherwise, it may increase the time cost.
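The migration-scale and migration-frequency parameters discussed above can be seen in the following simplified sketch of an island model, which is illustrative only and not the paper's tool: subpopulations evolve independently and periodically exchange their best individuals along a ring. The bit-counting fitness stands in for server-side sensitive-path coverage.

    # Simplified island-model sketch (illustrative; fitness, encoding, and all
    # parameter values are assumptions, not the paper's setup).
    import random

    def evolve_islands(fitness, num_islands=4, pop_size=20, generations=50,
                       migration_interval=10, migration_size=2, gene_len=16):
        islands = [[[random.randint(0, 1) for _ in range(gene_len)]
                    for _ in range(pop_size)] for _ in range(num_islands)]
        for gen in range(1, generations + 1):
            for pop in islands:
                # mutate everyone, then keep the best pop_size of parents + children
                children = [[g ^ (random.random() < 0.05) for g in ind] for ind in pop]
                pop[:] = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
            if gen % migration_interval == 0:
                # ring migration: each island sends its best individuals to the next
                migrants = [pop[:migration_size] for pop in islands]
                for i, pop in enumerate(islands):
                    incoming = migrants[(i - 1) % num_islands]
                    pop[-migration_size:] = [ind[:] for ind in incoming]
        return max((ind for pop in islands for ind in pop), key=fitness)

    # Toy fitness standing in for "server-side sensitive path coverage".
    best = evolve_islands(fitness=sum)
    print(sum(best), "of 16 bits set")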

{"title":"Influencing Factors' Analysis for the Performance of Parallel Evolutionary Test Case Generation for Web Applications","authors":"Weiwei Wang,&nbsp;Shukai Zhang,&nbsp;Kepeng Qiu,&nbsp;Xuejun Liu,&nbsp;Xiaodan Li,&nbsp;Ruilian Zhao","doi":"10.1002/smr.2751","DOIUrl":"https://doi.org/10.1002/smr.2751","url":null,"abstract":"<div>\u0000 \u0000 <p>Evolutionary test case generation plays a vital role in ensuring software quality and reliability. Since Web applications involve a large number of interactions between client and server, the dynamic evolutionary test case generation is very time-consuming, which makes it difficult to apply in actual projects. Obviously, parallelization provides a feasible way to improve the efficiency and effectiveness of evolutionary test generation. In our previous research, the idea of parallelism has been introduced into the evolutionary test generation for Web applications. However, its performance is affected by many factors, such as migration scale, migration frequency, the number of browser processes and subpopulations, and so on. The analysis of influencing factors can guide enhancing the performance of evolutionary test generation. For this reason, this paper analyzes the factors that influence parallel evolutionary algorithms and how they affect the performance of test generation for Web applications. At the same time, different parallel evolutionary test generation methods are designed and implemented. Experiments are conducted on open-source Web applications to generate test cases that meet the server-side sensitive paths coverage criterion, providing guidance and suggestions for the parameter setting of parallel evolutionary test case generation for Web applications. The experimental results show that (1) compared with the global parallelization model, the evolutionary algorithm based on the parallel island model has a greater improvement in test case generation performance. In more detail, when generating test cases with the same server-side sensitive paths coverage, the number of iterations required is reduced by 49.6%, and the time cost is reduced by 58.7%; (2) for the test case generation based on the parallel island model, if the migration scale is large, appropriately increasing the migration frequency can reduce its time cost; (3) if the number of subpopulations is fixed, appropriately increasing the number of browser processes can reduce the time cost of Web application test case evolution, but the number of browser processes should not be too large; otherwise, it may increase the time cost.</p>\u0000 </div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143248387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Structuring Semantic-Aware Relations Between Bugs and Patches for Accurate Patch Evaluation
IF 1.7 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-02-02 | DOI: 10.1002/smr.70001
Lingxiao Zhao, Hui Li, Yongqian Chen, Xiaowei Pan, Shikai Guo

Patches can help fix security vulnerabilities and optimize software performance, thereby enhancing the quality and security of the software. Unfortunately, patches generated by automated program repair (APR) tools are not always correct, as they may introduce new bugs or fail to fully rectify the original issue. Various methods for evaluating patch correctness have been proposed. However, most methods struggle to capture long-distance dependencies in patch correctness evaluation, which degrades the predictive performance of the models. To address this challenge, this paper presents a method named Qamhaen to evaluate the correctness of patches generated by APR. Specifically, the bug-and-patch text-embedding component addresses the challenge of long-distance dependencies across functions by using bug reports and patch descriptions as inputs instead of code snippets. BERT is employed for pretraining to capture these dependencies, followed by an additional multihead self-attention mechanism for further feature extraction. The similarity-evaluator component devises a similarity calculation to assess how well a patch description resolves the issues outlined in a bug report. Comprehensive experiments are conducted on a dataset containing 9135 patches with a patch correctness assessment metric, and the results demonstrate that Qamhaen outperforms baseline methods in overall performance across AUC, F1, +Recall, -Recall, and Precision. For example, Qamhaen achieves an F1 of 0.691, representing improvements of 24.2%, 22.1%, and 6.3% over the respective baseline methods.
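The similarity-evaluator idea can be sketched as follows. This is a minimal illustration, not the Qamhaen release: it embeds a bug report and a patch description with a generic pretrained BERT, mean-pools the tokens, and scores the pair by cosine similarity; the model choice, pooling strategy, and example texts are assumptions, and the paper additionally adds multihead self-attention and a trained classifier on top.

    # Minimal sketch of scoring bug-report/patch-description pairs with BERT
    # embeddings (illustrative; not the authors' architecture or weights).
    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def embed(text: str) -> torch.Tensor:
        inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
        with torch.no_grad():
            out = model(**inputs)
        # mean-pool token embeddings into one sentence vector
        return out.last_hidden_state.mean(dim=1).squeeze(0)

    bug_report = "NullPointerException when saving a project with no open editor"
    patch_desc = "Guard the save path against a missing active editor before dereferencing it"
    score = torch.nn.functional.cosine_similarity(
        embed(bug_report), embed(patch_desc), dim=0)
    print(f"semantic match score: {score.item():.3f}")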

{"title":"Structuring Semantic-Aware Relations Between Bugs and Patches for Accurate Patch Evaluation","authors":"Lingxiao Zhao,&nbsp;Hui Li,&nbsp;Yongqian Chen,&nbsp;Xiaowei Pan,&nbsp;Shikai Guo","doi":"10.1002/smr.70001","DOIUrl":"https://doi.org/10.1002/smr.70001","url":null,"abstract":"<div>\u0000 \u0000 <p>Patches can help fix security vulnerabilities and optimize software performance, thereby enhancing the quality and security of the software. Unfortunately, patches generated by automated program repair tools are not always correct, as they may introduce new bugs or fail to fully rectify the original issue. Various methods for evaluating patch correctness have been proposed. However, most methods face the challenge of capturing long-distance dependencies in patch correctness evaluation, which leads to a decline in the predictive performance of the models. To address the challenge, this paper presents a method named Qamhaen to evaluate the correctness of patches generated by APR. Specifically, text embedding of bugs and patches component address the challenge of long-distance dependencies across functions in patch correctness evaluation by using bug reports and patch descriptions as inputs instead of code snippets. BERT is employed for pretraining to capture these dependencies, followed by an additional multihead self-attention mechanism for further feature extraction. Similarity evaluator component devises a similarity calculation to assess the effectiveness of patch descriptions in resolving issues outlined in bug reports. Comprehensive experiments are conducted on a dataset containing 9135 patches and a patch correctness assessment metric, and extensive experiments demonstrate that Qamhaen outperforms baseline methods in terms of overall performance across <i>AUC</i>, <i>F1</i>, <i>+Recall</i>, <i>-Recall</i>, and <i>Precision</i>. For example, compared to the baseline, Qamhaen achieves an <i>F1</i> of 0.691, representing improvements of 24.2%, 22.1%, and 6.3% over the baseline methods, respectively.</p>\u0000 </div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 2","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143110874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Exploring ChatGPT's Potential in Java API Method Recommendation: An Empirical Study
IF 1.7 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-01-28 | DOI: 10.1002/smr.2765
Ye Wang, Weihao Xue, Qiao Huang, Bo Jiang, Hua Zhang

As software development grows increasingly complex, application programming interfaces (APIs) play a significant role in enhancing development efficiency and code quality. However, the explosive growth in the number of APIs makes it impossible for developers to become familiar with all of them. In actual development scenarios, developers may spend a significant amount of time searching for suitable APIs, which can severely impact the development process. Recently, OpenAI's large language model (LLM) based application ChatGPT has shown exceptional performance across various software development tasks, responding swiftly to instructions and generating high-quality textual responses, suggesting its potential for API recommendation tasks. This paper therefore presents an empirical study that investigates the performance of ChatGPT on query-based API recommendation tasks. Specifically, we used the existing benchmark APIBENCH-Q and a newly constructed dataset as evaluation datasets, selecting the state-of-the-art models BIKER and MULAREC for comparison with ChatGPT. Our findings demonstrate that ChatGPT outperforms existing approaches in terms of success rate, mean reciprocal rank (MRR), and mean average precision (MAP). Through a manual examination of samples in which ChatGPT exceeds baseline performance and those where it provides incorrect answers, we further substantiate ChatGPT's advantages over the baselines and identify several issues contributing to its suboptimal performance. To address these issues and enhance ChatGPT's recommendation capabilities, we employed two strategies: (1) utilizing a more advanced LLM (GPT-4) and (2) exploring a new approach, MACAR, based on the Chain-of-Thought methodology. The results indicate that both strategies are effective.
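A query-based recommendation round can be sketched as prompt construction plus light parsing of the model's reply. In the snippet below, `chat` is a placeholder for whatever LLM client is used, and the prompt wording, the `recommend_api_methods` helper, and the canned reply are all illustrative assumptions rather than the paper's setup.

    # Sketch of query-based API recommendation via an LLM (illustrative only;
    # `chat` stands in for a real client call, which is not shown here).
    from typing import Callable, List

    PROMPT_TEMPLATE = (
        "You are a Java expert. Recommend up to {k} JDK or common-library API "
        "methods for the task below. Answer with one fully qualified method "
        "signature per line, most relevant first, and no extra text.\n\n"
        "Task: {query}"
    )

    def recommend_api_methods(query: str, chat: Callable[[str], str], k: int = 5) -> List[str]:
        reply = chat(PROMPT_TEMPLATE.format(k=k, query=query))
        # keep only lines that look like method references, drop numbering/bullets
        lines = [ln.strip().lstrip("-*0123456789. ") for ln in reply.splitlines()]
        return [ln for ln in lines if "(" in ln and "." in ln][:k]

    # Usage with a canned response standing in for a real model call:
    fake_chat = lambda prompt: (
        "1. java.nio.file.Files.readAllLines(Path)\n"
        "2. java.io.BufferedReader.readLine()")
    print(recommend_api_methods("read a text file line by line", fake_chat))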

{"title":"Exploring ChatGPT's Potential in Java API Method Recommendation: An Empirical Study","authors":"Ye Wang,&nbsp;Weihao Xue,&nbsp;Qiao Huang,&nbsp;Bo Jiang,&nbsp;Hua Zhang","doi":"10.1002/smr.2765","DOIUrl":"https://doi.org/10.1002/smr.2765","url":null,"abstract":"<div>\u0000 \u0000 <p>As software development grows increasingly complex, application programming interface (API) plays a significant role in enhancing development efficiency and code quality. However, the explosive growth in the number of APIs makes it impossible for developers to become familiar with all of them. In actual development scenarios, developers may spend a significant amount of time searching for suitable APIs, which could severely impact the development process. Recently, the OpenAI's large language model (LLM) based application—ChatGPT has shown exceptional performance across various software development tasks, responding swiftly to instructions and generating high-quality textual responses, suggesting its potential in API recommendation tasks. Thus, this paper presents an empirical study to investigate the performance of ChatGPT in query-based API recommendation tasks. Specifically, we utilized the existing benchmark APIBENCH-Q and the newly constructed dataset as evaluation datasets, selecting the state-of-the-art models BIKER and MULAREC for comparison with ChatGPT. Our research findings demonstrate that ChatGPT outperforms existing approaches in terms of success rate, mean reciprocal rank (MRR), and mean average precision (MAP). Through a manual examination of samples in which ChatGPT exceeds baseline performance and those where it provides incorrect answers, we further substantiate ChatGPT's advantages over the baselines and identify several issues contributing to its suboptimal performance. To address these issues and enhance ChatGPT's recommendation capabilities, we employed two strategies: (1) utilizing a more advanced LLM (GPT-4) and (2) exploring a new approach—MACAR, which is based on the Chain of Thought methodology. The results indicate that both strategies are effective.</p>\u0000 </div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143120196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Ensuring Confidentiality in Supply Chains With an Application to Life-Cycle Assessment
IF 1.7 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-01-27 | DOI: 10.1002/smr.2763
Achim D. Brucker, Sakine Yalman

Modern supply chains of goods and services rely heavily on close collaborations between the partners within these supply chains. Consequently, there is a demand for IT systems that support collaborations between business partners, for instance, allowing joint computations for global optimizations (in contrast to the local optimizations that each partner can do on its own). Still, businesses are very reluctant to share data or connect their enterprise systems to allow such joint computation. The topmost factor that businesses name as a reason for not collaborating is security in general and, in particular, the confidentiality of business-critical data. While there are techniques (e.g., homomorphic encryption or secure multiparty computation) that allow joint computations while protecting the confidentiality of the data that flows into them, they are not widely used. One of the main problems preventing their adoption is their perceived performance overhead. In this paper, we address this problem with an approach that exploits the structure of supply chains by decomposing global computations into local groups and applying secure multiparty computation within each group. This results in a scalable (yielding a significantly smaller runtime overhead than traditional approaches) and secure (i.e., protecting the confidentiality of data provided by supply chain partners) approach for joint computations within supply chains. We evaluate our approach using life-cycle assessment (LCA) as a case study. Our experiments show that, for instance, secure LCA computations even in supply chains with 15 partners are possible in less than two minutes, while traditional approaches using secure multiparty computation need more than a day.
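The decomposition idea can be illustrated with a toy additive-secret-sharing sum run inside each group, with only the group subtotals combined globally. The group layout, the emission values, and the `secure_group_sum` helper below are assumptions for illustration, standing in for a real MPC framework rather than the paper's implementation.

    # Toy sketch: per-group secure sums (additive secret sharing) combined into a
    # global LCA total; values and group structure are made up.
    import random

    P = 2**61 - 1  # large prime modulus for additive sharing

    def secure_group_sum(private_values):
        """Each party splits its value into random shares, one per party; every
        party only ever sees shares, yet the sum of partial sums equals the total."""
        n = len(private_values)
        shares = [[0] * n for _ in range(n)]
        for i, v in enumerate(private_values):
            parts = [random.randrange(P) for _ in range(n - 1)]
            parts.append((v - sum(parts)) % P)
            for j in range(n):
                shares[j][i] = parts[j]          # party j receives one share from i
        partial = [sum(col) % P for col in shares]
        return sum(partial) % P

    # Two tiers of a supply chain, each computed locally within its group (kg CO2e).
    group_a = [120, 75, 310]        # raw-material suppliers
    group_b = [42, 58]              # assembly and packaging partners
    total = (secure_group_sum(group_a) + secure_group_sum(group_b)) % P
    print(total)  # 605, with no partner revealing its individual contribution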

{"title":"Ensuring Confidentiality in Supply Chains With an Application to Life-Cycle Assessment","authors":"Achim D. Brucker,&nbsp;Sakine Yalman","doi":"10.1002/smr.2763","DOIUrl":"https://doi.org/10.1002/smr.2763","url":null,"abstract":"<div>\u0000 \u0000 <p>Modern supply chains of goods and services rely heavily on close collaborations between the partners within these supply chains. Consequently, there is a demand for IT systems that support collaborations between business partners, for instance, allowing for joint computations for global optimizations (in contrast to local optimizations that each partner can do on their own). Still, businesses are very reluctant to share data or connect their enterprise systems to allow for such joint computation. The topmost factor that businesses name as reason for not collaborating, is their security concern in general and, in particular, the confidentiality of business critical data. While there are techniques (e.g., homomorphic encryption or secure multiparty computation) that allow joint computations <i>and</i>, at the same time, that are protecting the confidentiality of the data that flows into such a joint computation, they are not widely used. One of the main problems that prevent their adoption is their perceived performance overhead. In this paper, we address this problem by an approach that utilized the structure of supply chains by decomposing global computations into local groups, and applying secure multiparty computation within each group. This results in a scalable (resulting in a significant smaller runtime overhead than traditional approaches) <i>and</i> secure (i.e., protecting the confidentiality of data provided by supply chain partners) approach for joint computations within supply chains. We evaluate our approach using life-cycle assessment (LCA) as a case study. Our experiments show that, for instance, secure LCA computations even in supply chains with 15 partners are possible within less than two minutes, while traditional approaches using secure multiparty computation need more than a day.</p>\u0000 </div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143119858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Process-Technology Fit Decisions: Evidence From an Expert Panel and Case Studies
IF 1.7 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-01-27 | DOI: 10.1002/smr.70000
Tahir Ahmad, Amy Van Looy

Business process management (BPM) combined with new technologies can trigger both incremental and disruptive improvements in how organizations operate. More specifically, today's fourth industrial revolution can bring rapid changes in an organization's process dynamics. Our study explores differences between possible process-technology “fit” and “unfit” situations in BPM innovative projects. We extend relevant past studies and theories using a mix of qualitative techniques consisting of expert panel interviews and a case design using two field studies. Our findings reveal that, although alternative process-technology “fit” and “no-fit” situations exist, elements such as creativity, efficiency, integration, user friendliness, and proper task monitoring turn out to be the most promising factors to gain a process-technology fit. Novelty in our work includes discovering “fit” and “no-fit” factors in terms of process-technology alignment, and the development of a decision framework with a generic set of suggestions for BPM practitioners and decision makers. Our mixed-method approach is based on qualitative results by emphasizing in-depth insights and lessons learned rather than building a generalizable theory. We intend to guide managers and decision makers to help them think about possible directions, as suggested by our experts and case participants at the time of their technology adoption in a BPM context.

{"title":"Process-Technology Fit Decisions: Evidence From an Expert Panel and Case Studies","authors":"Tahir Ahmad,&nbsp;Amy Van Looy","doi":"10.1002/smr.70000","DOIUrl":"https://doi.org/10.1002/smr.70000","url":null,"abstract":"<div>\u0000 \u0000 <p>Business process management (BPM) combined with new technologies can trigger both incremental and disruptive improvements in how organizations operate. More specifically, today's fourth industrial revolution can bring rapid changes in an organization's process dynamics. Our study explores differences between possible process-technology “fit” and “unfit” situations in BPM innovative projects. We extend relevant past studies and theories using a mix of qualitative techniques consisting of expert panel interviews and a case design using two field studies. Our findings reveal that, although alternative process-technology “fit” and “no-fit” situations exist, elements such as creativity, efficiency, integration, user friendliness, and proper task monitoring turn out to be the most promising factors to gain a process-technology fit. Novelty in our work includes discovering “fit” and “no-fit” factors in terms of process-technology alignment, and the development of a decision framework with a generic set of suggestions for BPM practitioners and decision makers. Our mixed-method approach is based on qualitative results by emphasizing in-depth insights and lessons learned rather than building a generalizable theory. We intend to guide managers and decision makers to help them think about possible directions, as suggested by our experts and case participants at the time of their technology adoption in a BPM context.</p>\u0000 </div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143119857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Dissecting Code Features: An Evolutionary Analysis of Kernel Versus Nonkernel Code in Operating Systems
IF 1.7 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-01-25 | DOI: 10.1002/smr.2752
Yangyang Zhao, Chenglin Li, Zhifei Chen, Zuohua Ding

Understanding the evolution of software systems is crucial for advancing software engineering practices. Many studies have been devoted to exploring software evolution. However, they primarily treat software as a single entity and overlook the inherent differences between subsystems, which may lead to biased conclusions. In this study, we explore variations between subsystems by investigating the code feature differences between kernel and nonkernel components from an evolutionary perspective. Based on three operating systems as case studies, we examine multiple dimensions, including code churn characteristics and inherent code characteristics. The main findings are as follows: (1) The proportion of kernel code remains relatively small and exhibits consistent stability across the majority of versions as systems evolve. (2) Kernel code exhibits higher stability than nonkernel code, characterized by a lower modification rate and finer modification granularity. The patterns of modification activities are similar in kernel and nonkernel code, with a preference for changing code and a tendency to avoid combining added and deleted code. (3) The cumulative code size and complexity of kernel files show an upward trajectory as the system evolves. (4) Kernel files exhibit significantly higher code density and complexity than nonkernel files, featuring more code lines, comments, and statements, along with a larger program length, vocabulary, and volume. Conversely, kernel functions prioritize modularity and maintainability, with a significantly smaller size and lower complexity than nonkernel functions. These insights contribute to a deeper understanding of the dynamics within operating system codebases and highlight the necessity of targeted maintenance strategies for different subsystems.
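As a rough illustration of how churn measurements can be split by subsystem, the sketch below (not the authors' tooling) parses `git log --numstat` output and buckets added and deleted lines by path prefix; the kernel-path prefixes and the `churn_by_subsystem` helper are assumptions that would differ per operating system.

    # Sketch: kernel vs. nonkernel code churn from git history (illustrative
    # path prefixes; real studies would use each system's own layout).
    import subprocess
    from collections import defaultdict

    KERNEL_PREFIXES = ("kernel/", "mm/", "sched/")   # assumed classification

    def churn_by_subsystem(repo_path: str):
        out = subprocess.run(
            ["git", "-C", repo_path, "log", "--numstat", "--pretty=format:"],
            capture_output=True, text=True, check=True).stdout
        churn = defaultdict(lambda: {"added": 0, "deleted": 0})
        for line in out.splitlines():
            parts = line.split("\t")
            if len(parts) != 3 or not (parts[0].isdigit() and parts[1].isdigit()):
                continue                  # skip blanks and binary-file entries
            added, deleted, path = int(parts[0]), int(parts[1]), parts[2]
            bucket = "kernel" if path.startswith(KERNEL_PREFIXES) else "nonkernel"
            churn[bucket]["added"] += added
            churn[bucket]["deleted"] += deleted
        return dict(churn)

    # Usage: print(churn_by_subsystem("/path/to/os/repo"))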

{"title":"Dissecting Code Features: An Evolutionary Analysis of Kernel Versus Nonkernel Code in Operating Systems","authors":"Yangyang Zhao,&nbsp;Chenglin Li,&nbsp;Zhifei Chen,&nbsp;Zuohua Ding","doi":"10.1002/smr.2752","DOIUrl":"https://doi.org/10.1002/smr.2752","url":null,"abstract":"<div>\u0000 \u0000 <p>Understanding the evolution of software systems is crucial for advancing software engineering practices. Many studies have been devoted to exploring software evolution. However, they primarily treat software as an entire entity and overlook the inherent differences between subsystems, which may lead to biased conclusions. In this study, we attempt to explore variations between subsystems by investigating the code feature differences between kernel and nonkernel components from an evolutionary perspective. Based on three operating systems as case studies, we examine multiple dimensions, including the code churn characteristics and code inherent characteristics. The main findings are as follows: (1) The proportion of kernel code remains relatively small, and exhibits consistent stability across the majority of versions as systems evolve. (2) Kernel code exhibits higher stability in contrast to nonkernel code, characterized by a lower modification rate and finer modification granularity. The patterns of modification activities are similar in both kernel and nonkernel code, with a preference of changing code and a tendency to avoid the combination of adding and deleting code. (3) The cumulative code size and complexity of kernel files show an upward trajectory as the system evolves. (4) Kernel files exhibit a significantly higher code density and complexity than nonkernel files, featuring a greater number of code line, comments, and statements, along with a larger program length, vocabulary, and volume. Conversely, kernel functions prioritize modularity and maintainability, with a significantly smaller size and lower complexity than nonkernel functions. These insights contribute to a deeper understanding of the dynamics within operating system codebases and highlight the necessity of targeted maintenance strategies for different subsystems.</p>\u0000 </div>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"37 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2025-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143119356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0