Maintaining software quality is a significant challenge as software complexity increases with the growth of the software industry. Software defects are a primary concern in complex modules, and predicting them in the early stages of the software development life cycle (SDLC) is difficult. Previous techniques for this problem have shown limited promise. We propose "A hybrid ensemble model for software defect prediction using AI‐based techniques with feature preservation" to overcome this problem. We use National Aeronautics and Space Administration (NASA) datasets from the PROMISE repository for testing and validation. Through exploratory data analysis (EDA), feature engineering, scaling, and standardization, we found that the datasets are imbalanced, which can negatively affect model performance. To address this, we apply the Synthetic Minority Oversampling Technique (SMOTE) combined with the edited nearest neighbor (ENN) rule (SMOTE‐ENN). We also use recursive feature elimination with cross‐validation (RFE‐CV) inside a pipeline, to prevent data leakage during cross‐validation, and kernel principal component analysis (K‐PCA) to reduce dimensionality and select relevant features. The reduced‐dimensional data is then given to eXtreme Gradient Boosting (XGBoost) for classification, yielding the hybrid‐ensemble (SMERKP‐XGB) model. The proposed SMERKP‐XGB model outperforms previously developed models in accuracy (CM1: 97.53%, PC1: 92.05%, PC2: 97.45%, KC1: 95.65%), area under the receiver operating characteristic curve (CM1: 96.30%, PC1: 98.30%, PC2: 99.30%, KC1: 93.54%), and other evaluation criteria reported in the literature.
Mohd Mustaqeem, Tamanna Siddiqui, Suhel Mustajab, "A hybrid‐ensemble model for software defect prediction for balanced and imbalanced datasets using AI‐based techniques with feature preservation: SMERKP‐XGB," Journal of Software: Evolution and Process, DOI: 10.1002/smr.2731, published 2024-09-18.
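As a concrete reading of the pipeline this abstract describes, the sketch below chains SMOTE‐ENN resampling, RFE with cross‐validation, kernel PCA, and XGBoost inside a single imblearn pipeline so that resampling and selection are fitted on training folds only. The dataset path, the "defects" label column, and all hyperparameters (component count, fold counts) are illustrative assumptions, not the authors' configuration.

```python
# Sketch of the described SMERKP-XGB chain under assumed names: the CSV path,
# "defects" label column, and all hyperparameters are placeholders.
import pandas as pd
from imblearn.combine import SMOTEENN
from imblearn.pipeline import Pipeline
from sklearn.decomposition import KernelPCA
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier

df = pd.read_csv("cm1.csv")                              # placeholder dataset
X, y = df.drop(columns=["defects"]), df["defects"].astype(int)

pipe = Pipeline([
    ("scale", StandardScaler()),                         # standardization
    ("resample", SMOTEENN(random_state=42)),             # SMOTE + ENN cleaning
    ("rfecv", RFECV(XGBClassifier(eval_metric="logloss"),
                    step=1, cv=3, scoring="roc_auc")),   # RFE with internal CV
    ("kpca", KernelPCA(n_components=10, kernel="rbf")),  # kernel PCA reduction
    ("clf", XGBClassifier(eval_metric="logloss")),       # final classifier
])

# Running the whole pipeline inside the outer CV keeps resampling and feature
# selection fitted on training folds only, which is what avoids leakage.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
print(cross_val_score(pipe, X, y, cv=cv, scoring="roc_auc").mean())
```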
Large language models (LLMs) have been touted as enablers of increased productivity in many areas of today's work life. Scientific research is no exception: the potential of LLM‐based tools to assist in the daily work of scientists has become a highly discussed topic across disciplines. However, this subject of study is only at its onset, and it is still unclear how the potential of LLMs will materialize in research practice. With this study, we provide first empirical evidence on the use of LLMs in the research process. We investigated a set of use cases for LLM‐based tools in scientific research and conducted a first study to assess the degree to which current tools are helpful. In this position paper, we report specifically on use cases related to software engineering: generating application code and developing scripts for data analytics and visualization. Although we studied seemingly simple use cases, results differ significantly across tools. Our results highlight the promise of LLM‐based tools in general, yet we also observe various issues, particularly regarding the integrity of the output these tools provide.
Mohamed Nejjar, Luca Zacharias, Fabian Stiehle, Ingo Weber, "LLMs for science: Usage for code generation and data analysis," Journal of Software: Evolution and Process, DOI: 10.1002/smr.2723, published 2024-09-13.
Jingwen Zhao, Yanxia Wu, Yun Feng, Jibin Dong, Changting Shi
Atomicity violation bugs are a frequent problem in concurrent programs. Because of the unpredictable nature of thread interleaving, most current methods cannot differentiate between harmful and benign atomicity violations, which makes it challenging to determine whether an actual bug exists. This paper presents UserTrack, a method for detecting atomicity violation bugs based on user interaction. First, UserTrack matches access interleaving patterns to identify all potential violations. We then verify the programmer's atomicity intent between two logical access operations on the same thread and filter out some candidates. Finally, a user interaction mechanism checks the paths that do not produce atomicity violations when threads are interleaved. Our method focuses on a small number of interesting states and interleavings, while allowing the programmer to impose constraints on thread interleavings and explore all executions that satisfy these constraints. We continuously gather information through user feedback and detect atomicity violation bugs that truly go against the programmer's intent. We evaluated the method on benchmark programs, which demonstrated the effectiveness of UserTrack in detecting atomicity violations.
Jingwen Zhao, Yanxia Wu, Yun Feng, Jibin Dong, Changting Shi, "Detect atomicity violations in concurrent programs through user assistance and identification of suspicious variable access patterns," Journal of Software: Evolution and Process, DOI: 10.1002/smr.2725, published 2024-09-04.
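UserTrack itself is not sketched here; the toy below merely illustrates the class of bug it targets: a check‐then‐act pair on one thread whose atomicity intent a remote write can violate. Whether the bad interleaving fires (harmful) or not (benign) depends on timing, which is exactly why the distinction is hard to make statically.

```python
# Toy check-then-act atomicity violation (illustration only, not UserTrack).
# Each single access to `shared` is atomic in CPython, but the *pair* of
# accesses in reader() is not: writer() can set `shared` to None between
# the check and the use, raising TypeError under an unlucky interleaving.
import threading

shared = "payload"

def reader():
    for _ in range(100_000):
        if shared is not None:   # access 1: the check
            _ = len(shared)      # access 2: the use (races with writer)

def writer():
    global shared
    for _ in range(100_000):
        shared = None            # remote write that can invalidate the check
        shared = "payload"

t1 = threading.Thread(target=reader)
t2 = threading.Thread(target=writer)
t1.start(); t2.start(); t1.join(); t2.join()
# Fix: hold one lock across both accesses so the pair becomes atomic.
```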
Yue Li, He Zhang, Lanxin Yang, Liming Dong, Juzheng Zhang, Bohan Liu
Software engineers play a central role throughout the software development lifecycle. Their activities directly impact the quality, performance, and successful delivery of software products, in particular for enterprises that emphasize high levels of quality assurance and timely delivery. Proper incentives that motivate software engineers are vital to secure and continuously improve development productivity and software quality. However, most existing research ignores positive incentives for software engineers, especially industry‐oriented research. In addition, existing research relies largely on peer assessment and lacks objectivity and transparency. To this end, this study investigates the process of contribution measurement for software engineers in a global Information and Communications Technology (ICT) enterprise, to explore the practical experiences and significance of contribution measurement. We investigated contribution measurement practices through multiple methods, including archival analysis, interviews, and a survey. A total of 22 software engineers were interviewed to understand the practical implementation of contribution measurement and its impact on software processes as well as engineers. In addition, 74 questionnaire responses were collected and used for a comprehensive impact analysis on software engineers. The analysis reveals five benefits of contribution measurement for software development processes and four benefits for practitioners in the studied enterprise. In addition, this study reports best practices of contribution measurement, such as team‐specific measurements, and provides a practical reference for researchers and organizations interested in studying or performing contribution measurement.
Yue Li, He Zhang, Lanxin Yang, Liming Dong, Juzheng Zhang, Bohan Liu, "Measuring software engineer's contribution in practice: An industrial experience report," Journal of Software: Evolution and Process, DOI: 10.1002/smr.2722, published 2024-09-04.
Ruilian Zhao, Shukai Zhang, Zhifan Zhu, Ying Shang, Weiwei Wang
End‐to‐end (E2E) testing is a commonly used technique for web application testing. Unlike traditional unit tests, E2E tests cover scenarios that span the entire business, requiring various components and services to integrate and collaborate to make the whole application work. As a result, one drawback of E2E tests is their long execution time, which seriously affects the testing efficiency of web applications. To speed up the execution of E2E tests for web applications, this paper proposes a test execution optimization approach based on page state reuse. By analyzing the common operations of E2E tests, a common prefix tree is constructed to organize the shared operations among test scripts in a web application's test suite. Guided by the prefix tree, the optimal reusable state is identified and duplicated to maximize the utilization of page states triggered by the same operations, thereby reducing the overall execution time of the test suite. In addition, to reuse page states automatically, we design a browser process replication strategy that queries the active page and duplicates the web page. To verify the effectiveness of our method, we conducted experiments on 347 E2E tests from eight open‐source web applications; the results show that our approach reduces E2E test execution time by 52%–68%.
Ruilian Zhao, Shukai Zhang, Zhifan Zhu, Ying Shang, Weiwei Wang, "E2E test execution optimization for web application based on state reuse," Journal of Software: Evolution and Process, DOI: 10.1002/smr.2714, published 2024-09-02.
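To make the prefix-tree idea concrete, here is a small sketch: test scripts are treated as sequences of operations, organized in a trie, and any prefix shared by at least two tests marks a page state worth snapshotting and reusing. The data structure and names are illustrative, not the authors' implementation.

```python
# Build a common prefix tree over E2E operation sequences and report the
# prefixes shared by >= min_share tests: candidates for page-state reuse.
from collections import defaultdict

def make_node():
    return {"children": defaultdict(make_node), "count": 0}

def build_prefix_tree(test_suite):
    root = make_node()
    for ops in test_suite:
        node = root
        for op in ops:
            node = node["children"][op]
            node["count"] += 1          # how many tests share this prefix
    return root

def reusable_prefixes(node, prefix=(), min_share=2):
    """Yield operation prefixes shared by >= min_share tests; the page state
    they lead to can be duplicated instead of re-executed per test."""
    for op, child in node["children"].items():
        path = prefix + (op,)
        if child["count"] >= min_share:
            yield path, child["count"]
        yield from reusable_prefixes(child, path, min_share)

suite = [
    ("open /", "login alice", "open /cart", "checkout"),
    ("open /", "login alice", "open /orders"),
    ("open /", "login bob", "open /cart"),
]
for path, n in reusable_prefixes(build_prefix_tree(suite)):
    print(n, "tests share:", " -> ".join(path))
```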
Barbara Gallina, Gergő László Steierhoffer, Thomas Young Olesen, Eszter Parajdi, Mike Aarup
Legislation imposes requirements on the manufacturing of machinery. Typically, these requirements are interpreted and refined by (domain‐specific) technical committees and published as standards. At the company level, these refined requirements are further interpreted, refined, and documented as internal processes. Due to the proliferation of (interdependent) legislations and standards and the consequent increase in cognitive complexity, manual knowledge management at the company level is becoming more and more challenging and requires automated decision support. Despite the availability of approaches aimed at automating decision support, none offers a satisfactory solution. In this paper, we focus on knowledge management for process compliance and propose a novel structured ontology. Our ontology aims at mastering (by dividing and conquering via tracing) the cognitive complexity of the compliance problem when heterogeneous and sometimes geographically distributed knowledge‐driven organizational structures (legal department, standardization department, etc.) are involved and need to communicate. We also illustrate the potential usefulness of our proposed ontology in the context of pump manufacturing and safety process compliance with the Machinery Directive and related harmonized standards, including EN 809:1998+A1. Specifically, we first identify the competencies that characterize departments and interdepartmental interactions, then formulate an initial set of competency questions that translate those identified competencies, and then show how the ontology can be exploited to retrieve the answers to these questions and how the answers can be exploited to build a justification for compliance. Precisely, we propose an argumentation pattern given in two different argumentation notations and show how it can be partly instantiated by exploiting the returned answers. The illustration also partly covers compliance with the Machinery Regulation, expected to replace the Machinery Directive by January 2027. Finally, we sketch our intended future work.
Barbara Gallina, Gergő László Steierhoffer, Thomas Young Olesen, Eszter Parajdi, Mike Aarup, "Towards an ontology for process compliance with the (machinery) legislations," Journal of Software: Evolution and Process, DOI: 10.1002/smr.2728, published 2024-08-29.
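The "divide and conquer via tracing" idea can be illustrated with a minimal sketch: legislation clauses are refined by standard clauses, which internal processes implement, and a competency question is answered by following the trace links backwards. All identifiers (clause IDs, process names) below are invented for illustration and are not taken from the paper or the standards.

```python
# Minimal traceability sketch (illustrative identifiers only): answer the
# competency question "which internal processes address this legal clause?"
refines = {                 # standard clause -> legislation clause it refines
    "EN809:5.2": "MD:AnnexI-1.3.2",
    "EN809:5.3": "MD:AnnexI-1.7.4",
}
implements = {              # internal process -> standard clauses it covers
    "PumpSafetyDesignProcess": ["EN809:5.2"],
    "UserManualProcess": ["EN809:5.3"],
}

def processes_for(legal_clause):
    """Trace legislation -> standard -> internal process."""
    return [proc for proc, clauses in implements.items()
            if any(refines.get(c) == legal_clause for c in clauses)]

print(processes_for("MD:AnnexI-1.3.2"))  # -> ['PumpSafetyDesignProcess']
```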
Xiaoxue Wu, Shiyu Weng, Bin Zheng, Wei Zheng, Xiang Chen, Xiaobin Sun
As software grows in size and complexity, software vulnerabilities are increasing, leading to a range of serious security issues. Open‐source software vulnerability reports and documentation offer researchers great convenience for analysis and detection. However, the quality of different data sources varies, and the data are duplicated and lack correlation, often requiring extensive manual management and analysis. To address the scattered, heterogeneous, and uncorrelated data in traditional vulnerability repositories, this paper proposes a software vulnerability feature knowledge extraction method that combines the N‐gram model with mask similarity. The method generates masked text from extracted N‐gram candidate keywords and extracts vulnerability feature knowledge by calculating the similarity of the masked text. It analyzes samples efficiently and stably even with large and complex sample sets and can obtain high‐value semi‐structured data. The final node, relationship, and attribute information is then obtained by a second round of knowledge cleaning and extraction over the semi‐structured results. Based on the extraction results, a software vulnerability domain knowledge graph is constructed to deeply explore the semantic features and entity relationships of vulnerabilities, which helps to study software security problems efficiently and to resolve vulnerabilities. The effectiveness and superiority of the proposed method are verified by comparison with several traditional keyword extraction algorithms on Common Weakness Enumeration (CWE) and Common Vulnerabilities and Exposures (CVE) vulnerability data.
Xiaoxue Wu, Shiyu Weng, Bin Zheng, Wei Zheng, Xiang Chen, Xiaobin Sun, "NG_MDERANK: A software vulnerability feature knowledge extraction method based on N‐gram similarity," Journal of Software: Evolution and Process, DOI: 10.1002/smr.2727, published 2024-08-27.
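The mask-similarity idea can be sketched as follows: for each N-gram candidate, mask it out of the document and measure how much the document representation changes; candidates whose removal moves the representation the most are ranked as key features. TF-IDF stands in here for whatever embedding the paper actually uses, and all names are ours, so this is a sketch of the general technique rather than NG_MDERANK itself.

```python
# Rank N-gram candidates by how much masking them changes the document vector.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def ngrams(tokens, n):
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def rank_by_mask_similarity(doc, corpus, max_n=3):
    tokens = doc.lower().split()
    candidates = set().union(*(ngrams(tokens, n) for n in range(1, max_n + 1)))
    masked = {c: doc.lower().replace(c, " ") for c in candidates}
    vec = TfidfVectorizer().fit(corpus + [doc])
    d = vec.transform([doc.lower()])
    # lower similarity between doc and masked doc => more important candidate
    scores = {c: 1 - cosine_similarity(d, vec.transform([m]))[0, 0]
              for c, m in masked.items()}
    return sorted(scores, key=scores.get, reverse=True)

cve = "buffer overflow in the packet parser allows remote code execution"
print(rank_by_mask_similarity(cve, [cve])[:5])
```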
Richard Messnarz, Vesna Djordjevic, Viktor Grémen, Winifred Menezes, Ahmed Alborae, Rainer Dreves, So Norimatsu, Thomas Wegner, Bernhard Sechser
This paper documents the results of the PIM.3 (Process Improvement Management) working group in INTACS (International Assessor Certification Schema), supported by the VDA‐QMC (Verband der Deutschen Automobilindustrie/German Automotive Association–Quality Management Center). INTACS promotes Automotive SPICE, an international standard for assessing the process capability of projects that implement systems integrating mechanics, electronics, and software, optionally including cybersecurity, functional safety, and machine learning. The paper outlines that, for the first time in more than 20 years, INTACS and the VDA‐QMC have included a process like PIM.3 Process Improvement Management in the scope of the assessor training. Before that, assessments focused on the management, engineering, and support processes of series projects, while improvement management was neither trained nor assessed.
Richard Messnarz, Vesna Djordjevic, Viktor Grémen, Winifred Menezes, Ahmed Alborae, Rainer Dreves, So Norimatsu, Thomas Wegner, Bernhard Sechser, "The PIM.3 process improvement process—Part of the iNTACS certified process expert training," Journal of Software: Evolution and Process, DOI: 10.1002/smr.2726, published 2024-08-27.
In software development and maintenance, code comments help developers understand source code and improve communication among developers. However, developers sometimes neglect to update the corresponding comment when changing code, resulting in outdated comments (i.e., comments inconsistent with the code). Outdated comments are dangerous and harmful: they may mislead subsequent developers and, more seriously, may lead to a fatal flaw in the future. To automatically identify outdated comments in source code, we propose a learning‐based method, called CoCC, to detect the consistency between code and comments. To identify outdated comments efficiently, we extract multiple features from both the code and the comments before and after they change. Our model also considers the relation between code and comment. Experiment results show that CoCC can effectively detect outdated comments with precision over 90%. In addition, we have identified the 15 most important factors that cause outdated comments and verified the applicability of CoCC across programming languages. We also used CoCC to find outdated comments in the latest commits of open‐source projects, which further proves the effectiveness of the proposed method.
Yuan Huang, Yinan Chen, Xiangping Chen, Xiaocong Zhou, "Are your comments outdated? Toward automatically detecting code‐comment consistency," Journal of Software: Evolution and Process, DOI: 10.1002/smr.2718, published 2024-08-27.
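As intuition for the kind of change-aware feature such a detector can use, the sketch below computes one hand-crafted signal: lexical overlap between a comment and the code before versus after an edit, where a large drop suggests the comment was not updated. CoCC's actual learned features are richer; the tokenizer, threshold, and example here are illustrative assumptions.

```python
# One illustrative code-comment consistency feature (not CoCC itself).
import re

def tokens(text):
    # split identifiers on underscores/digits: age_in_years -> age, in, years
    return set(re.findall(r"[a-zA-Z]+", text.lower()))

def overlap(comment, code):
    c = tokens(comment)
    return len(c & tokens(code)) / len(c) if c else 0.0

def maybe_outdated(comment, code_before, code_after, drop=0.1):
    # flag when the comment's overlap with the code falls by >= drop
    before, after = overlap(comment, code_before), overlap(comment, code_after)
    return (before - after) >= drop, before, after

comment = "# returns the user's age in years"
old_code = "def age_in_years(user): return user.age"
new_code = "def age_in_days(user): return user.age * 365"
print(maybe_outdated(comment, old_code, new_code))  # flags a possible mismatch
```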
Ghulam Murtaza Khan, Siffat Ullah Khan, Mahmood Niazi, Muhammad Ilyas, Mamoona Humayun, Akash Ahmad, Javed Ali Khan, Sajjad Mahmood
Global software development (GSD) refers to developing software with a distributed team spanning multiple locations and time zones. Based on relationships, there are four types of outsourcing: dyadic (one client–one vendor), multi‐vendor (one client–many vendors), co‐sourcing (many clients–one vendor), and complex outsourcing (many clients–many vendors). Compared to the other types of outsourcing contracts, complex outsourcing contracts are the hardest to work on and carry the highest risk of project failure. This paper presents the complex outsourcing relationships management model (CORMM) to assist complex outsourcing stakeholders (both clients and vendors) in managing their relationships in the context of GSD; we are also interested in the applicability and effectiveness of CORMM in real‐world industry. The research follows a structured methodology comprising multiple phases. Initially, it leverages a systematic literature review (SLR) as its primary research method. The second phase validates the SLR findings via an empirical study. Subsequently, in the third phase, the model is developed. Finally, the proposed approach is validated through two industrial case studies that assess the organizations' relationship management using the Motorola tool. The case study results show that CORMM can successfully point out relationship management issues in a complex outsourcing context. Feedback from participants of both companies provides several positive and valuable insights about CORMM and its application to complex outsourcing relationships. The results highlight that CORMM serves both as an assessment tool for evaluating an organization's relationship management capability and as a means for organizations to enhance their position. Through CORMM, complex outsourcing organizations (many clients–many vendors) can identify strengths and weaknesses in their relationship management practices, enabling targeted improvement efforts.
Ghulam Murtaza Khan, Siffat Ullah Khan, Mahmood Niazi, Muhammad Ilyas, Mamoona Humayun, Akash Ahmad, Javed Ali Khan, Sajjad Mahmood, "Complex outsourcing relationships management model," Journal of Software: Evolution and Process, DOI: 10.1002/smr.2724, published 2024-08-26.