Pub Date: 2025-11-25 | DOI: 10.1016/j.jss.2025.112713
Chunyong Zhang , Liangwei Yao
Software vulnerability detection is an essential part of cybersecurity, and its basis is source code analysis. Under the assumption that sufficient labeled data is available for training, machine learning and deep learning can find vulnerabilities automatically and effectively. However, due to the lack of high-quality labeled data, the vulnerability detection performance of some deep learning-based methods is poor. We therefore propose a way to learn feature representations from multi-domain heterogeneous vulnerability data, which forms a basis for transferring knowledge and improves vulnerability detection performance. First, we use the real-world data Ffmqem and the synthetic data SARD as data sources to learn feature extractors for representing the target function. Next, we obtain the combined vector representation of the target code function for training the vulnerability classifier. Finally, we conducted extensive experiments on two open-source projects. The results show that OLL achieves better vulnerability detection performance in scenarios with little labeled data, improving the F1-Score by 4.1% and 3.1% compared with SDV. This demonstrates that learning combined vector representations from two heterogeneous data sources compensates for the small amount of labeled data and facilitates learning representations of vulnerable functions.
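The pipeline the abstract describes (two domain-specific feature extractors whose outputs are combined into one vector for a downstream classifier) can be sketched as follows. The weights, dimensions, and function names below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the two feature extractors learned from real-world (Ffmqem)
# and synthetic (SARD) data; random weights here, purely for illustration.
W_real = rng.standard_normal((16, 8))
W_syn = rng.standard_normal((16, 8))

def combined_representation(code_vec):
    # Embed the target code function with each domain extractor, then
    # concatenate the two embeddings into one joint vector that the
    # vulnerability classifier consumes.
    real_part = np.tanh(code_vec @ W_real)
    syn_part = np.tanh(code_vec @ W_syn)
    return np.concatenate([real_part, syn_part])

code_vec = rng.standard_normal(16)  # toy encoding of a target function
rep = combined_representation(code_vec)
print(rep.shape)  # (16,)
```

Any standard classifier (e.g., logistic regression) could then be trained on `rep` vectors with the available labels.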
Title: "Only less labeled: How to learn representations from multi-domain" (Journal of Systems and Software, Volume 233, Article 112713).
Pub Date: 2025-11-22 | DOI: 10.1016/j.jss.2025.112708
Gregorio Robles , Jonas Gamalielsson , Björn Lundell , Christoffer Brax , Tomas Persson , Anders Mattsson , Tomas Gustavsson , Jonas Feist , Jonas Öberg
Context: IoT standards are vital for interoperability and longevity, with Open Source Software (OSS) implementations preventing vendor lock-in. These implementations form vast software ecosystems on platforms like GitHub, where industrial participation is crucial. Goal: This study characterizes industrial involvement (participation, leadership, collaboration) across the software ecosystems of four IoT standards (LwM2M, NB-IoT, CoAP, Zigbee) from different standards-setting organizations. It also investigates how software licensing, particularly OSS licenses, reflects and shapes this involvement. Method: We analyzed software projects related to these standards that are publicly available on the GitHub platform, examining authorship of commits, bug reports, pull requests, and metadata like licenses. We identified organizational affiliations (corporate or academic) of contributors to assess their presence and leadership. We performed a licensing analysis to understand the legal frameworks governing these projects. Results: Our research shows significant diversity in ecosystem scale and activity, with a consistent pattern of major corporate and organizational leadership in highly active projects. Despite robust institutional involvement, a pervasive issue is the widespread absence of explicit software licenses, even in collaborative and active repositories. When licenses are present, permissive OSS licenses (e.g., Apache-2.0, MIT) dominate. This indicates a complex and often ambiguous legal landscape. Conclusion: IoT standard ecosystem growth is driven by established organizations. Addressing the prevalent lack of licensing is crucial for fostering clearer collaboration, mitigating legal risks, and ensuring long-term sustainability and adoption of these foundational technologies.
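The licensing part of the method can be illustrated with a toy tally of explicit licenses over repository metadata. The repository records below are invented examples, not data from the study, which gathered such metadata from the GitHub platform:

```python
from collections import Counter

# Invented repository metadata; in the study this comes from GitHub.
repos = [
    {"name": "coap-impl-a", "license": "Apache-2.0"},
    {"name": "lwm2m-client", "license": "MIT"},
    {"name": "zigbee-stack", "license": None},   # no explicit license
    {"name": "nbiot-tools", "license": None},    # no explicit license
]

# Count each explicit license, folding missing ones into a single bucket.
by_license = Counter(r["license"] or "NO LICENSE" for r in repos)
missing_share = by_license["NO LICENSE"] / len(repos)

print(by_license.most_common())
print(f"{missing_share:.0%} of repos lack an explicit license")  # 50%
```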
Title: "A comparative analysis of industrial involvement and licensing in the open source software ecosystems of four IoT standards" (Journal of Systems and Software, Volume 234, Article 112708).
Pub Date: 2025-11-22 | DOI: 10.1016/j.jss.2025.112715
Rafael José Moura , Maria Gizele Nascimento , Fumio Machida , Domenico Cotroneo , Ermeson Andrade
Software aging is characterized by the gradual degradation of system reliability and performance due to software issues such as memory leaks, resource exhaustion, or the accumulation of numerical errors over time. This phenomenon can lead to critical failures in production environments, making efficient aging detection essential to ensure system reliability. Several existing studies have explored methods for identifying software aging, with a particular focus on the use of Machine Learning (ML) algorithms. As a variety of ML algorithms has been used recently, it is necessary to understand the state of the art and the main trends in this domain. This study aims to classify software aging detection approaches and techniques that use ML through a Systematic Mapping Study (SMS). As key outcomes, we identify the most commonly used algorithms, the most popular aging indicators, the open datasets available for software aging detection research, the challenges faced by the field, and new directions for future investigations. We expect this work to contribute meaningfully to the software aging field by providing new research perspectives, practical insights, and guidance applicable to real-world scenarios, supporting both researchers and software practitioners.
Title: "Machine learning for software aging detection: A systematic mapping study" (Journal of Systems and Software, Volume 234, Article 112715).
Pub Date: 2025-11-22 | DOI: 10.1016/j.jss.2025.112716
Yannick Lindebauer , Richard von Esebeck , Thomas Vietor
In the automotive industry, vehicles are increasingly configured by software to accommodate individual customer preferences. This leads to a growing number of software variants and calibration parameters that must be managed consistently. Efficient handling of this variability benefits from the application of (systems and) software product line engineering (SPLE), as it offers structured mechanisms for reuse, traceability, and systematic variability management. These benefits are particularly relevant for electronic control unit (ECU) configuration, where maintaining consistency across features, parameters, and vehicle variants is crucial. However, existing approaches in industry, e.g. those relying on configurable bills of materials, indicate that SPLE has yet to see widespread adoption in the configuration of ECUs. Instead, proprietary approaches dominate, often lacking integration, transparency, and scalability. This study investigates the reasons why SPLE is not widely adopted for this use case. The underlying hypothesis suggests that industrial requirements are not accurately captured in academic research, making the implementation of scientific SPLE concepts difficult. To examine this assumption, a systematic literature review was conducted, analyzing relevant publications. The findings indicate a pressing need for closer collaboration between industry and academia to better identify challenges and requirements. Furthermore, current and emerging developments, such as software-defined vehicles (SDVs), require greater consideration in ECU configuration research. Our hypothesis was largely confirmed, indicating that SPLE research must be further extended and refined to meet practical ECU configuration needs. Accordingly, a concise, end-to-end methodology is needed to support SPLE-based calibration processes in SDV environments with increasingly decoupled hardware and software.
Title: "Automotive software product lines for ECU software configuration: A systematic literature review" (Journal of Systems and Software, Volume 234, Article 112716).
Pub Date: 2025-11-20 | DOI: 10.1016/j.jss.2025.112700
Ruishi Huang , Binbin Yang , Shumei Wu , Zheng Li , Doyle Paul , Xiao-Yi Zhang , Xiang Chen , Yong Liu
Fault Localization (FL) aims to reduce the cost of manual debugging by highlighting the statements that are most likely responsible for observed failures. However, existing techniques have limited effectiveness in practice due to inflexible suspiciousness evaluations and oversimplified representations of execution information. In this paper, we propose GraMuS, a novel Graph representation learning and Multimodal information based technique for Statement-level FL. GraMuS comprises two key components: a fine-grained fault diagnosis graph and a multi-level collaborative suspiciousness evaluation. The former integrally records enriched multimodal information from various levels of granularity (including methods, statements, and mutants) in a graph structure. The latter utilizes the interactions between FL tasks at various levels of granularity to extract existing and latent useful features from the multimodal information, improving FL precision. Empirical studies on the widely used Defects4J (v2.0.0) dataset show that GraMuS outperforms state-of-the-art baselines on both single-fault and multiple-fault programs, including one large language model, four learning-based FL techniques, three variable-based FL techniques, 36 spectrum-based FL techniques, and 36 mutation-based FL techniques. In particular, GraMuS localizes 26/29/31 more faulty statements than the state-of-the-art baselines ChatGPT-4/DepGraph/VarDT, respectively, in terms of the TOP-1 metric. Further investigation shows that the method-level FL task helps GraMuS localize 27 more faulty statements, a 50.94% improvement. Finally, we further evaluate GraMuS on 374 Python programs from ConDefects and find that GraMuS consistently outperforms state-of-the-art FL techniques, showing its generality.
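GraMuS itself is learning-based, but since it is compared against 36 spectrum-based FL techniques, a classic spectrum-based suspiciousness score such as Ochiai makes the statement-level setting concrete. This is background only, not part of GraMuS:

```python
import math

def ochiai(failed_cover, passed_cover, total_failed):
    # Ochiai suspiciousness for one statement:
    #   failed_cover: failing tests that execute the statement
    #   passed_cover: passing tests that execute the statement
    #   total_failed: failing tests in the whole suite
    if total_failed == 0 or (failed_cover + passed_cover) == 0:
        return 0.0
    return failed_cover / math.sqrt(total_failed * (failed_cover + passed_cover))

# A statement covered by 3 of 4 failing tests and 1 passing test
# scores 3 / sqrt(4 * 4) = 0.75.
print(ochiai(3, 1, 4))  # 0.75
```

Statements are then ranked by score, and a metric such as TOP-1 counts the faults whose buggy statement appears at rank one.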
Title: "GraMuS: Boosting statement-level fault localization via graph representation and multimodal information" (Journal of Systems and Software, Volume 233, Article 112700).
Pub Date: 2025-11-19 | DOI: 10.1016/j.jss.2025.112685
Juan de Lara , Alejandro del Pozzo , Esther Guerra , Jesús Sánchez Cuadrado
The advances in generative artificial intelligence, especially Large Language Models (LLMs), have prompted the proliferation of conversational agents (or chatbots). These can be general-purpose – like ChatGPT – or tailored to specific tasks – like buying tickets or obtaining customer support. Although chatbots play a significant role in today’s software ecosystem, they are hard to test: defining meaningful, thorough tests is time-consuming, and setting an oracle flexible to conversational variations is challenging. This is aggravated when testing LLM-based chatbots, as their conversation is natural but unpredictable.
To alleviate this problem, we present an end-to-end testing approach for conversational agents, comprising two components. First, a highly customisable user simulator that generates meaningful conversations with a chatbot under test, for the given goals (e.g., setting an appointment) and communication styles (e.g., long/short phrases, spelling mistakes). Second, a domain-specific language to specify and check correctness conditions (assertions and metamorphic relations) on the generated conversations. The conditions can assess functional correctness (e.g., booking more tickets costs more) and interaction styles (e.g., the chatbot responds in English and does not deviate from certain topics). This paper describes the approach, an implementation enabling chatbots’ testing independently of their technology, and an evaluation of its effectiveness in finding defects. We tested our tool on chatbots with artificially injected errors, and on third-party, real-world chatbots. Our tool detected between 81.25 % and 100 % of the injected errors, and identified actual functional issues in the real-world chatbots by applying manually defined correctness rules.
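A metamorphic relation like the quoted example (booking more tickets costs more) can be checked roughly as follows. The chatbot stub and function names are hypothetical, and the paper's domain-specific language expresses such conditions declaratively rather than in Python:

```python
# Hypothetical stub; a real harness would drive the chatbot under test
# through a conversation and parse the quoted price from its reply.
def chatbot_quote_price(n_tickets: int) -> float:
    return 12.5 * n_tickets

def check_monotonic_price(quote, ticket_counts):
    # Metamorphic relation: asking for more tickets must never cost less.
    prices = [quote(n) for n in ticket_counts]
    return all(a <= b for a, b in zip(prices, prices[1:]))

print(check_monotonic_price(chatbot_quote_price, [1, 2, 5]))  # True
```

The appeal of metamorphic relations here is that they sidestep the oracle problem: no exact expected reply is needed, only a relation between replies to related inputs.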
Title: "Automated end-to-end testing for conversational agents" (Journal of Systems and Software, Volume 233, Article 112685).
The integration of Semantic Web technologies into Mobile Edge Computing (MEC) platforms is enhancing the capabilities of real-time, context-aware applications across diverse domains. MEC brings processing closer to the network edge, reducing latency and improving data privacy, while Semantic Web technologies provide machine-interpretable knowledge representation and reasoning capabilities. Despite their potential, deploying semantic reasoners on edge devices is challenging due to their resource-intensive nature, which requires significant memory availability, computational power, and energy. Furthermore, correctness, performance, and energy consumption are simultaneously important, as MEC semantics-based applications often call for real-time queries for autonomous agent decisions or user-oriented decision support. This paper presents an extensive experimental evaluation of Web Ontology Language (OWL) reasoners deployed in MEC environments, assessing correctness, processing time, memory usage, and energy consumption on both a reference tablet and a single-board computer. For energy measurement, both software profiling and hardware monitoring have been exploited and compared. The study is supported by a modular, cross-platform benchmarking framework that automates data collection and ensures reproducibility. The findings highlight the trade-offs between reasoning capabilities and resource consumption, offering valuable insights for refining testing methodologies as well as optimizing semantic reasoners in MEC settings.
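The kind of per-task measurement such a benchmarking framework automates can be sketched for time and memory with the Python standard library. This is a generic sketch, not the paper's framework, which additionally captures energy via software profiling and hardware monitors:

```python
import time
import tracemalloc

def measure(task, *args):
    # Wall-clock time and peak Python-heap allocation for one task run.
    tracemalloc.start()
    t0 = time.perf_counter()
    result = task(*args)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak

# Stand-in workload; a reasoner benchmark would run a classification
# or query task here instead.
result, secs, peak_bytes = measure(sum, range(100_000))
print(result, secs >= 0, peak_bytes >= 0)
```

Repeating such runs across reasoners and devices, and pairing them with an energy channel, yields the correctness/time/memory/energy profiles the study compares.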
Pub Date: 2025-11-16 | DOI: 10.1016/j.jss.2025.112696
Ivano Bilenchi, Davide Loconte, Floriano Scioscia, Michele Ruta
Title: "Evaluating correctness, performance and energy footprint of semantic reasoners in mobile edge computing" (Journal of Systems and Software, Volume 233, Article 112696).
Pub Date: 2025-11-15 | DOI: 10.1016/j.jss.2025.112698
Xinjun Lai , Guitao Huang , Yirun Chen , Dejun Wang , Martin Lai , Ming Cai
Although recommendation algorithms have proven effective for e-commerce platforms over the last twenty years, adopting these methods is not a trivial task for vertical platforms. Compared to general platforms, vertical ones mostly share the following characteristics: (1) data volume is small but data types are varied, owing to the online ecosystem and community; (2) items might be implicitly associated, such as parts; (3) they are operated by SMEs (small and medium-sized enterprises), where IT and AI resources are limited. Targeting these features, we propose a knowledge-graph (KG)-based recommender for the Lego parts company we studied. First, a KG is developed to mine the various implicit associations of users and items, where information on users'/designers' online works, posts, interactions, and buying behaviours, as well as part-set relations, is modelled in the heterogeneous KG. Second, a modified RippleNet algorithm is proposed, in which users' interests are modelled as ripples in the KG. In addition, information on important neighbouring nodes is embedded to model the semi-social influence on a user in the KG. Third, the best timing to update the algorithm is studied by monitoring and predicting the topology of the KG, to achieve the best cost-performance trade-off in algorithm operation and maintenance. The recommender system has been implemented in the studied company, where offline and online evaluations suggest that our method is practical, efficient, and SME-friendly.
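The ripple idea borrowed from RippleNet (a user's interests propagate outward hop by hop along KG relations) can be sketched on a toy graph. The entities and relations below are invented for illustration and are not the company's actual KG:

```python
# Toy knowledge graph as head -> [(relation, tail)] triples.
kg = {
    "user_work_1": [("uses_part", "part_A"), ("uses_part", "part_B")],
    "part_A": [("belongs_to_set", "set_X")],
    "part_B": [("belongs_to_set", "set_X"), ("belongs_to_set", "set_Y")],
}

def ripple_sets(seeds, hops):
    # RippleNet-style propagation: each hop expands the user's interest
    # frontier one step further along KG relations, collecting the
    # triples crossed at that hop.
    frontier, ripples = set(seeds), []
    for _ in range(hops):
        triples = [(h, r, t) for h in frontier for r, t in kg.get(h, [])]
        ripples.append(triples)
        frontier = {t for _, _, t in triples}
    return ripples

ripples = ripple_sets({"user_work_1"}, hops=2)
print([len(r) for r in ripples])  # [2, 3]
```

In RippleNet proper, each hop's triples are attention-weighted by relevance to a candidate item and aggregated into the user embedding; this sketch shows only the propagation structure.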
{"title":"A knowledge graph enabled recommendation system for implicitly associated items: Application to vertical e-commerce of parts","authors":"Xinjun Lai , Guitao Huang , Yirun Chen , Dejun Wang , Martin Lai , Ming Cai","doi":"10.1016/j.jss.2025.112698","DOIUrl":"10.1016/j.jss.2025.112698","url":null,"abstract":"<div><div>Although recommendation algorithms have been proved efficient for the e-commerce platforms in the last twenty years, it is not a trivial task for the vertical platforms to adopt these methods. Compared to the general platforms, mostly, the vertical ones share the following characteristics: (1) data volume is small but data types are various, due to the online ecosystem and community; (2) items might be implicitly associated, such as parts; (3) operated by SME (small and medium-size enterprise), where the IT and AI resources are limited. Targeted at these features, we propose a knowledge-graph(KG)-based recommender for our studied Lego parts company. First, a KG is developed to mine the various implicit associations of users and items, where information on users’/designers’ online works, posts, interactions, buying behaviours etc., and part-set relations, are modelled in the heterogeneous KG. Second, a modified RippleNet algorithm is proposed, where users’ interests are modelled as ripples in the KG. In addition, information on important neighbouring nodes is embedded, to model the semi-social influence in the KG for a user. Third, the best timing to update the algorithm is studied by monitoring and predicting the topology of the KG, to achieve the best cost-performance in algorithm operation and maintenance. 
The recommender system is implemented in the studied company, where the offline and online evaluations suggest that our method is practical, efficient and SME-friendly.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"233 ","pages":"Article 112698"},"PeriodicalIF":4.1,"publicationDate":"2025-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145618386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
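The RippleNet-style propagation described in the abstract can be illustrated with a minimal sketch. This is not the paper's modified algorithm; it is a toy version of the original RippleNet idea under assumed data: a user's seed items start "ripples" that expand hop by hop over the KG, and each hop's (head, relation, tail) triples are attention-weighted against a candidate item's embedding. The KG, embeddings, and `ripple_score` function are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8

# Hypothetical toy KG: head entity -> list of (relation, tail) edges.
kg = {0: [(0, 1), (1, 2)], 1: [(0, 3)], 2: [(1, 4)], 3: [(0, 4)]}
entity_emb = rng.normal(size=(5, DIM))            # one vector per entity
relation_emb = rng.normal(size=(2, DIM, DIM))     # one matrix per relation

def ripple_score(seed_entities, candidate, hops=2):
    """Score a candidate item by propagating the user's seed interests."""
    v = entity_emb[candidate]
    user_repr = np.zeros(DIM)
    frontier = list(seed_entities)                # current ripple set
    for _ in range(hops):
        triples = [(h, r, t) for h in frontier for (r, t) in kg.get(h, [])]
        if not triples:
            break
        # Attention: similarity between the candidate and each
        # relation-transformed head entity.
        logits = np.array([v @ (relation_emb[r] @ entity_emb[h])
                           for h, r, _ in triples])
        weights = np.exp(logits - logits.max())
        weights /= weights.sum()
        # Hop representation: attention-weighted sum of tail embeddings.
        user_repr += sum(w * entity_emb[t]
                         for w, (_, _, t) in zip(weights, triples))
        frontier = [t for _, _, t in triples]     # next ripple
    # Sigmoid of user-candidate similarity as a click probability.
    return float(1 / (1 + np.exp(-(user_repr @ v))))

print(ripple_score(seed_entities=[0], candidate=4))
```

The paper's contribution additionally embeds important neighbouring nodes for semi-social influence and schedules model updates from KG topology; neither is modelled here.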
Pub Date : 2025-11-13DOI: 10.1016/j.jss.2025.112693
Tor Sporsem , Torgeir Dingsøyr , Klaas-Jan Stol
User stories have become the predominant method for managing requirements in software development, used by approximately half of all software developers. Despite this widespread adoption, there is limited theoretical understanding of how user stories are used in practice. Through a theoretical literature review of 14 industry studies, we develop five theoretical propositions: 1) user stories facilitate shared understanding between developers and users; 2) small user stories help developers cope with change; 3) clarifying the ‘why’ in user stories reinforces focus on user needs but adds complexity to the development process; 4) conversations triggered by user stories can hamper the sense of productivity; and 5) user stories as recorded in writing degrade over time. Using boundary object theory as an analytical lens, we explain how user stories facilitate knowledge transfer across syntactic, semantic, and pragmatic boundaries between developers and users. This theoretical lens offers new insights into why some user stories succeed while others fail to bridge boundaries between users and developers. The review highlights the sharp contrast between the widespread use of user stories among practitioners and the limited academic research on their practical application. We conclude by identifying opportunities for future research, particularly on how user stories can be used in the era of generative AI.
{"title":"User stories as boundary objects in agile requirements engineering: A theoretical literature review","authors":"Tor Sporsem , Torgeir Dingsøyr , Klaas-Jan Stol","doi":"10.1016/j.jss.2025.112693","DOIUrl":"10.1016/j.jss.2025.112693","url":null,"abstract":"<div><div>User stories have become the predominant method for managing requirements in software development, used by approximately half of all software developers. Despite this widespread adoption, there is limited theoretical understanding of how user stories are used in practice. Through a theoretical literature review of 14 industry studies, we develop five theoretical propositions: 1) user stories facilitate shared understanding between developers and users; 2) small user stories help developers cope with change; 3) clarifying the ‘why’ in user stories reinforces focus on user needs but adds complexity to the development process; 4) conversations triggered by user stories can hamper the sense of productivity; and 5) user stories as recorded in writing degrade over time. Using boundary object theory as an analytical lens, we explain how user stories facilitate knowledge transfer across syntactic, semantic, and pragmatic boundaries between developers and users. This theoretical lens offers new insights into why some user stories succeed while others fail to bridge boundaries between users and developers. The review highlights the sharp contrast between the widespread use of user stories among practitioners and the limited academic research on their practical application. 
We end with identifying opportunities for future research, particularly on how user stories can be used in the era of generative AI.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"233 ","pages":"Article 112693"},"PeriodicalIF":4.1,"publicationDate":"2025-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145618391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-11-10DOI: 10.1016/j.jss.2025.112690
Pedro Orvalho , Mikoláš Janota , Vasco Manquinho
The increasing demand for programming education has led to online course platforms such as MOOCs, which rely on introductory programming assignments (IPAs). A major challenge in these courses is providing personalized feedback at scale. This paper introduces MENTOR, a semantic automated program repair (APR) framework designed to fix faulty student programs. MENTOR validates repairs through execution on a test suite and returns either the repaired program or the highlighted faulty statements.
Unlike symbolic repair tools such as Clara and Verifix, which require correct implementations with identical control flow graphs (CFGs), MENTOR’s LLM-based approach enables flexible repairs without strict structural alignment. MENTOR clusters successful submissions regardless of their CFGs and employs a Graph Neural Network (GNN)-based variable alignment module for enhanced accuracy. Next, MENTOR’s fault localization module leverages MaxSAT techniques to pinpoint buggy code segments precisely. Finally, MENTOR’s program fixer integrates Formal Methods (FM) and Large Language Models (LLMs) through a Counterexample Guided Inductive Synthesis (CEGIS) loop, iteratively refining repairs. Experimental results show that MENTOR significantly improves repair success rates, achieving 64.4 %, far surpassing Verifix (6.3 %) and Clara (34.6 %). By merging formula-based fault localization and LLM-driven repair, MENTOR provides an innovative, scalable framework for programming education.
{"title":"MENTOR: Fixing introductory programming assignments with formula-based fault localization and LLM-driven program repair","authors":"Pedro Orvalho , Mikoláš Janota , Vasco Manquinho","doi":"10.1016/j.jss.2025.112690","DOIUrl":"10.1016/j.jss.2025.112690","url":null,"abstract":"<div><div>The increasing demand for programming education has led to online evaluations like MOOCs, which rely on introductory programming assignments (IPAs). A major challenge in these courses is providing personalized feedback at scale. This paper introduces <span>MENTOR</span>, a semantic automated program repair (APR) framework designed to fix faulty student programs. <span>MENTOR</span> validates repairs through execution on a test suite, and returns the repaired program or highlights faulty statements.</div><div>Unlike symbolic repair tools like <span>Clara</span> and <span>Verifix</span>, which require correct implementations with identical control flow graphs (CFGs), <span>MENTOR</span>’s <span>LLM</span>-based approach enables flexible repairs without strict structural alignment. <span>MENTOR</span> clusters successful submissions regardless of CFGs, and employs a Graph Neural Network (<span>GNN</span>)-based variable alignment module for enhanced accuracy. Next, <span>MENTOR</span>’s fault localization module leverages MaxSAT techniques to pinpoint buggy code segments precisely. Finally, <span>MENTOR</span>’s program fixer integrates Formal Methods (FM) and Large Language Models (<span>LLMs</span>) through a Counterexample Guided Inductive Synthesis (CEGIS) loop, iteratively refining repairs. Experimental results show that <span>MENTOR</span> significantly improves repair success rates, achieving 64.4 %, far surpassing Verifix (6.3 %) and Clara (34.6 %). 
By merging formula-based fault localization, and <span>LLM</span>-driven repair, <span>MENTOR</span> provides an innovative, scalable framework for programming education.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"234 ","pages":"Article 112690"},"PeriodicalIF":4.1,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145790882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
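The CEGIS loop the MENTOR abstract refers to can be sketched in miniature. This is a hypothetical simplification, not MENTOR's FM+LLM pipeline: a stand-in synthesizer (enumerating a tiny candidate space where MENTOR would query an LLM) proposes fixes for a buggy student function, and the test suite acts as verifier, feeding failing cases back as counterexamples until a candidate passes everything. All names (`buggy_abs`, `synthesize`, `cegis_repair`) are illustrative.

```python
def buggy_abs(x):
    # Student program: intended to compute |x|, but wrong on negatives.
    return x

# Verifier's test suite: (input, expected output) pairs.
test_suite = [(-3, 3), (0, 0), (5, 5)]

def verify(candidate):
    """Return a counterexample (input, expected) or None if all tests pass."""
    for inp, expected in test_suite:
        if candidate(inp) != expected:
            return (inp, expected)
    return None

def synthesize(counterexamples):
    """Stand-in for the LLM-based fixer: yield candidate repairs that are
    at least consistent with the counterexamples gathered so far."""
    candidates = [buggy_abs, lambda x: -x, lambda x: x if x >= 0 else -x]
    for c in candidates:
        if all(c(i) == e for i, e in counterexamples):
            yield c

def cegis_repair():
    counterexamples = []
    for candidate in synthesize(counterexamples):
        cex = verify(candidate)
        if cex is None:
            return candidate          # candidate passes the whole suite
        counterexamples.append(cex)   # refine the next proposal
    return None                       # search space exhausted

fixed = cegis_repair()
print(fixed(-7))  # → 7: the repaired program computes absolute value
```

MENTOR additionally narrows where to edit via MaxSAT-based fault localization before proposing repairs; this sketch omits that step and repairs the whole function.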