Pub Date: 2024-05-15; DOI: 10.1007/s10515-024-00443-y
Somayeh Kalhor, Mohammad Reza Keyvanpour, Afshin Salajegheh
Violations of semantic and structural software principles, such as low coupling, high cohesion, and understandability, are called anti-patterns and are one of the concerns of the software development process. They are caused by bad design or programming and must be detected and removed to improve the application's source code. Refactoring operators efficiently eliminate anti-patterns, but the anti-patterns must first be identified. Anti-pattern detection is therefore a critical issue in software engineering, and various approaches have been proposed for it. Review articles have been published to classify and compare these approaches, but no comprehensive study has compared anti-pattern detection methods at all software abstraction levels using evaluation parameters. In this article, the methods presented so far are classified and their advantages and disadvantages are highlighted. Finally, a complete comparison of each category by evaluation metrics is provided. Our proposed classification considers three aspects: levels of abstraction, degree of dependence on developers' skills, and techniques used. The evaluation metrics reported on this subject are then analyzed, and the qualitative values of these metrics for each category are presented. This information can help researchers compare, understand, and improve existing methods.
{"title":"A systematic review of refactoring opportunities by software antipattern detection","authors":"Somayeh Kalhor, Mohammad Reza Keyvanpour, Afshin Salajegheh","doi":"10.1007/s10515-024-00443-y","DOIUrl":"10.1007/s10515-024-00443-y","url":null,"abstract":"<div><p>The violation of the semantic and structural software principles, such as low connection, high coherence, high understanding, and others, are called anti-patterns, which is one of the concerns of the software development process. They are caused due to bad design or programming that must be detected and removed to improve the application’s source code. Refactoring operators efficiently eliminate antipatterns, but they must first be identified. Therefore, antipattern detection is a critical issue in software engineering, and to do this, various approaches have been proposed. So far, review articles have been published to classify and compare these approaches. However, a comprehensive study using evaluation parameters has not compared different anti-pattern detection methods at all software abstraction levels. In this article, all the methods presented so far are classified, then their advantages and disadvantages are highlighted. Finally, a complete comparison of each category by evaluation metrics is provided. Our proposed classification considers three aspects, levels of abstraction, degree of dependence on developers’ skills, and techniques used. Then, the evaluation metrics reported on this subject are analyzed, and the qualitative values of these metrics for each category are presented. This information can help researchers compare and understand existing methods and improve them.\u0000</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-05-13; DOI: 10.1007/s10515-024-00441-0
Evgenii V. Stepanov, Alexey A. Mitsyuk
Modern runtime environments, standard libraries, and other frameworks provide software engineers with many forms of diagnostics. One such form is the logging of low-level events that characterize internal processes during program execution, such as garbage collection, assembly loading, and just-in-time compilation. Low-level program execution event logs contain a large number of events and event classes, which makes it impossible to discover meaningful process models directly from the event log, so extracting high-level activities is a necessary step for further processing of such logs. In this paper, execution logs of .NET applications are considered, and an approach based on an unsupervised technique is extended with a domain-driven hierarchy built from knowledge of the structure of the logged events. The proposed approach allows events to be treated at different levels of abstraction, thus extending the number of patterns and activities found with the unsupervised technique. Experiments with execution event logs of real-life .NET programs demonstrate the proposed approach's capability.
{"title":"Extracting high-level activities from low-level program execution logs","authors":"Evgenii V. Stepanov, Alexey A. Mitsyuk","doi":"10.1007/s10515-024-00441-0","DOIUrl":"10.1007/s10515-024-00441-0","url":null,"abstract":"<div><p>Modern runtime environments, standard libraries, and other frameworks provide many ways of diagnostics for software engineers. One form of such diagnostics is logging low-level events which characterize internal processes during program execution like garbage collection, assembly loading, just-in-time compilation, etc. Low-level program execution event logs contain a large number of events and event classes, which makes it impossible to discover meaningful process models straight from the event log, so extraction of high-level activities is a necessary step for further processing of such logs. In this paper, .NET applications execution logs are considered and an approach based on an unsupervised technique is extended with the domain-driven hierarchy built with the knowledge of a structure of logged events. The proposed approach allows treating events on different levels of abstraction, thus extending the number of patterns and activities found with the unsupervised technique. Experiments with real-life .NET programs execution event logs are conducted to demonstrate the proposed approach’s capability.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-05-08; DOI: 10.1007/s10515-024-00440-1
Cho Xuan Do, Nguyen Trong Luu, Phuong Thi Lan Nguyen
Detecting vulnerabilities in source code written in C and C++ is currently essential, as attack techniques against systems seek to find and exploit such vulnerabilities. In this article, to improve the effectiveness of the source code vulnerability detection process, we propose a new approach based on building and representing source code features using natural language processing (NLP) techniques. Our proposal consists of two main stages: (i) building a feature profile of the source code using the RoBERTa model, and (ii) classifying source code based on the feature profile using a supervised machine learning algorithm. Specifically, using the pre-trained RoBERTa model, we successfully build and represent important features of source code as complete vectors, thereby enhancing the effectiveness of prediction and vulnerability detection models. The experimental part of the article compares and evaluates our proposal against other approaches on the FFmpeg + Qemu dataset. The experimental results show that the approach in this study is superior to other research directions on all measures. The proposal to use NLP techniques based on the RoBERTa model therefore has both scientific significance, as a new research direction that has not previously been applied, and practical significance, given that all experimental results are highly effective.
{"title":"Optimizing software vulnerability detection using RoBERTa and machine learning","authors":"Cho Xuan Do, Nguyen Trong Luu, Phuong Thi Lan Nguyen","doi":"10.1007/s10515-024-00440-1","DOIUrl":"10.1007/s10515-024-00440-1","url":null,"abstract":"<div><p>Detecting vulnerabilities in source code written in C and C + + is currently essential as attack techniques against systems seek to find, exploit, and attack these vulnerabilities. In this article, to improve the effectiveness of the source code vulnerability detection process, we propose a new approach based on building and representing source code features using natural language processing (NLP) techniques. Our proposal in the article consists of two main stages: (i) building a feature profile of the source code using the RoBERTa model, and (ii) classifying source code based on the feature profile using a supervised machine learning algorithm. Specifically, with our proposal utilizing the pre-trained RoBERTa model, we have successfully built and represented important features of source code as complete vectors, thereby enhancing the effectiveness of prediction and vulnerability detection models. The experimental part of our article compared and evaluated our proposal with other approaches on the FFmpeg + Qume dataset. The experimental results in the article showed that the approach in this study was superior to other research directions on all measures. Therefore, the proposal to use NLP techniques based on the RoBERTa model not only has scientific significance as a new research direction that has not been proposed for application but also has practical significance when all experimental results are highly effective.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-24; DOI: 10.1007/s10515-024-00437-w
Youcong Ni, Xin Du, Yuan Yuan, Ruliang Xiao, Gaolin Chen
The open-source compiler GCC offers numerous options to improve execution time. Two categories of approaches, machine learning-based and design space exploration, have emerged for selecting the optimal set of options. However, they still face challenges in quickly obtaining high-quality solutions due to the large and discrete optimization space, the time-consuming utility evaluation of selected options, and the complex interactions among options. To address these challenges, we propose TSOA, a Two-Stage Optimization Approach for GCC compilation options to minimize execution time. In the first stage, we present OPPM, an Option Preselection algorithm based on Pattern Mining. OPPM generates diverse samples to cover a wide range of option interactions and then mines frequent options from both objective-improved and non-improved samples. The mining results are further validated using CRC codes to precisely preselect options and reduce the optimization space. In the second stage, we present OSEA, an Option Selection Evolutionary optimization Algorithm grounded in solution preselection and an option interaction graph. The solution preselection employs a random forest classifier to efficiently identify promising solutions for the next-generation population, thereby reducing the time spent on utility evaluation. Simultaneously, the option interaction graph is built from evaluated solutions to capture option interplays and their influence on the objectives, and high-quality solutions are then generated based on this graph. We evaluate the performance of TSOA against representative machine learning-based and design space exploration approaches on a diverse set of 20 problem instances from two benchmark platforms. Additionally, we validate the effectiveness of OPPM and conduct related ablation experiments. The experimental results show that TSOA significantly outperforms state-of-the-art approaches in both optimization time and solution quality. Moreover, OPPM outperforms other option preselection algorithms, and the effectiveness of random forest-assisted solution preselection and of new solution generation based on the option interaction graph is verified.
{"title":"Tsoa: a two-stage optimization approach for GCC compilation options to minimize execution time","authors":"Youcong Ni, Xin Du, Yuan Yuan, Ruliang Xiao, Gaolin Chen","doi":"10.1007/s10515-024-00437-w","DOIUrl":"10.1007/s10515-024-00437-w","url":null,"abstract":"<div><p>The open-source compiler GCC offers numerous options to improve execution time. Two categories of approaches, machine learning-based and design space exploration, have emerged for selecting the optimal set of options. However, they continue to face challenge in quickly obtaining high-quality solutions due to the large and discrete optimization space, time-consuming utility evaluation for selected options, and complex interactions among options. To address these challenges, we propose TSOA, a Two-Stage Optimization Approach for GCC compilation options to minimize execution time. In the first stage, we present OPPM, an Option Preselection algorithm based on Pattern Mining. OPPM generates diverse samples to cover a wide range of option interactions. It subsequently mines frequent options from both objective-improved and non-improved samples. The mining results are further validated using CRC codes to precisely preselect options and reduce the optimization space. Transitioning to the second stage, we present OSEA, an Option Selection Evolutionary optimization Algorithm. OSEA is grounded in solution preselection and an option interaction graph. The solution preselection employs a random forest to build a classifier, efficiently identifying promising solutions for the next-generation population and thereby reducing the time spent on utility evaluation. Simultaneously, the option interaction graph is built to capture option interplays and their influence on objectives from evaluated solutions. Then, high-quality solutions are generated based on the option interaction graph. We evaluate the performance of TSOA by comparing it with representative machine learning-based and design space exploration approaches across a diverse set of 20 problem instances from two benchmark platforms. Additionally, we validate the effectiveness of OPPM and conduct related ablation experiments. The experimental results show that TSOA outperforms state-of-the-art approaches significantly in both optimization time and solution quality. Moreover, OPPM outperforms other option preselection algorithms, while the effectiveness of random forest-assisted solution preselection, along with new solution generation based on the option interaction graph, has been verified.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140659966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-20; DOI: 10.1007/s10515-024-00438-9
Zilong Ren, Xiaolin Ju, Xiang Chen, Hao Shen
Software vulnerability detection is a critical step in ensuring system security and data protection. Recent research has demonstrated the effectiveness of deep learning in automated vulnerability detection. However, it is difficult for deep learning models to understand the semantics and domain-specific knowledge of source code. In this study, we introduce a new vulnerability detection framework, ProRLearn, which leverages two main techniques: prompt tuning and reinforcement learning. Since existing fine-tuning of pre-trained language models (PLMs) struggles to fully leverage domain knowledge, we introduce a new automatic prompt-tuning technique. Precisely, prompt tuning mimics the pre-training process of PLMs by rephrasing the task input and adding prompts, using the PLM's output as the prediction output. The reinforcement learning reward mechanism guides the behavior of vulnerability detection through a reward and punishment model, encouraging the model to learn effective strategies that maximize long-term rewards and minimize penalties in specific environments, thus enhancing performance. Experiments on three datasets (FFMPeg+Qemu, Reveal, and Big-Vul) indicate that ProRLearn achieves a performance improvement of 3.27–70.96% over state-of-the-art baselines in terms of F1 score. The combination of prompt tuning and reinforcement learning offers a potential opportunity to improve performance in vulnerability detection, meaning it can respond effectively to constantly changing network environments and new threats. This interdisciplinary approach contributes to a better understanding of the interplay between natural language processing and reinforcement learning, opening up new opportunities and challenges for future research and applications.
{"title":"ProRLearn: boosting prompt tuning-based vulnerability detection by reinforcement learning","authors":"Zilong Ren, Xiaolin Ju, Xiang Chen, Hao Shen","doi":"10.1007/s10515-024-00438-9","DOIUrl":"10.1007/s10515-024-00438-9","url":null,"abstract":"<div><p>Software vulnerability detection is a critical step in ensuring system security and data protection. Recent research has demonstrated the effectiveness of deep learning in automated vulnerability detection. However, it is difficult for deep learning models to understand the semantics and domain-specific knowledge of source code. In this study, we introduce a new vulnerability detection framework, ProRLearn, which leverages two main techniques: prompt tuning and reinforcement learning. Since existing fine-tuning of pre-trained language models (PLMs) struggles to leverage domain knowledge fully, we introduce a new automatic prompt-tuning technique. Precisely, prompt tuning mimics the pre-training process of PLMs by rephrasing task input and adding prompts, using the PLM’s output as the prediction output. The introduction of the reinforcement learning reward mechanism aims to guide the behavior of vulnerability detection through a reward and punishment model, enabling it to learn effective strategies for obtaining maximum long-term rewards in specific environments. The introduction of reinforcement learning aims to encourage the model to learn how to maximize rewards or minimize penalties, thus enhancing performance. Experiments on three datasets (FFMPeg+Qemu, Reveal, and Big-Vul) indicate that ProRLearn achieves performance improvement of 3.27–70.96% over state-of-the-art baselines in terms of F1 score. The combination of prompt tuning and reinforcement learning can offer a potential opportunity to improve performance in vulnerability detection. This means that it can effectively improve the performance in responding to constantly changing network environments and new threats. This interdisciplinary approach contributes to a better understanding of the interplay between natural language processing and reinforcement learning, opening up new opportunities and challenges for future research and applications.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140625680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-16; DOI: 10.1007/s10515-024-00428-x
Shayan Hashemi, Mika Mäntylä
With the growth of online services, IoT devices, and DevOps-oriented software development, software log anomaly detection is becoming increasingly important. Prior works mainly follow a traditional four-stage architecture (Preprocessor, Parser, Vectorizer, and Classifier). This paper proposes OneLog, which utilizes a single deep neural network instead of multiple separate components. OneLog harnesses a character-level convolutional neural network (CNN) to take digits, numbers, and punctuation, which were removed in prior works, into account alongside the main natural language text. We evaluate our approach on six message- and sequence-based datasets: HDFS, Hadoop, BGL, Thunderbird, Spirit, and Liberty. We experiment with OneLog in single-, multi-, and cross-project setups. OneLog offers state-of-the-art performance on our datasets. OneLog can utilize multi-project datasets simultaneously during training, which suggests our model can generalize between datasets. Multi-project training also improves OneLog's performance, making it ideal when limited training data is available for an individual project. We also found that cross-project anomaly detection is possible with a single project pair (Liberty and Spirit). Analysis of model internals shows that OneLog has multiple modes of detecting anomalies and that the model learns manually validated parsing rules for the log messages. We conclude that character-based CNNs are a promising approach toward end-to-end learning in log anomaly detection. They offer good performance and generalization over multiple datasets. We will make our scripts publicly available upon the acceptance of this paper.
{"title":"OneLog: towards end-to-end software log anomaly detection","authors":"Shayan Hashemi, Mika Mäntylä","doi":"10.1007/s10515-024-00428-x","DOIUrl":"10.1007/s10515-024-00428-x","url":null,"abstract":"<div><p>With the growth of online services, IoT devices, and DevOps-oriented software development, software log anomaly detection is becoming increasingly important. Prior works mainly follow a traditional four-staged architecture (Preprocessor, Parser, Vectorizer, and Classifier). This paper proposes OneLog, which utilizes a single deep neural network instead of multiple separate components. OneLog harnesses convolutional neural network (CNN) at the character level to take digits, numbers, and punctuations, which were removed in prior works, into account alongside the main natural language text. We evaluate our approach in six message- and sequence-based data sets: HDFS, Hadoop, BGL, Thunderbird, Spirit, and Liberty. We experiment with Onelog with single-, multi-, and cross-project setups. Onelog offers state-of-the-art performance in our datasets. Onelog can utilize multi-project datasets simultaneously during training, which suggests our model can generalize between datasets. Multi-project training also improves Onelog performance making it ideal when limited training data is available for an individual project. We also found that cross-project anomaly detection is possible with a single project pair (Liberty and Spirit). Analysis of model internals shows that one log has multiple modes of detecting anomalies and that the model learns manually validated parsing rules for the log messages. We conclude that character-based CNNs are a promising approach toward end-to-end learning in log anomaly detection. They offer good performance and generalization over multiple datasets. We will make our scripts publicly available upon the acceptance of this paper.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10515-024-00428-x.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140614302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-12; DOI: 10.1007/s10515-024-00436-x
Aritra Sarkar
As bigger quantum processors with hundreds of qubits become increasingly available, the potential for quantum computing to solve problems intractable for classical computers is becoming more tangible. Designing efficient quantum algorithms and software in tandem is key to achieving quantum advantage. Quantum software engineering is challenging due to the unique counterintuitive nature of quantum logic. Moreover, with larger quantum systems, traditional programming using quantum assembly language and qubit-level reasoning is becoming infeasible. Automated Quantum Software Engineering (AQSE) can help to reduce the barrier to entry, speed up development, reduce errors, and improve the efficiency of quantum software. This article elucidates the motivation to research AQSE (why), a precise description of such a framework (what), and reflections on components that are required for implementing it (how).
{"title":"Automated quantum software engineering","authors":"Aritra Sarkar","doi":"10.1007/s10515-024-00436-x","DOIUrl":"10.1007/s10515-024-00436-x","url":null,"abstract":"<div><p>As bigger quantum processors with hundreds of qubits become increasingly available, the potential for quantum computing to solve problems intractable for classical computers is becoming more tangible. Designing efficient quantum algorithms and software in tandem is key to achieving quantum advantage. Quantum software engineering is challenging due to the unique counterintuitive nature of quantum logic. Moreover, with larger quantum systems, traditional programming using quantum assembly language and qubit-level reasoning is becoming infeasible. Automated Quantum Software Engineering (AQSE) can help to reduce the barrier to entry, speed up development, reduce errors, and improve the efficiency of quantum software. This article elucidates the motivation to research AQSE (why), a precise description of such a framework (what), and reflections on components that are required for implementing it (how).</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10515-024-00436-x.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140598066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-10; DOI: 10.1007/s10515-024-00432-1
Andreea Galbin-Nasui, Andreea Vescan
Bug tracking systems receive a large number of bug reports on a daily basis. Maintaining the integrity of the software and producing high-quality software is challenging, and the bug-sorting process is usually a manual task that is time-consuming and prone to human error. The purpose of this research is twofold: first, to conduct a literature review of bug report priority classification approaches, and second, to replicate existing approaches with various classifiers to extract new insights about priority classification. We used a Systematic Literature Review methodology to identify the most relevant existing approaches to the bug report priority classification problem. Furthermore, we conducted a replication study with three classifiers: Naive Bayes (NB), Support Vector Machines (SVM), and Convolutional Neural Network (CNN). Two sets of experiments were performed: first, our own NLTK-based implementation of NB and CNN, and second, experiments based on the Weka implementations of NB, SVM, and CNN. The dataset consists of several Eclipse projects and one project related to database systems. The CNN classifier obtains the best results for bug priority P3, and overall the quality relation between the three classifiers is preserved as in the original studies. The replication study confirmed the findings of the original studies and emphasizes the need to further investigate the relationship between the characteristics of the projects used for training and those used for testing.
{"title":"Bug reports priority classification models. Replication study","authors":"Andreea Galbin-Nasui, Andreea Vescan","doi":"10.1007/s10515-024-00432-1","DOIUrl":"10.1007/s10515-024-00432-1","url":null,"abstract":"<div><p>Bug tracking systems receive a large number of bugs on a daily basis. The process of maintaining the integrity of the software and producing high-quality software is challenging. The bug-sorting process is usually a manual task that can lead to human errors and be time-consuming. The purpose of this research is twofold: first, to conduct a literature review on the bug report priority classification approaches, and second, to replicate existing approaches with various classifiers to extract new insights about the priority classification approaches. We used a Systematic Literature Review methodology to identify the most relevant existing approaches related to the bug report priority classification problem. Furthermore, we conducted a replication study on three classifiers: Naive Bayes (NB), Support Vector Machines (SVM), and Convolutional Neural Network (CNN). Two sets of experiments are performed: first, our own NLTK implementation based on NB and CNN, and second, based on Weka implementation for NB, SVM, and CNN. The dataset used consists of several Eclipse projects and one project related to database systems. The obtained results are better for the bug priority P3 for the CNN classifier, and overall the quality relation between the three classifiers is preserved as in the original studies. The replication study confirmed the findings of the original studies, emphasizing the need to further investigate the relationship between the characteristics of the projects used as training and those used as testing.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140598070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-04; DOI: 10.1007/s10515-024-00434-z
Arif Ali Khan, Muhammad Azeem Akbar, Valtteri Lahtinen, Marko Paavola, Mahmood Niazi, Mohammed Naif Alatawi, Shoayee Dlaim Alotaibi
Quantum software systems represent a new realm in software engineering, utilizing quantum bits (qubits) and quantum gates (Qgates) to solve complex problems more efficiently than their classical counterparts. Agile software development approaches are considered a way to address many inherent challenges in quantum software development, but their effective integration remains unexplored. This study investigates the key causes of challenges that could hinder the adoption of traditional agile approaches in quantum software projects and develops an Agile-Quantum Software Project Success Prediction Model (AQSSPM). First, we identified 19 causes of challenging factors, discussed in our previous study, that potentially impact agile-quantum project success. Second, a survey was conducted to collect expert opinions on these causes, and a Genetic Algorithm (GA) was applied with a Naive Bayes Classifier (NBC) and Logistic Regression (LR) to develop the AQSSPM. Using GA with NBC, the project success probability improved from 53.17 to 99.68%, with costs reduced from 0.463 to 0.403%. Similarly, GA with LR increased success rates from 55.52 to 98.99%, and costs decreased from 0.496 to 0.409% after 100 iterations. The results of both methods showed a strong positive correlation (rs = 0.955) in the ranking of causes, with no significant difference between them (t = 1.195, p = 0.240 > 0.05). The AQSSPM highlights critical focus areas for implementing agile-quantum projects efficiently and successfully while considering the cost factor of a particular project.
{"title":"Agile meets quantum: a novel genetic algorithm model for predicting the success of quantum software development project","authors":"Arif Ali Khan, Muhammad Azeem Akbar, Valtteri Lahtinen, Marko Paavola, Mahmood Niazi, Mohammed Naif Alatawi, Shoayee Dlaim Alotaibi","doi":"10.1007/s10515-024-00434-z","DOIUrl":"10.1007/s10515-024-00434-z","url":null,"abstract":"<div><p>Quantum software systems represent a new realm in software engineering, utilizing quantum bits (Qubits) and quantum gates (Qgates) to solve the complex problems more efficiently than classical counterparts. Agile software development approaches are considered to address many inherent challenges in quantum software development, but their effective integration remains unexplored. This study investigates key causes of challenges that could hinders the adoption of traditional agile approaches in quantum software projects and develop an Agile-Quantum Software Project Success Prediction Model (AQSSPM). Firstly, we identified 19 causes of challenging factors discussed in our previous study, which are potentially impacting agile-quantum project success. Secondly, a survey was conducted to collect expert opinions on these causes and applied Genetic Algorithm (GA) with Naive Bayes Classifier (NBC) and Logistic Regression (LR) to develop the AQSSPM. Utilizing GA with NBC, project success probability improved from 53.17 to 99.68%, with cost reductions from 0.463 to 0.403%. Similarly, GA with LR increased success rates from 55.52 to 98.99%, and costs decreased from 0.496 to 0.409% after 100 iterations. Both methods result showed a strong positive correlation (rs = 0.955) in causes ranking, with no significant difference between them (<i>t</i> = 1.195, <i>p</i> = 0.240 > 0.05). The AQSSPM highlights critical focus areas for efficiently and successfully implementing agile-quantum projects considering the cost factor of a particular project.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10515-024-00434-z.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140598071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-04; DOI: 10.1007/s10515-024-00435-y
Jian Shen, Zhong Li, Yifei Lu, Minxue Pan, Xuandong Li
Deep predictive models have been widely employed in software engineering (SE) tasks due to their remarkable success in artificial intelligence (AI). Most of these models are trained in a supervised manner, and their performance heavily relies on the quality of the training data. Unfortunately, mislabeling, or label noise, is a common issue in SE datasets, which can significantly affect the validity of models trained on them. Although learning-with-noise approaches based on deep learning (DL) have been proposed to address mislabeling in AI datasets, the distinct characteristics of SE datasets in terms of size and data quality raise questions about the effectiveness of these approaches in the SE context. In this paper, we conduct a comprehensive study to understand how mislabeled samples arise in SE datasets, how they impact deep predictive models, and how well existing learning-with-noise approaches perform on SE datasets. Through an empirical evaluation on two representative datasets for the Bug Report Classification and Software Defect Prediction tasks, our study reveals that learning-with-noise approaches have the potential to handle mislabeled samples in SE tasks, but their effectiveness is not always consistent. Our research shows that it is crucial to address mislabeled samples in SE tasks and that effective solutions must take the specific properties of the dataset into account. We also highlight the importance of addressing the potential class distribution changes caused by mislabeled samples and present the limitations of existing approaches for handling them. We therefore urge the development of more advanced techniques to improve the effectiveness and reliability of deep predictive models in SE tasks.
{"title":"Mitigating the impact of mislabeled data on deep predictive models: an empirical study of learning with noise approaches in software engineering tasks","authors":"Jian Shen, Zhong Li, Yifei Lu, Minxue Pan, Xuandong Li","doi":"10.1007/s10515-024-00435-y","DOIUrl":"10.1007/s10515-024-00435-y","url":null,"abstract":"<div><p>Deep predictive models have been widely employed in software engineering (SE) tasks due to their remarkable success in artificial intelligence (AI). Most of these models are trained in a supervised manner, and their performance heavily relies on the quality of training data. Unfortunately, mislabeling or label noise is a common issue in SE datasets, which can significantly affect the validity of models trained on such datasets. Although learning with noise approaches based on deep learning (DL) have been proposed to address the issue of mislabeling in AI datasets, the distinct characteristics of SE datasets in terms of size and data quality raise questions about the effectiveness of these approaches within the SE context. In this paper, we conduct a comprehensive study to understand how mislabeled samples exist in SE datasets, how they impact deep predictive models, and how well existing learning with noise approaches perform on SE datasets. Through an empirical evaluation on two representative datasets for the Bug Report Classification and Software Defect Prediction tasks, our study reveals that learning with noise approaches have the potential to handle mislabeled samples in SE tasks, but their effectiveness is not always consistent. Our research shows that it is crucial to address mislabeled samples in SE tasks. To achieve this, it is essential to take into account the specific properties of the dataset to develop effective solutions. We also highlight the importance of addressing potential class distribution changes caused by mislabeled samples and present the limitations of existing approaches for addressing mislabeled samples. Therefore, we urge the development of more advanced techniques to improve the effectiveness and reliability of deep predictive models in SE tasks.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140598068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}