Pub Date: 2024-05-15; DOI: 10.1007/s10515-024-00443-y
Somayeh Kalhor, Mohammad Reza Keyvanpour, Afshin Salajegheh
Violations of semantic and structural software principles, such as low coupling, high cohesion, and understandability, are called anti-patterns and are one of the concerns of the software development process. They are caused by bad design or programming and must be detected and removed to improve the application's source code. Refactoring operators efficiently eliminate anti-patterns, but the anti-patterns must first be identified. Anti-pattern detection is therefore a critical issue in software engineering, and various approaches have been proposed for it. Review articles have been published to classify and compare these approaches, but no comprehensive study has compared anti-pattern detection methods at all software abstraction levels using evaluation parameters. In this article, the methods presented so far are classified and their advantages and disadvantages are highlighted. Finally, a complete comparison of each category by evaluation metrics is provided. Our proposed classification considers three aspects: levels of abstraction, degree of dependence on developers' skills, and techniques used. The evaluation metrics reported on this subject are then analyzed, and the qualitative values of these metrics for each category are presented. This information can help researchers compare, understand, and improve existing methods.
{"title":"A systematic review of refactoring opportunities by software antipattern detection","authors":"Somayeh Kalhor, Mohammad Reza Keyvanpour, Afshin Salajegheh","doi":"10.1007/s10515-024-00443-y","DOIUrl":"10.1007/s10515-024-00443-y","url":null,"abstract":"<div><p>The violation of the semantic and structural software principles, such as low connection, high coherence, high understanding, and others, are called anti-patterns, which is one of the concerns of the software development process. They are caused due to bad design or programming that must be detected and removed to improve the application’s source code. Refactoring operators efficiently eliminate antipatterns, but they must first be identified. Therefore, antipattern detection is a critical issue in software engineering, and to do this, various approaches have been proposed. So far, review articles have been published to classify and compare these approaches. However, a comprehensive study using evaluation parameters has not compared different anti-pattern detection methods at all software abstraction levels. In this article, all the methods presented so far are classified, then their advantages and disadvantages are highlighted. Finally, a complete comparison of each category by evaluation metrics is provided. Our proposed classification considers three aspects, levels of abstraction, degree of dependence on developers’ skills, and techniques used. Then, the evaluation metrics reported on this subject are analyzed, and the qualitative values of these metrics for each category are presented. This information can help researchers compare and understand existing methods and improve them.\u0000</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-05-13; DOI: 10.1007/s10515-024-00441-0
Evgenii V. Stepanov, Alexey A. Mitsyuk
Modern runtime environments, standard libraries, and other frameworks provide software engineers with many forms of diagnostics. One such form is the logging of low-level events that characterize internal processes during program execution, such as garbage collection, assembly loading, and just-in-time compilation. Low-level program execution event logs contain a large number of events and event classes, which makes it impossible to discover meaningful process models directly from the event log, so extracting high-level activities is a necessary step for further processing of such logs. In this paper, execution logs of .NET applications are considered, and an approach based on an unsupervised technique is extended with a domain-driven hierarchy built from knowledge of the structure of the logged events. The proposed approach allows events to be treated at different levels of abstraction, thus extending the number of patterns and activities found with the unsupervised technique. Experiments with execution event logs of real-life .NET programs demonstrate the proposed approach's capability.
{"title":"Extracting high-level activities from low-level program execution logs","authors":"Evgenii V. Stepanov, Alexey A. Mitsyuk","doi":"10.1007/s10515-024-00441-0","DOIUrl":"10.1007/s10515-024-00441-0","url":null,"abstract":"<div><p>Modern runtime environments, standard libraries, and other frameworks provide many ways of diagnostics for software engineers. One form of such diagnostics is logging low-level events which characterize internal processes during program execution like garbage collection, assembly loading, just-in-time compilation, etc. Low-level program execution event logs contain a large number of events and event classes, which makes it impossible to discover meaningful process models straight from the event log, so extraction of high-level activities is a necessary step for further processing of such logs. In this paper, .NET applications execution logs are considered and an approach based on an unsupervised technique is extended with the domain-driven hierarchy built with the knowledge of a structure of logged events. The proposed approach allows treating events on different levels of abstraction, thus extending the number of patterns and activities found with the unsupervised technique. Experiments with real-life .NET programs execution event logs are conducted to demonstrate the proposed approach’s capability.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-05-08; DOI: 10.1007/s10515-024-00440-1
Cho Xuan Do, Nguyen Trong Luu, Phuong Thi Lan Nguyen
Detecting vulnerabilities in source code written in C and C++ is currently essential, as attack techniques against systems seek to find and exploit such vulnerabilities. In this article, to improve the effectiveness of the source code vulnerability detection process, we propose a new approach based on building and representing source code features using natural language processing (NLP) techniques. Our proposal consists of two main stages: (i) building a feature profile of the source code using the RoBERTa model, and (ii) classifying source code based on the feature profile using a supervised machine learning algorithm. Specifically, using the pre-trained RoBERTa model, we successfully build and represent important features of source code as complete vectors, thereby enhancing the effectiveness of prediction and vulnerability detection models. The experimental part of the article compares and evaluates our proposal against other approaches on the FFmpeg + Qemu dataset. The experimental results show that the approach in this study is superior to other research directions on all measures. The proposal to use NLP techniques based on the RoBERTa model therefore has both scientific significance, as a new research direction that has not previously been applied, and practical significance, given that all experimental results are highly effective.
{"title":"Optimizing software vulnerability detection using RoBERTa and machine learning","authors":"Cho Xuan Do, Nguyen Trong Luu, Phuong Thi Lan Nguyen","doi":"10.1007/s10515-024-00440-1","DOIUrl":"10.1007/s10515-024-00440-1","url":null,"abstract":"<div><p>Detecting vulnerabilities in source code written in C and C + + is currently essential as attack techniques against systems seek to find, exploit, and attack these vulnerabilities. In this article, to improve the effectiveness of the source code vulnerability detection process, we propose a new approach based on building and representing source code features using natural language processing (NLP) techniques. Our proposal in the article consists of two main stages: (i) building a feature profile of the source code using the RoBERTa model, and (ii) classifying source code based on the feature profile using a supervised machine learning algorithm. Specifically, with our proposal utilizing the pre-trained RoBERTa model, we have successfully built and represented important features of source code as complete vectors, thereby enhancing the effectiveness of prediction and vulnerability detection models. The experimental part of our article compared and evaluated our proposal with other approaches on the FFmpeg + Qume dataset. The experimental results in the article showed that the approach in this study was superior to other research directions on all measures. Therefore, the proposal to use NLP techniques based on the RoBERTa model not only has scientific significance as a new research direction that has not been proposed for application but also has practical significance when all experimental results are highly effective.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-24; DOI: 10.1007/s10515-024-00437-w
Youcong Ni, Xin Du, Yuan Yuan, Ruliang Xiao, Gaolin Chen
The open-source compiler GCC offers numerous options to improve execution time. Two categories of approaches, machine learning-based and design space exploration, have emerged for selecting the optimal set of options. However, they still face challenges in quickly obtaining high-quality solutions due to the large and discrete optimization space, the time-consuming utility evaluation of selected options, and the complex interactions among options. To address these challenges, we propose TSOA, a Two-Stage Optimization Approach for GCC compilation options to minimize execution time. In the first stage, we present OPPM, an Option Preselection algorithm based on Pattern Mining. OPPM generates diverse samples to cover a wide range of option interactions and then mines frequent options from both objective-improved and non-improved samples. The mining results are further validated using CRC codes to precisely preselect options and reduce the optimization space. In the second stage, we present OSEA, an Option Selection Evolutionary optimization Algorithm grounded in solution preselection and an option interaction graph. The solution preselection employs a random forest classifier to efficiently identify promising solutions for the next-generation population, thereby reducing the time spent on utility evaluation. Simultaneously, the option interaction graph is built from evaluated solutions to capture option interplays and their influence on the objectives, and high-quality solutions are then generated based on this graph. We evaluate the performance of TSOA against representative machine learning-based and design space exploration approaches on a diverse set of 20 problem instances from two benchmark platforms. Additionally, we validate the effectiveness of OPPM and conduct related ablation experiments. The experimental results show that TSOA significantly outperforms state-of-the-art approaches in both optimization time and solution quality. Moreover, OPPM outperforms other option preselection algorithms, and the effectiveness of random forest-assisted solution preselection and of new solution generation based on the option interaction graph is verified.
{"title":"Tsoa: a two-stage optimization approach for GCC compilation options to minimize execution time","authors":"Youcong Ni, Xin Du, Yuan Yuan, Ruliang Xiao, Gaolin Chen","doi":"10.1007/s10515-024-00437-w","DOIUrl":"10.1007/s10515-024-00437-w","url":null,"abstract":"<div><p>The open-source compiler GCC offers numerous options to improve execution time. Two categories of approaches, machine learning-based and design space exploration, have emerged for selecting the optimal set of options. However, they continue to face challenge in quickly obtaining high-quality solutions due to the large and discrete optimization space, time-consuming utility evaluation for selected options, and complex interactions among options. To address these challenges, we propose TSOA, a Two-Stage Optimization Approach for GCC compilation options to minimize execution time. In the first stage, we present OPPM, an Option Preselection algorithm based on Pattern Mining. OPPM generates diverse samples to cover a wide range of option interactions. It subsequently mines frequent options from both objective-improved and non-improved samples. The mining results are further validated using CRC codes to precisely preselect options and reduce the optimization space. Transitioning to the second stage, we present OSEA, an Option Selection Evolutionary optimization Algorithm. OSEA is grounded in solution preselection and an option interaction graph. The solution preselection employs a random forest to build a classifier, efficiently identifying promising solutions for the next-generation population and thereby reducing the time spent on utility evaluation. Simultaneously, the option interaction graph is built to capture option interplays and their influence on objectives from evaluated solutions. Then, high-quality solutions are generated based on the option interaction graph. We evaluate the performance of TSOA by comparing it with representative machine learning-based and design space exploration approaches across a diverse set of 20 problem instances from two benchmark platforms. Additionally, we validate the effectiveness of OPPM and conduct related ablation experiments. The experimental results show that TSOA outperforms state-of-the-art approaches significantly in both optimization time and solution quality. Moreover, OPPM outperforms other option preselection algorithms, while the effectiveness of random forest-assisted solution preselection, along with new solution generation based on the option interaction graph, has been verified.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140659966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-20; DOI: 10.1007/s10515-024-00438-9
Zilong Ren, Xiaolin Ju, Xiang Chen, Hao Shen
Software vulnerability detection is a critical step in ensuring system security and data protection. Recent research has demonstrated the effectiveness of deep learning in automated vulnerability detection. However, it is difficult for deep learning models to understand the semantics and domain-specific knowledge of source code. In this study, we introduce a new vulnerability detection framework, ProRLearn, which leverages two main techniques: prompt tuning and reinforcement learning. Since existing fine-tuning of pre-trained language models (PLMs) struggles to fully leverage domain knowledge, we introduce a new automatic prompt-tuning technique. Precisely, prompt tuning mimics the pre-training process of PLMs by rephrasing the task input and adding prompts, using the PLM's output as the prediction output. The reinforcement learning reward mechanism guides the behavior of vulnerability detection through a reward and punishment model, encouraging the model to learn effective strategies that maximize long-term rewards and minimize penalties in specific environments, thus enhancing performance. Experiments on three datasets (FFMPeg+Qemu, Reveal, and Big-Vul) indicate that ProRLearn achieves a performance improvement of 3.27–70.96% over state-of-the-art baselines in terms of F1 score. The combination of prompt tuning and reinforcement learning offers a potential opportunity to improve performance in vulnerability detection, meaning it can respond effectively to constantly changing network environments and new threats. This interdisciplinary approach contributes to a better understanding of the interplay between natural language processing and reinforcement learning, opening up new opportunities and challenges for future research and applications.
{"title":"ProRLearn: boosting prompt tuning-based vulnerability detection by reinforcement learning","authors":"Zilong Ren, Xiaolin Ju, Xiang Chen, Hao Shen","doi":"10.1007/s10515-024-00438-9","DOIUrl":"10.1007/s10515-024-00438-9","url":null,"abstract":"<div><p>Software vulnerability detection is a critical step in ensuring system security and data protection. Recent research has demonstrated the effectiveness of deep learning in automated vulnerability detection. However, it is difficult for deep learning models to understand the semantics and domain-specific knowledge of source code. In this study, we introduce a new vulnerability detection framework, ProRLearn, which leverages two main techniques: prompt tuning and reinforcement learning. Since existing fine-tuning of pre-trained language models (PLMs) struggles to leverage domain knowledge fully, we introduce a new automatic prompt-tuning technique. Precisely, prompt tuning mimics the pre-training process of PLMs by rephrasing task input and adding prompts, using the PLM’s output as the prediction output. The introduction of the reinforcement learning reward mechanism aims to guide the behavior of vulnerability detection through a reward and punishment model, enabling it to learn effective strategies for obtaining maximum long-term rewards in specific environments. The introduction of reinforcement learning aims to encourage the model to learn how to maximize rewards or minimize penalties, thus enhancing performance. Experiments on three datasets (FFMPeg+Qemu, Reveal, and Big-Vul) indicate that ProRLearn achieves performance improvement of 3.27–70.96% over state-of-the-art baselines in terms of F1 score. The combination of prompt tuning and reinforcement learning can offer a potential opportunity to improve performance in vulnerability detection. This means that it can effectively improve the performance in responding to constantly changing network environments and new threats. This interdisciplinary approach contributes to a better understanding of the interplay between natural language processing and reinforcement learning, opening up new opportunities and challenges for future research and applications.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140625680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-16; DOI: 10.1007/s10515-024-00428-x
Shayan Hashemi, Mika Mäntylä
With the growth of online services, IoT devices, and DevOps-oriented software development, software log anomaly detection is becoming increasingly important. Prior works mainly follow a traditional four-stage architecture (Preprocessor, Parser, Vectorizer, and Classifier). This paper proposes OneLog, which utilizes a single deep neural network instead of multiple separate components. OneLog harnesses a character-level convolutional neural network (CNN) to take digits, numbers, and punctuation, which were removed in prior works, into account alongside the main natural language text. We evaluate our approach on six message- and sequence-based datasets: HDFS, Hadoop, BGL, Thunderbird, Spirit, and Liberty. We experiment with OneLog in single-, multi-, and cross-project setups. OneLog offers state-of-the-art performance on our datasets. OneLog can utilize multi-project datasets simultaneously during training, which suggests our model can generalize between datasets. Multi-project training also improves OneLog's performance, making it ideal when limited training data is available for an individual project. We also found that cross-project anomaly detection is possible with a single project pair (Liberty and Spirit). Analysis of model internals shows that OneLog has multiple modes of detecting anomalies and that the model learns manually validated parsing rules for the log messages. We conclude that character-based CNNs are a promising approach toward end-to-end learning in log anomaly detection. They offer good performance and generalization over multiple datasets. We will make our scripts publicly available upon the acceptance of this paper.
{"title":"OneLog: towards end-to-end software log anomaly detection","authors":"Shayan Hashemi, Mika Mäntylä","doi":"10.1007/s10515-024-00428-x","DOIUrl":"10.1007/s10515-024-00428-x","url":null,"abstract":"<div><p>With the growth of online services, IoT devices, and DevOps-oriented software development, software log anomaly detection is becoming increasingly important. Prior works mainly follow a traditional four-staged architecture (Preprocessor, Parser, Vectorizer, and Classifier). This paper proposes OneLog, which utilizes a single deep neural network instead of multiple separate components. OneLog harnesses convolutional neural network (CNN) at the character level to take digits, numbers, and punctuations, which were removed in prior works, into account alongside the main natural language text. We evaluate our approach in six message- and sequence-based data sets: HDFS, Hadoop, BGL, Thunderbird, Spirit, and Liberty. We experiment with Onelog with single-, multi-, and cross-project setups. Onelog offers state-of-the-art performance in our datasets. Onelog can utilize multi-project datasets simultaneously during training, which suggests our model can generalize between datasets. Multi-project training also improves Onelog performance making it ideal when limited training data is available for an individual project. We also found that cross-project anomaly detection is possible with a single project pair (Liberty and Spirit). Analysis of model internals shows that one log has multiple modes of detecting anomalies and that the model learns manually validated parsing rules for the log messages. We conclude that character-based CNNs are a promising approach toward end-to-end learning in log anomaly detection. They offer good performance and generalization over multiple datasets. We will make our scripts publicly available upon the acceptance of this paper.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10515-024-00428-x.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140614302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-12; DOI: 10.1007/s10515-024-00436-x
Aritra Sarkar
As bigger quantum processors with hundreds of qubits become increasingly available, the potential for quantum computing to solve problems intractable for classical computers is becoming more tangible. Designing efficient quantum algorithms and software in tandem is key to achieving quantum advantage. Quantum software engineering is challenging due to the unique counterintuitive nature of quantum logic. Moreover, with larger quantum systems, traditional programming using quantum assembly language and qubit-level reasoning is becoming infeasible. Automated Quantum Software Engineering (AQSE) can help to reduce the barrier to entry, speed up development, reduce errors, and improve the efficiency of quantum software. This article elucidates the motivation to research AQSE (why), a precise description of such a framework (what), and reflections on components that are required for implementing it (how).
{"title":"Automated quantum software engineering","authors":"Aritra Sarkar","doi":"10.1007/s10515-024-00436-x","DOIUrl":"10.1007/s10515-024-00436-x","url":null,"abstract":"<div><p>As bigger quantum processors with hundreds of qubits become increasingly available, the potential for quantum computing to solve problems intractable for classical computers is becoming more tangible. Designing efficient quantum algorithms and software in tandem is key to achieving quantum advantage. Quantum software engineering is challenging due to the unique counterintuitive nature of quantum logic. Moreover, with larger quantum systems, traditional programming using quantum assembly language and qubit-level reasoning is becoming infeasible. Automated Quantum Software Engineering (AQSE) can help to reduce the barrier to entry, speed up development, reduce errors, and improve the efficiency of quantum software. This article elucidates the motivation to research AQSE (why), a precise description of such a framework (what), and reflections on components that are required for implementing it (how).</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10515-024-00436-x.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140598066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-10; DOI: 10.1007/s10515-024-00432-1
Andreea Galbin-Nasui, Andreea Vescan
Bug tracking systems receive a large number of bug reports on a daily basis. Maintaining the integrity of the software and producing high-quality software is challenging, and the bug-sorting process is usually a manual task that is time-consuming and prone to human error. The purpose of this research is twofold: first, to conduct a literature review of bug report priority classification approaches, and second, to replicate existing approaches with various classifiers to extract new insights about priority classification. We used a Systematic Literature Review methodology to identify the most relevant existing approaches to the bug report priority classification problem. Furthermore, we conducted a replication study with three classifiers: Naive Bayes (NB), Support Vector Machines (SVM), and Convolutional Neural Network (CNN). Two sets of experiments were performed: first, our own NLTK-based implementation of NB and CNN, and second, experiments based on the Weka implementations of NB, SVM, and CNN. The dataset consists of several Eclipse projects and one project related to database systems. The CNN classifier obtains the best results for bug priority P3, and overall the quality relation between the three classifiers is preserved as in the original studies. The replication study confirmed the findings of the original studies and emphasizes the need to further investigate the relationship between the characteristics of the projects used for training and those used for testing.
{"title":"Bug reports priority classification models. Replication study","authors":"Andreea Galbin-Nasui, Andreea Vescan","doi":"10.1007/s10515-024-00432-1","DOIUrl":"10.1007/s10515-024-00432-1","url":null,"abstract":"<div><p>Bug tracking systems receive a large number of bugs on a daily basis. The process of maintaining the integrity of the software and producing high-quality software is challenging. The bug-sorting process is usually a manual task that can lead to human errors and be time-consuming. The purpose of this research is twofold: first, to conduct a literature review on the bug report priority classification approaches, and second, to replicate existing approaches with various classifiers to extract new insights about the priority classification approaches. We used a Systematic Literature Review methodology to identify the most relevant existing approaches related to the bug report priority classification problem. Furthermore, we conducted a replication study on three classifiers: Naive Bayes (NB), Support Vector Machines (SVM), and Convolutional Neural Network (CNN). Two sets of experiments are performed: first, our own NLTK implementation based on NB and CNN, and second, based on Weka implementation for NB, SVM, and CNN. The dataset used consists of several Eclipse projects and one project related to database systems. The obtained results are better for the bug priority P3 for the CNN classifier, and overall the quality relation between the three classifiers is preserved as in the original studies. The replication study confirmed the findings of the original studies, emphasizing the need to further investigate the relationship between the characteristics of the projects used as training and those used as testing.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140598070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-04; DOI: 10.1007/s10515-024-00434-z
Arif Ali Khan, Muhammad Azeem Akbar, Valtteri Lahtinen, Marko Paavola, Mahmood Niazi, Mohammed Naif Alatawi, Shoayee Dlaim Alotaibi
Quantum software systems represent a new realm in software engineering, utilizing quantum bits (qubits) and quantum gates (Qgates) to solve complex problems more efficiently than their classical counterparts. Agile software development approaches are considered a way to address many inherent challenges in quantum software development, but their effective integration remains unexplored. This study investigates the key causes of challenges that could hinder the adoption of traditional agile approaches in quantum software projects and develops an Agile-Quantum Software Project Success Prediction Model (AQSSPM). First, we identified 19 causes of challenging factors, discussed in our previous study, that potentially impact agile-quantum project success. Second, a survey was conducted to collect expert opinions on these causes, and a Genetic Algorithm (GA) was applied with a Naive Bayes Classifier (NBC) and Logistic Regression (LR) to develop the AQSSPM. Using GA with NBC, the project success probability improved from 53.17 to 99.68%, with costs reduced from 0.463 to 0.403%. Similarly, GA with LR increased success rates from 55.52 to 98.99%, and costs decreased from 0.496 to 0.409% after 100 iterations. The results of both methods showed a strong positive correlation (rs = 0.955) in the ranking of causes, with no significant difference between them (t = 1.195, p = 0.240 > 0.05). The AQSSPM highlights critical focus areas for implementing agile-quantum projects efficiently and successfully while considering the cost factor of a particular project.
{"title":"Agile meets quantum: a novel genetic algorithm model for predicting the success of quantum software development project","authors":"Arif Ali Khan, Muhammad Azeem Akbar, Valtteri Lahtinen, Marko Paavola, Mahmood Niazi, Mohammed Naif Alatawi, Shoayee Dlaim Alotaibi","doi":"10.1007/s10515-024-00434-z","DOIUrl":"10.1007/s10515-024-00434-z","url":null,"abstract":"<div><p>Quantum software systems represent a new realm in software engineering, utilizing quantum bits (Qubits) and quantum gates (Qgates) to solve the complex problems more efficiently than classical counterparts. Agile software development approaches are considered to address many inherent challenges in quantum software development, but their effective integration remains unexplored. This study investigates key causes of challenges that could hinders the adoption of traditional agile approaches in quantum software projects and develop an Agile-Quantum Software Project Success Prediction Model (AQSSPM). Firstly, we identified 19 causes of challenging factors discussed in our previous study, which are potentially impacting agile-quantum project success. Secondly, a survey was conducted to collect expert opinions on these causes and applied Genetic Algorithm (GA) with Naive Bayes Classifier (NBC) and Logistic Regression (LR) to develop the AQSSPM. Utilizing GA with NBC, project success probability improved from 53.17 to 99.68%, with cost reductions from 0.463 to 0.403%. Similarly, GA with LR increased success rates from 55.52 to 98.99%, and costs decreased from 0.496 to 0.409% after 100 iterations. Both methods result showed a strong positive correlation (rs = 0.955) in causes ranking, with no significant difference between them (<i>t</i> = 1.195, <i>p</i> = 0.240 > 0.05). The AQSSPM highlights critical focus areas for efficiently and successfully implementing agile-quantum projects considering the cost factor of a particular project.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10515-024-00434-z.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140598071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-04; DOI: 10.1007/s10515-024-00435-y
Jian Shen, Zhong Li, Yifei Lu, Minxue Pan, Xuandong Li
Deep predictive models have been widely employed in software engineering (SE) tasks due to their remarkable success in artificial intelligence (AI). Most of these models are trained in a supervised manner, and their performance heavily relies on the quality of the training data. Unfortunately, mislabeling, or label noise, is a common issue in SE datasets, which can significantly affect the validity of models trained on them. Although learning-with-noise approaches based on deep learning (DL) have been proposed to address mislabeling in AI datasets, the distinct characteristics of SE datasets in terms of size and data quality raise questions about the effectiveness of these approaches in the SE context. In this paper, we conduct a comprehensive study to understand how mislabeled samples arise in SE datasets, how they impact deep predictive models, and how well existing learning-with-noise approaches perform on SE datasets. Through an empirical evaluation on two representative datasets for the Bug Report Classification and Software Defect Prediction tasks, our study reveals that learning-with-noise approaches have the potential to handle mislabeled samples in SE tasks, but their effectiveness is not always consistent. Our research shows that it is crucial to address mislabeled samples in SE tasks and that effective solutions must take the specific properties of the dataset into account. We also highlight the importance of addressing the potential class distribution changes caused by mislabeled samples and present the limitations of existing approaches for handling them. We therefore urge the development of more advanced techniques to improve the effectiveness and reliability of deep predictive models in SE tasks.
{"title":"Mitigating the impact of mislabeled data on deep predictive models: an empirical study of learning with noise approaches in software engineering tasks","authors":"Jian Shen, Zhong Li, Yifei Lu, Minxue Pan, Xuandong Li","doi":"10.1007/s10515-024-00435-y","DOIUrl":"10.1007/s10515-024-00435-y","url":null,"abstract":"<div><p>Deep predictive models have been widely employed in software engineering (SE) tasks due to their remarkable success in artificial intelligence (AI). Most of these models are trained in a supervised manner, and their performance heavily relies on the quality of training data. Unfortunately, mislabeling or label noise is a common issue in SE datasets, which can significantly affect the validity of models trained on such datasets. Although learning with noise approaches based on deep learning (DL) have been proposed to address the issue of mislabeling in AI datasets, the distinct characteristics of SE datasets in terms of size and data quality raise questions about the effectiveness of these approaches within the SE context. In this paper, we conduct a comprehensive study to understand how mislabeled samples exist in SE datasets, how they impact deep predictive models, and how well existing learning with noise approaches perform on SE datasets. Through an empirical evaluation on two representative datasets for the Bug Report Classification and Software Defect Prediction tasks, our study reveals that learning with noise approaches have the potential to handle mislabeled samples in SE tasks, but their effectiveness is not always consistent. Our research shows that it is crucial to address mislabeled samples in SE tasks. To achieve this, it is essential to take into account the specific properties of the dataset to develop effective solutions. We also highlight the importance of addressing potential class distribution changes caused by mislabeled samples and present the limitations of existing approaches for addressing mislabeled samples. Therefore, we urge the development of more advanced techniques to improve the effectiveness and reliability of deep predictive models in SE tasks.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140598068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}