Software: Practice and Experience: Latest Articles (Page 2)

Bug numbers matter: An empirical study of effort‐aware defect prediction using class labels versus bug numbers

Software: Practice and Experience

Pub Date : 2024-07-10 DOI: 10.1002/spe.3363

Peixin Yang, Ziyao Zeng, Lin Zhu, Yanjiao Zhang, Xin Wang, Chuanxiang Ma, Wenhua Hu

Previous research has used public software defect datasets such as NASA, RELINK, and SOFTLAB, which contain only class label information, and most effort‐aware defect prediction (EADP) studies are built around these datasets. However, EADP studies typically rely on the predicted bug number (i.e., considering modules as effort) or bug density (i.e., considering lines of code as effort) to rank software modules. To explore the impact of bug number information on constructing EADP models, we assess the performance degradation of the best‐performing learning‐to‐rank methods when class labels are used instead of bug numbers for training. The experimental results show that using class labels instead of bug numbers to build EADP models results in a decrease in the detected bugs when modules are considered as effort. When effort is measured in lines of code (LOC), using class labels to construct EADP models can lead to a significant increase in the initial false alarms and in the number of modules that need to be inspected. Therefore, we recommend that not only the class labels but also the bug number information be disclosed when publishing software defect datasets, in order to construct more accurate EADP models.
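The two ranking criteria named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Module` type, the function names, the toy numbers, and the 20% LOC budget are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    loc: int               # lines of code (inspection effort when effort = LOC)
    predicted_bugs: float  # hypothetical model output: predicted bug number


def rank_by_bug_number(modules):
    """Effort = modules: inspect modules with the most predicted bugs first."""
    return sorted(modules, key=lambda m: m.predicted_bugs, reverse=True)


def rank_by_bug_density(modules):
    """Effort = LOC: inspect modules with the most predicted bugs per line first."""
    return sorted(modules, key=lambda m: m.predicted_bugs / max(m.loc, 1), reverse=True)


def bugs_found_within_loc_budget(ranked, actual_bugs, budget_fraction=0.2):
    """Count actual bugs in modules inspected before the LOC budget is spent."""
    budget = budget_fraction * sum(m.loc for m in ranked)
    spent = 0
    found = 0
    for m in ranked:
        if spent + m.loc > budget:
            break  # the next module would exceed the inspection budget
        spent += m.loc
        found += actual_bugs[m.name]
    return found


# Toy data, invented for this sketch.
modules = [
    Module("parser", loc=1000, predicted_bugs=5.0),
    Module("utils", loc=100, predicted_bugs=2.0),
    Module("net", loc=400, predicted_bugs=1.0),
]
actual_bugs = {"parser": 4, "utils": 3, "net": 0}

by_number = [m.name for m in rank_by_bug_number(modules)]
by_density = [m.name for m in rank_by_bug_density(modules)]
found_density = bugs_found_within_loc_budget(rank_by_bug_density(modules), actual_bugs)
found_number = bugs_found_within_loc_budget(rank_by_bug_number(modules), actual_bugs)
```

If a model is trained on class labels only, `predicted_bugs` collapses to a defect likelihood, so modules with one bug and modules with many bugs can receive similar scores. That is the kind of information loss the study measures.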

Citations: 0

Are the smart contracts on Q&A site reliable?

Software: Practice and Experience

Pub Date : 2024-07-01 DOI: 10.1002/spe.3361

Xiaocong Zhou, Quanqi Wang, Yifan Liu, Xiangping Chen, Yuan Huang, Zibin Zheng

Ethereum, as a leading blockchain platform, has attracted a significant number of practitioners. These practitioners require a platform for communication and collaborative problem‐solving, which led to Ethereum Stack Exchange (ESE), a Q&A site dedicated to Ethereum‐related issues. While the Q&A site facilitates communication among practitioners, it also introduces new challenges. Practitioners adopt code snippets from Q&A sites to address the problems they encounter. However, the quality of code snippets on ESE remains largely unexplored. Vulnerabilities and gas‐inefficient patterns in ESE may spread to the code in Ethereum and threaten its regular operation. In this article, we conduct an empirical study investigating the distribution of vulnerabilities and gas‐inefficient patterns in ESE. Further, we analyze the potential impact of vulnerabilities and gas‐inefficient patterns from ESE on Ethereum. However, we encounter a problem when detecting vulnerabilities and gas‐inefficient patterns: established mainstream smart contract analysis tools require complete source code files for thorough analysis, while the code on ESE often consists of incomplete snippets. To address this, we introduce an AST‐based code clone detection technique to construct detectable files corresponding to code snippets. This enables us to detect vulnerabilities and gas‐inefficient patterns in code snippets. Our findings demonstrate that 11.18% of the contract‐level code snippets and 4.06% of function‐level code snippets in ESE have vulnerabilities. In addition, 27.21% of contract‐level code snippets and 17.89% of function‐level code snippets contain gas‐inefficient patterns. The additional consumption caused by the gas‐inefficient patterns in ESE is approximately $1,695,002. Based on these findings, we provide recommendations for both ESE and its users, aiming to foster collaborative efforts and create a more reliable Q&A site for practitioners.

Citations: 0

Special Issue on “Ensuring security for artificial intelligence applications in mobile edge computing software systems”

Software: Practice and Experience

Pub Date : 2024-06-28 DOI: 10.1002/spe.3362

Lianyong Qi, Victor S. Sheng, Xiaolong Xu, Jinjun Chen

Citations: 0

Large language model ChatGPT versus small deep learning models for self‐admitted technical debt detection: Why not together?

Software: Practice and Experience

Pub Date : 2024-06-28 DOI: 10.1002/spe.3360

Jun Li, Lixian Li, Jin Liu, Xiao Yu, Xiao Liu, Jacky Wai Keung

Summary: Given the increasing complexity and volume of Self‐Admitted Technical Debts (SATDs), detecting them efficiently has become critical in software engineering practice for improving code quality and project efficiency. Although current deep learning methods have achieved good performance in detecting SATDs in code comments, they lack explanation. Large language models such as ChatGPT are increasingly being applied to text classification tasks due to their ability to provide explanations for classification results, but it is unclear how effective ChatGPT is for SATD classification. As the first in‐depth study of ChatGPT for SATD detection, we evaluate ChatGPT's effectiveness, compare it with small deep learning models, and find that ChatGPT performs better on Recall, while small models perform better on Precision. Furthermore, to enhance the performance of these approaches, we propose a novel fusion approach named FSATD, which combines ChatGPT with small models for SATD detection so as to provide reliable explanations. Through extensive experiments on 62,276 comments from 10 open‐source projects, we show that FSATD outperforms existing methods in terms of F1‐score in cross‐project scenarios. Additionally, FSATD allows for flexible adjustment of fusion strategies, adapting to the different requirements of various application scenarios, and can achieve the best Precision, Recall, or F1‐score.
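The abstract does not describe how FSATD fuses the two classifiers, so the following is only a generic sketch of why an adjustable fusion rule can trade Precision against Recall. The `fuse` function, the "and"/"or" strategies, and the toy labels are all assumptions invented for illustration; they are not FSATD's actual fusion strategies.

```python
def fuse(llm_label, small_label, strategy):
    """Combine two binary SATD predictions (hypothetical fusion rules)."""
    if strategy == "and":   # flag only when both models agree: favors precision
        return llm_label and small_label
    if strategy == "or":    # flag when either model fires: favors recall
        return llm_label or small_label
    raise ValueError(f"unknown strategy: {strategy}")


def precision_recall(pred, truth):
    """Return (precision, recall) for two equal-length lists of booleans."""
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum(t and not p for p, t in zip(pred, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels: the LLM over-flags (high recall), the small model under-flags (high precision).
truth = [True, True, True, False, False, False]
llm = [True, True, True, True, True, False]
small = [True, True, False, False, False, False]

and_pred = [fuse(a, b, "and") for a, b in zip(llm, small)]
or_pred = [fuse(a, b, "or") for a, b in zip(llm, small)]
and_p, and_r = precision_recall(and_pred, truth)
or_p, or_r = precision_recall(or_pred, truth)
```

On this toy data the "and" rule keeps the small model's precision while the "or" rule keeps the LLM's recall, which mirrors the trade-off the abstract reports between the two model families.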

Citations: 0

FogDEFTKube: Standards‐compliant dynamic deployment of fog service containers

Software: Practice and Experience

Pub Date : 2024-06-15 DOI: 10.1002/spe.3354

Rajesh Thalla, S. Srirama

The traditional cloud‐centric approach in IoT applications lacks the speed and efficiency required for time‐critical tasks, resulting in network inefficiencies. To address this, the notions of Edge and Fog computing have emerged as alternatives. Fog computing facilitates the deployment of services and applications closer to the network's edge, lowering latency and allowing real‐time capabilities. It enhances reliability, fault tolerance, and connectivity in areas with spotty network coverage. Although fog computing overcomes the limitations of cloud‐centric IoT processing, its adoption faces challenges such as platform independence, interoperability, and portability. To tackle these challenges, the FogDEFT (Fog computing out of the box: Dynamic dEployment of Fog service containers with TOSCA) framework was developed. It complies with OASIS‐TOSCA standards and guarantees dynamic deployment of fog services on resource‐constrained devices while leveraging Docker containerization technology to ensure platform independence and interoperability. Because it is tightly coupled with Docker Swarm, which is designed for medium‐sized deployments, the FogDEFT framework is constrained by Docker Swarm's limitations, hindering its ability to effectively manage large‐scale, automated, and resource‐efficient microservice deployments. To address these limitations, we propose FogDEFTKube, an extension of the FogDEFT architecture that incorporates Kubernetes for orchestration, Jenkins for continuous integration and deployment, and a comprehensive redefinition of the core capabilities of the FogDEFT architecture. This offers a promising solution that supports Kubernetes for handling scalable and highly available fog applications with ease while offering CI/CD. FogDEFTKube simplifies the modeling and deployment of fog services while abstracting the complexities of underlying fog networks.

Citations: 0

An automated model‐based testing approach for the self‐adaptive behavior of the unmanned aircraft system application software

Software: Practice and Experience

Pub Date : 2024-06-12 DOI: 10.1002/spe.3358

Zainab Javed, Muhammad Zohaib Iqbal, Muhammad Uzair Khan, M. Usman, A. A. Jilani

The unmanned aircraft system (UAS) is rapidly gaining popularity in civil and military domains. A UAS includes application software that is responsible for defining a UAS mission and its expected behavior. During its mission, a UAS experiences changes (or interruptions) that require the unmanned aerial vehicle (UAV) in the UAS to self‐adapt, that is, to adjust both its behavior and position in real time, particularly for maintaining formation in the case of a UAS swarm. This adaptation is critical as the UAS operates in an open environment, interacting with humans, buildings, and neighboring UAVs. To verify that a UAS adapts correctly, it is important to test it. The current industrial practice for testing the self‐adaptive behaviors in UAS is to carry out testing activities manually. This is particularly true for existing UAS rather than newly developed ones. Manual testing is time‐consuming and allows the execution of only a limited set of test cases. To address this problem, we propose an automated model‐based approach to test the self‐adaptive behavior of UAS application software. The work is conducted in collaboration with an industrial partner and demonstrated through a case study of UAS swarm formation flight application software. Further, the approach is verified on various self‐adaptive behaviors for three open‐source autopilots (i.e., Ardu‐Copter, Ardu‐Plane, and Quad‐Plane). Using the proposed model‐based testing approach, we are able to test sixty unique self‐adaptive behaviors. The testing results show that around 80% of the behavior adaptations are correctly executed by UAS application software.

Citations: 0

Nearest‐neighbor, BERT‐based, scalable clone detection: A practical approach for large‐scale industrial code bases

Software: Practice and Experience

Pub Date : 2024-06-12 DOI: 10.1002/spe.3355

Gul Aftab Ahmed, James Patten, Yuanhua Han, Guoxian Lu, Wei Hou, David Gregg, Jim Buckley, Muslim Chochlov

Hidden code clones negatively impact software maintenance, but manually detecting them in large codebases is impractical. Additionally, automated approaches find detection of syntactically‐divergent clones very challenging. While recent deep neural networks (for example BERT‐based artificial neural networks) seem more effective in detecting such clones, their pairwise comparison of every code pair in the target system(s) is inefficient and scales poorly on large codebases. We present SSCD, a BERT‐based clone detection approach that targets high recall of Type 3 and Type 4 clones at a very large scale (in line with our industrial partner's requirements). It computes a representative embedding for each code fragment and finds similar fragments using a nearest neighbor search. Thus, SSCD avoids the pairwise‐comparison bottleneck of other neural network approaches, while also using a parallel, GPU‐accelerated search to tackle scalability. This article describes the approach, proposing and evaluating several refinements to improve Type 3/4 clone detection at scale. It provides a substantial empirical evaluation of the technique, including a speed/efficacy comparison of the approach against SourcererCC and Oreo, the only other neural‐network approach currently capable of scaling to hundreds of millions of LOC. It also includes a large in‐situ evaluation on our industrial collaborator's code base that assesses the original technique, the impact of the proposed refinements and illustrates the impact of incremental, active learning on its efficacy. We find that SSCD is significantly faster and more accurate than SourcererCC and Oreo. SAGA, a GPU‐accelerated traditional clone detection approach, is a little better than SSCD for T1/T2 clones, but substantially worse for T3/T4 clones. Thus, SSCD is both scalable to industrial code sizes, and comparatively more accurate than existing approaches for difficult T3/T4 clone searching. 
In‐situ evaluation on company datasets shows that SSCD outperforms the baseline approach (CCFinderX) for T3/T4 clones. Whitespace removal and active learning further improve SSCD effectiveness.
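The embed-then-search idea described above can be sketched in a few lines. This is not SSCD itself: the hand-written three-dimensional vectors stand in for BERT embeddings, the search is a brute-force scan rather than a GPU-accelerated index, and the function names and the 0.9 similarity threshold are assumptions made for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def cosine(unit_a, unit_b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(unit_a, unit_b))


def nearest_clones(embeddings, query_id, k=1, threshold=0.9):
    """Return up to k fragment ids most similar to the query, above the threshold."""
    unit = {fid: normalize(v) for fid, v in embeddings.items()}
    query = unit[query_id]
    scored = [(cosine(query, v), fid) for fid, v in unit.items() if fid != query_id]
    scored.sort(reverse=True)  # highest similarity first
    return [fid for score, fid in scored[:k] if score >= threshold]


# Toy embeddings, invented for this sketch (one vector per code fragment).
embeddings = {
    "sort_a": [0.9, 0.1, 0.0],
    "sort_b": [0.8, 0.2, 0.1],
    "http_get": [0.0, 0.1, 0.9],
}
clones_of_sort_a = nearest_clones(embeddings, "sort_a")
clones_of_http_get = nearest_clones(embeddings, "http_get")
```

The point of the design is that each fragment passes through the embedding model once, and similarity is then a cheap vector comparison, instead of running a neural network on every pair of fragments.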

Citations: 0

Balancing performance and comfort in virtual reality: A study of FPS, latency, and batch values

Software: Practice and Experience

Pub Date : 2024-06-11 DOI: 10.1002/spe.3356

Ali Geriş, Baris Cukurbasi, Murat Kilinc, Orkun Teke

This manuscript investigates the relationships among various performance metrics in virtual reality (VR), namely frames per second (FPS), latency, batches, and the number of triangles (tris) and vertices (verts). The study aims to uncover correlations and directional associations between these metrics, shedding light on their impact on VR performance. The findings reveal a significant negative correlation between FPS and latency. Higher FPS values are associated with reduced latency, indicating that a smoother visual experience is accompanied by shorter delays in VR. Conversely, lower FPS values are linked to increased latency, suggesting a potential degradation in overall system responsiveness. Additionally, a strong correlation is observed between latency and batches processed. This finding implies that latency has a direct impact on the system's ability to efficiently process and render objects within VR. Furthermore, a positive correlation is identified between the number of batches and the values of tris and verts. This relationship suggests that higher batch counts are associated with larger quantities of triangles and vertices, reflecting a more complex scene rendering process. Consequently, the performance of VR may be influenced by the density and intricacy of the virtual environments, as indicated by these metrics.
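The abstract does not state which correlation statistic the study uses. As a rough illustration of what a negative FPS/latency correlation and a positive batches/tris correlation look like, here is a Pearson correlation computed over invented samples. The numbers below are hypothetical and are not measurements from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-scene samples, invented for this sketch.
fps = [90, 72, 60, 45, 30]
latency_ms = [11, 14, 17, 22, 33]
batches = [100, 150, 200, 300, 450]
tris = [20_000, 31_000, 39_000, 61_000, 88_000]

r_fps_latency = pearson(fps, latency_ms)   # negative: latency rises as FPS falls
r_batches_tris = pearson(batches, tris)    # positive: both grow with scene complexity
```

A coefficient near -1 or +1 indicates a strong linear association, but it says nothing about causation on its own.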

Citations: 0

Threat models over space and time: A case study of end‐to‐end‐encrypted messaging applications

Software: Practice and Experience

Pub Date : 2024-05-22 DOI: 10.1002/spe.3341

Partha Das Chowdhury, Maria Sameen, Jenny Blessing, Nicholas Boucher, Joseph Gardiner, Tom Burrows, Ross Anderson, Awais Rashid

Threat modeling is one of the foundations of secure systems engineering and must take heed of the context within which systems operate. In this work, we explore the extent to which real‐world systems engineering reflects a changing threat context. We examine the desktop clients of six widely used end‐to‐end‐encrypted mobile messaging applications to understand the extent to which they adjusted their threat model over space (when enabling clients on new platforms, such as desktop clients) and time (as new threats emerged). We experimented with short‐lived adversarial access against these desktop clients and analyzed the results using two popular threat elicitation frameworks, STRIDE and LINDDUN. The results demonstrate that system designers need to track threats in the evolving context within which systems operate and, more importantly, mitigate them by rescoping trust boundaries so that they remain consistent with administrative boundaries. A nuanced understanding of the relationship between trust and administration is vital for robust security, including the provision of safe defaults.

Citations: 0

Gamifying software engineering subject to enhance the quality of knowledge

Software: Practice and Experience

Pub Date : 2024-05-16 DOI: 10.1002/spe.3339

I. Aldalur

Gamification has been widely used in education in recent years and has proven to be a useful tool for students, as it motivates them and helps them to learn. Students' lack of motivation and their poor perception of the importance of software engineering led to the definition of SESGA. The game was designed for, and carried out with, third-year students of the computer science degree, with the goal of improving their knowledge of static code analysis, unit testing, and Git. A software engineering project was carried out in groups, in which each student had to perform common software development tasks. The evaluation measured the accomplishment, challenge, competence, guidance, immersion, playfulness, and social involvement of SESGA, obtaining remarkable results. It also showed that SESGA allowed students to achieve better academic results and that the number of failures decreased.

Citations: 0