Software: Practice and Experience: latest articles, page 8

CloudSim express: A novel framework for rapid low code simulation of cloud computing environments

Software: Practice and Experience

Pub Date : 2023-11-28 DOI: 10.1002/spe.3290

Tharindu B. Hewage, Shashikant Ilager, Maria A. Rodriguez, Rajkumar Buyya

Cloud computing environment simulators enable cost-effective experimentation of novel infrastructure designs and management approaches by avoiding significant costs incurred from repetitive deployments in real Cloud platforms. However, widely used Cloud environment simulators compromise on usability due to complexities in design and configuration, along with the added overhead of programming language expertise. Existing approaches attempting to reduce this overhead, such as script-based simulators and graphical user interface (GUI) based simulators, often compromise on the extensibility of the simulator. Simulator extensibility allows for customization at a fine-grained level, thus reducing it significantly affects flexibility in creating simulations. To address these challenges, we propose an architectural framework to enable human-readable script-based simulations in existing Cloud environment simulators while minimizing the impact on simulator extensibility. We implement the proposed framework for the widely used Cloud environment simulator, the CloudSim toolkit, and compare it against state-of-the-art baselines using a practical use case. The resulting framework, called CloudSim Express, achieves extensible simulations while surpassing baselines with over a 71.43% reduction in code complexity and an 89.42% reduction in lines of code.



Citations: 0

Principled and practical static analysis for Python: Weakest precondition inference of hyperparameter constraints

Software: Practice and Experience

Pub Date : 2023-11-22 DOI: 10.1002/spe.3279

Ingkarat Rak-amnouykit, Ana Milanova, Guillaume Baudart, Martin Hirzel, Julian Dolby

Application programming interfaces often have correctness constraints that cut across multiple arguments. Violating these constraints causes the underlying code to raise runtime exceptions, but at the interface level, these are usually documented at most informally. This article presents novel principled static analysis and the first interprocedural weakest-precondition analysis for Python to extract inter-argument constraints. The analysis is mostly static, but to make it tractable for typical Python idioms, it selectively switches to the concrete domain for some cases. This article focuses on the important case where the interfaces are machine-learning operators and their arguments are hyperparameters, rife with constraints. We extracted hyperparameter constraints for 429 functions and operators from 11 libraries and found real bugs. We used a methodology to obtain ground truth for 181 operators from 8 machine-learning libraries; the analysis achieved high precision and recall for them. Our technique advances static analysis for Python and is a step towards safer and more robust machine learning.
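To make the idea of an inter-argument constraint concrete, here is a minimal sketch. The operator `make_classifier` and its hyperparameters `solver`, `penalty`, and `dual` are hypothetical, invented for illustration; they are not taken from the article or from any library. The predicate `precondition` is a hand-written stand-in for what a weakest-precondition analysis would aim to infer: the weakest condition on the arguments under which the call raises no exception.

```python
import itertools


def make_classifier(solver: str = "lbfgs", penalty: str = "l2", dual: bool = False) -> dict:
    """Hypothetical operator whose validity checks cut across several arguments."""
    if solver == "lbfgs" and penalty == "l1":
        raise ValueError("solver 'lbfgs' does not support penalty 'l1'")
    if dual and solver != "liblinear":
        raise ValueError("dual=True requires solver 'liblinear'")
    return {"solver": solver, "penalty": penalty, "dual": dual}


def precondition(solver: str, penalty: str, dual: bool) -> bool:
    """Hand-derived precondition: negate each raising branch and conjoin the results."""
    return not (solver == "lbfgs" and penalty == "l1") and (not dual or solver == "liblinear")


def raises(solver: str, penalty: str, dual: bool) -> bool:
    """Return True if the hypothetical operator rejects this argument combination."""
    try:
        make_classifier(solver, penalty, dual)
        return False
    except ValueError:
        return True


def check_all() -> bool:
    """Exhaustively confirm the predicate holds exactly when no exception is raised."""
    grid = itertools.product(["lbfgs", "liblinear"], ["l1", "l2"], [False, True])
    return all(precondition(s, p, d) == (not raises(s, p, d)) for s, p, d in grid)
```

Calling `check_all()` compares the predicate with the operator's actual behavior on every combination in a small grid. A static analysis would derive such a predicate from the code rather than by enumeration.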



Citations: 0

Object counting in remote sensing via selective spatial-frequency pyramid network

Software: Practice and Experience

Pub Date : 2023-11-21 DOI: 10.1002/spe.3287

Jinyong Chen, Mingliang Gao, Xiangyu Guo, Wenzhe Zhai, Qilei Li, Gwanggil Jeon

The integration of remote sensing object counting in the Mobile Edge Computing (MEC) environment is of crucial significance and practical value. However, the presence of significant background interference in remote sensing images poses a challenge to accurate object counting, as the results are easily affected by background noise. Additionally, scale variation within remote sensing images presents a further difficulty, as traditional counting methods face challenges in adapting to objects of different scales. To address these challenges, we propose a selective spatial-frequency pyramid network (SSFPNet). Specifically, the SSFPNet consists of two core modules, namely the pyramid attention (PA) module and the hybrid feature pyramid (HFP) module. The PA module accurately extracts target regions and eliminates background interference by operating on four parallel branches. This enables more precise object counting. The HFP module is introduced to fuse spatial and frequency domain information, leveraging scale information from different domains for object counting, so as to improve the accuracy and robustness of counting. Experimental results on RSOC, CARPK, and PUCPR+ benchmark datasets demonstrate that the SSFPNet achieves state-of-the-art performance in terms of accuracy and robustness.



Citations: 0

Context-based transfer learning for low resource code summarization

Software: Practice and Experience

Pub Date : 2023-11-20 DOI: 10.1002/spe.3288

Yi Guo, Yu Chai, Lehuan Zhang, Hui Li, Mengzhi Luo, Shikai Guo

Source code summaries improve the readability and intelligibility of code, help developers understand programs, and improve the efficiency of software maintenance and upgrade processes. Unfortunately, these code comments are often mismatched, missing, or outdated in software projects, so developers must infer functionality from the source code itself, which reduces the efficiency of software maintenance and evolution. Various methods based on neural networks have been proposed to generate code summaries automatically. However, existing work targets resource-rich programming languages such as Java and Python, and these methods may not perform well on low-resource languages. To address these challenges, we propose a context-based transfer learning model for low resource code summarization (LRCS), which learns common information from a resource-rich language and then transfers it to the target language model for further learning. It consists of two components: the summary generation component is used to learn the syntactic and semantic information of the code, and the learning transfer component is used to improve the generalization ability of the model in the learning process of cross-language code summarization. Experimental results show that LRCS outperforms baseline methods in code summarization in terms of sentence-level BLEU, corpus-level BLEU and METEOR. For example, LRCS improves corpus-level BLEU scores by 52.90%, 41.10%, and 14.97%, respectively, compared to baseline methods.



Citations: 0

PD-Gait: Contactless and privacy-preserving gait measurement of Parkinson's disease patients using acoustic signals

Software: Practice and Experience

Pub Date : 2023-11-20 DOI: 10.1002/spe.3289

Zeshui Li, Yang Pan, Haipeng Dai, Wenhao Zhang, Zhen Li, Wei Wang, Guihai Chen

In this article, we propose a mobile edge computing (MEC)-related system named PD-Gait, which can measure gait parameters of Parkinson's disease patients in a contactless and privacy-preserving manner. We utilize inaudible acoustic signals and band-pass filters to protect private data at the physical layer. The proposed framework can be easily deployed on the mobile end of MEC, and hence relieves the edge server of fighting cybersecurity attacks. The gait parameters include stride cycle time length and moving speed, and hence provide an objective basis for the doctors' judgment. PD-Gait utilizes acoustic signals in bands from 16 to 23 kHz to achieve device-free sensing, which releases both doctors and patients from the tedious wearing process and psychological burden caused by traditional wearable devices. To achieve robust measurement, we propose a novel acoustic ranging method to avoid “broken tones” and “uneven peak distribution” in the received data. The corresponding ranging accuracy is 0.1 m. We also propose auto-focus micro-Doppler features to extract robust stride cycle time length, achieving an accuracy of 0.052 s. We deployed PD-Gait in a brain hospital and collected data from 8 patients. The total walked distance is over 330 m. In terms of the overall trend, our results are highly correlated with the doctor's judgment.



Citations: 0

The effect of distance metrics in a general purpose synthesizer of imperative programs: A second empirical study using enlarged search spaces

Software: Practice and Experience

Pub Date : 2023-11-15 DOI: 10.1002/spe.3286

Alexandre R. S. Correia, Juliano M. Iyoda, Alexandre C. Mota

Program synthesis is the task of automatically finding a program that satisfies the user intention. In previous work, we developed APS-GA, a program synthesizer based on a genetic algorithm. As genetic algorithms depend on a fitness function, so does APS-GA. Researchers argue that different distance metrics for a fitness function may reveal behavioral differences in the genetic algorithm. More recently, we presented initial evidence that APS-GA was not affected by different distance metrics for its fitness function. However, that study was carried out on a medium-sized scale.
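As a rough illustration of what "different distance metrics for a fitness function" can mean, the sketch below scores candidate programs by the distance between their outputs and the expected outputs on a set of input/output examples. It is not APS-GA's fitness function: the function names, the choice of Manhattan, Euclidean, and Hamming distances, and the example data are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Distance = Callable[[Sequence[int], Sequence[int]], float]


def manhattan(a: Sequence[int], b: Sequence[int]) -> float:
    """Sum of absolute differences between paired outputs."""
    return float(sum(abs(x - y) for x, y in zip(a, b)))


def euclidean(a: Sequence[int], b: Sequence[int]) -> float:
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming(a: Sequence[int], b: Sequence[int]) -> float:
    """Number of positions where the outputs differ at all."""
    return float(sum(x != y for x, y in zip(a, b)))


def fitness(candidate: Callable[[int], int],
            examples: List[Tuple[int, int]],
            distance: Distance) -> float:
    """Lower is better; 0.0 means the candidate matches every example."""
    actual = [candidate(x) for x, _ in examples]
    expected = [y for _, y in examples]
    return distance(actual, expected)


# Illustrative specification: input/output pairs of the target f(x) = 2x + 1.
EXAMPLES = [(0, 1), (1, 3), (2, 5)]
```

On these examples the two incorrect candidates `lambda x: 2 * x` and `lambda x: x + 1` tie under `manhattan` (both score 3.0), while `euclidean` ranks the first as closer and `hamming` ranks the second as closer. This is the kind of ranking difference that could, in principle, steer a genetic search differently.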


Citations: 0

Usefulness of open domain model for identifying missing software requirements concepts

Software: Practice and Experience

Pub Date : 2023-11-05 DOI: 10.1002/spe.3285

Ziyan Zhao, Li Zhang, Xiaoli Lian

Detecting missing requirements during software development is crucial to avoid unexpected consequences. However, this task is challenging due to limited domain knowledge of requirements analysts and the dynamic nature of software requirements. Previous studies have shown that requirement‐oriented domain models can help identify omissions in requirements, but they are often incomplete for many domains. Meanwhile, domain models constructed from other artifacts are available online. This raises the question: Can these domain models be useful in identifying missing functional information in requirement specifications? To address this question, we conducted a study to measure the overlap between entities in domain models and requirements. We analyzed the occurrence of overlapped entities, considering four distribution characteristics: the type of entities in the domain model, the distribution of mapped entities in the domain model, the family belonging of the mapped entities in the domain model, and the distribution of mapped entities in the requirements. Based on our findings, we proposed recommendations for missing requirements. Additionally, we performed experiments, including the use of the proposed metric “ancestors of the highest level with the most mapped entities” (AHME). The results showed significant improvements with gains of 146% and 223% in the two domains, highlighting the benefits of these distribution characteristics.
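The notion of overlap between domain-model entities and requirements can be sketched in a few lines. This is only a toy under stated assumptions: entities are matched by case-insensitive substring search, which is a stand-in for whatever mapping procedure the study actually used, and the entity names and requirement sentences are invented for illustration. The AHME metric is not reproduced here.

```python
from typing import List, Set, Tuple


def split_entities(domain_entities: Set[str], requirements: List[str]) -> Tuple[List[str], List[str]]:
    """Partition domain entities into those mentioned in the requirements and those not.

    Matching is a naive substring test (so "account" would also match "accounting").
    """
    text = " ".join(requirements).lower()
    mapped = sorted(e for e in domain_entities if e.lower() in text)
    unmapped = sorted(e for e in domain_entities if e.lower() not in text)
    return mapped, unmapped


def overlap_ratio(domain_entities: Set[str], requirements: List[str]) -> float:
    """Fraction of domain entities that appear somewhere in the requirements."""
    if not domain_entities:
        return 0.0
    mapped, _ = split_entities(domain_entities, requirements)
    return len(mapped) / len(domain_entities)


# Invented example data.
DOMAIN = {"account", "invoice", "refund", "customer"}
REQS = [
    "The customer shall view each invoice.",
    "The system shall lock an account after five failed logins.",
]
```

With this data, `split_entities(DOMAIN, REQS)` maps `account`, `customer`, and `invoice`, and leaves `refund` unmapped. An unmapped entity is a candidate for a concept the requirements may have omitted, which an analyst would then review.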



Citations: 0

DRS: A deep reinforcement learning enhanced Kubernetes scheduler for microservice‐based system

Software: Practice and Experience

Pub Date : 2023-10-25 DOI: 10.1002/spe.3284

Zhaolong Jian, Xueshuo Xie, Yaozheng Fang, Yibing Jiang, Ye Lu, Ankan Dash, Tao Li, Guiling Wang

Kubernetes, the most famous container orchestration framework, is now widely used to manage and schedule the resources of microservices in cloud‐native distributed applications. However, Kubernetes preferentially schedules microservices to nodes with rich and balanced CPU and memory resources on a single node. The native scheduler of Kubernetes, called Kube‐scheduler, may cause resource fragmentation and decrease resource utilization. In this paper, we propose a deep reinforcement learning enhanced Kubernetes scheduler named DRS. We first frame the Kubernetes scheduling problem as a Markov decision process with intricately designed state, action, and reward structures in an effort to increase resource usage and decrease load imbalance. Then, we design and implement a DRS monitor to perceive six parameters concerning resource utilization and create a thorough picture of all available resources globally. Finally, DRS can automatically learn the scheduling policy through interaction with the Kubernetes cluster, without relying on expert knowledge about workload and cluster status. We implement a prototype of DRS in a Kubernetes cluster with five nodes and evaluate its performance. Experimental results highlight that DRS overcomes the shortcomings of Kube‐scheduler and achieves the expected scheduling target with three workloads. With only 3.27% CPU overhead and 0.648% communication delay, DRS outperforms Kube‐scheduler by 27.29% in terms of resource utilization and reduces load imbalance by 2.90 times on average.
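Framing scheduling as a Markov decision process means choosing a state, an action, and a reward. The sketch below shows one possible shape of such a framing; it is not the DRS design. It assumes a state made of per-node CPU and memory utilization only (the abstract mentions six monitored parameters without listing them), an action that is the index of the node receiving the next pod, and a reward equal to mean utilization minus load imbalance. The `greedy_action` function is a one-step greedy stand-in for a learned policy, not deep reinforcement learning.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List


@dataclass(frozen=True)
class Node:
    cpu_used: float  # fraction of CPU in use, 0.0 to 1.0
    mem_used: float  # fraction of memory in use, 0.0 to 1.0

    def load(self) -> float:
        """Single load figure per node: the average of CPU and memory utilization."""
        return (self.cpu_used + self.mem_used) / 2


def place(state: List[Node], action: int, cpu: float, mem: float) -> List[Node]:
    """Transition: return the state after scheduling a pod on node `action`."""
    nxt = list(state)
    node = state[action]
    nxt[action] = Node(node.cpu_used + cpu, node.mem_used + mem)
    return nxt


def reward(state: List[Node]) -> float:
    """Assumed reward: mean utilization minus imbalance (population standard deviation)."""
    loads = [n.load() for n in state]
    return mean(loads) - pstdev(loads)


def greedy_action(state: List[Node], cpu: float, mem: float) -> int:
    """Pick the feasible node with the best next-step reward.

    Raises ValueError if the pod fits on no node.
    """
    feasible = [i for i, n in enumerate(state)
                if n.cpu_used + cpu <= 1.0 and n.mem_used + mem <= 1.0]
    return max(feasible, key=lambda i: reward(place(state, i, cpu, mem)))
```

For a cluster such as `[Node(0.8, 0.8), Node(0.2, 0.2), Node(0.5, 0.5)]` and a pod requesting 0.1 of each resource, the mean load is the same wherever the pod lands, so the imbalance term decides and the least loaded node (index 1) is chosen. A learned policy would replace `greedy_action` and could account for longer-term effects that a one-step rule ignores.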



Citations: 0

Managing asynchronous workloads in a multi‐tenant microservice enterprise environment

Software: Practice and Experience

Pub Date : 2023-10-20 DOI: 10.1002/spe.3278

Cesar Batista, Felipe Morais, Everton Cavalcante, Thais Batista, Bruno Proença, William Breno Rodrigues Cavalcante

A multi‐tenant microservice architecture involving components with asynchronous interactions and batch jobs requires efficient strategies for managing asynchronous workloads. This article addresses this issue in the context of a leading company developing tax software solutions for many national and multi‐national corporations in Brazil. A critical process provided by the company's cloud‐based solutions is tax integration, which involves coordinating complex tax calculation tasks and must be supported by asynchronous operations that use a message broker to guarantee order correctness. We explored and implemented two approaches for managing asynchronous workloads related to tax integration within a multi‐tenant microservice architecture in the company's context: (i) a polling‐based approach that employs a queue as a distributed lock (DL) and (ii) a push‐based approach named single active consumer (SAC) that relies on the message broker's logic to deliver messages. Both approaches aim to achieve efficient resource allocation as the number of container replicas and tenants grows. In this article, we evaluate the correctness and performance of the DL and SAC approaches to shed light on how asynchronous workloads impact the management of multi‐tenant microservice architectures from delivery and deployment perspectives.
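To make the contrast between the two approaches concrete, the following is a minimal in-memory sketch, not the authors' implementation. It assumes one reading of the abstract: in the DL approach, replicas poll and must take a single token from a lock queue before processing a tenant's next message; in the SAC approach, the broker pushes messages only to one active replica and fails over when that replica leaves. All names (`QueueLock`, `poll_once`, `ToyBroker`, the replica and message labels) are hypothetical.

```python
import queue
from collections import deque


class QueueLock:
    """Assumed DL reading: a queue holding one token acts as the lock."""

    def __init__(self):
        self._tokens = queue.Queue(maxsize=1)
        self._tokens.put("token")

    def try_acquire(self):
        try:
            self._tokens.get_nowait()  # taking the token means holding the lock
            return True
        except queue.Empty:
            return False

    def release(self):
        self._tokens.put_nowait("token")


def poll_once(replica, lock, pending, processed):
    """One polling attempt: process the oldest message only if the lock is free."""
    if not pending or not lock.try_acquire():
        return False
    try:
        processed.append((replica, pending.popleft()))
    finally:
        lock.release()
    return True


class ToyBroker:
    """Assumed SAC reading: the broker pushes only to the earliest registered replica."""

    def __init__(self):
        self.pending = deque()
        self.consumers = []

    def subscribe(self, replica):
        self.consumers.append(replica)

    def unsubscribe(self, replica):
        self.consumers.remove(replica)

    def deliver_next(self):
        if not self.consumers or not self.pending:
            return None
        return self.consumers[0], self.pending.popleft()


# DL demo: a replica that polls while the lock is held gets nothing.
lock = QueueLock()
pending = deque(["calc-1", "calc-2"])
processed = []
poll_once("replica-1", lock, pending, processed)
held = lock.try_acquire()  # simulate replica-1 still holding the lock
blocked = poll_once("replica-2", lock, pending, processed)
lock.release()
poll_once("replica-2", lock, pending, processed)

# SAC demo: delivery order is preserved across a failover.
broker = ToyBroker()
broker.subscribe("replica-1")
broker.subscribe("replica-2")
broker.pending.extend(["calc-1", "calc-2", "calc-3"])
log = [broker.deliver_next()]
broker.unsubscribe("replica-1")  # the active replica goes away
log.append(broker.deliver_next())
log.append(broker.deliver_next())
```

In both toy versions at most one replica handles a tenant's messages at a time, which is what preserves ordering. The difference is who coordinates: the replicas themselves (polling for the lock) or the broker (choosing the active consumer).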



Citations: 1

Code smells in pull requests: An exploratory study

Software: Practice and Experience

Pub Date : 2023-10-20 DOI: 10.1002/spe.3283

Muhammad Ilyas Azeem, Saad Shafiq, Atif Mashkoor, Alexander Egyed

The quality of a pull request is the primary factor integrators consider when accepting or rejecting it. Code smells indicate sub‐optimal design or implementation choices in the source code that often lead to fault‐prone outcomes, threatening the quality of pull requests. This study explores code smells in 21k pull requests from 25 popular Java projects. We find that both accepted (37%) and rejected (44%) pull requests have code smells, mainly god classes and long methods. We also observe that smelly pull requests are more complex and harder to understand: they are significantly larger, have longer latency, attract more discussion and review comments, and are submitted by less experienced contributors. Our results show that features used in previous studies for pull request acceptance prediction could potentially be employed to predict smells in incoming pull requests. We propose a dynamic approach to predict the presence of such code smells in newly added pull requests. We evaluate our approach on a dataset of 25 Java projects extracted from GitHub. We further conduct a benchmark study to compare the performance of eight machine learning classifiers. The results of the benchmark study show that XGBoost is the best‐performing classifier for smell prediction.
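For readers unfamiliar with the two smells named above, the sketch below flags them using size alone. It is a deliberately simplified illustration, not the detection or prediction method used in the study: the thresholds, the `ClassInfo` and `MethodInfo` records, and the `find_smells` helper are all hypothetical, and real detectors typically combine several metrics.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds, chosen for illustration only.
LONG_METHOD_LINES = 50
GOD_CLASS_LINES = 500
GOD_CLASS_METHODS = 20


@dataclass
class MethodInfo:
    name: str
    lines: int


@dataclass
class ClassInfo:
    name: str
    lines: int
    methods: list = field(default_factory=list)


def find_smells(cls):
    """Return (smell, location) pairs for one class, based on size only."""
    smells = []
    if cls.lines > GOD_CLASS_LINES or len(cls.methods) > GOD_CLASS_METHODS:
        smells.append(("god class", cls.name))
    for method in cls.methods:
        if method.lines > LONG_METHOD_LINES:
            smells.append(("long method", f"{cls.name}.{method.name}"))
    return smells


demo = ClassInfo(
    "OrderManager",
    lines=620,
    methods=[MethodInfo("process", 80), MethodInfo("validate", 12)],
)
smells = find_smells(demo)
```

Here `smells` flags `OrderManager` as a god class and `OrderManager.process` as a long method, while the short `validate` method is left alone.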



Citations: 0