
Journal of Systems and Software: Latest Publications

Exploring quality aspects of customer self-service in IT service provision: A case study
IF 4.1 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-12-12 | DOI: 10.1016/j.jss.2025.112725
Marko Jäntti, Henri Lindström
Customer self-service (CSS) plays a critical role in IT service provider organizations’ service business. Self-service can substantially reduce the volume of IT support tickets, such as service requests and incidents, and provide customers and service users with 24/7 access to IT services. Self-service channels can also be used to offer solutions to service-related problems and answers to common issues. Modern customer self-service is implemented in practice by aggregating various self-service technologies and service management practices. As technologies evolve, so does the concept of quality in the context of customer self-service. The purpose of this research is to explore quality aspects related to modern IT customer support, especially the customer self-service portal (SSP). The research problem of the study is: how can the quality of customer self-service be improved in the context of IT service provision? This paper presents a unique case study addressing the quality of modern self-service technologies. A cross-case synthesis of four IT self-service portal deployment cases was used to derive a novel multi-dimensional self-service quality model with three dimensions: technology, management, and organization.
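The abstract names three quality dimensions (technology, management, organization) without detailing their contents. A minimal sketch of how such a multi-dimensional quality model could be recorded and aggregated is shown below; all attribute names and the 0-5 scoring scale are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass, field

# Hypothetical three-dimension self-service quality checklist; the attribute
# names and the 0-5 scale are illustrative, not the paper's actual model.
@dataclass
class SelfServiceQualityModel:
    technology: dict = field(default_factory=dict)   # e.g. portal usability, availability
    management: dict = field(default_factory=dict)   # e.g. knowledge-base upkeep, SLAs
    organization: dict = field(default_factory=dict) # e.g. staff training, governance

    def score(self) -> float:
        """Average all recorded attribute scores across the three dimensions."""
        values = [v for d in (self.technology, self.management, self.organization)
                  for v in d.values()]
        return sum(values) / len(values) if values else 0.0

m = SelfServiceQualityModel(
    technology={"portal_usability": 4, "availability_24_7": 5},
    management={"kb_article_freshness": 3},
    organization={"staff_training": 4},
)
```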
Citations: 0
An architecture framework for architecting IoT applications: From design to deployment
IF 4.1 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-12-10 | DOI: 10.1016/j.jss.2025.112728
Moamin Abughazala, Mohammad Sharaf, Mai Abusair, Henry Muccini
Context - The Internet of Things (IoT) refers to a distributed network of smart, connected devices that collaboratively sense, process, and act upon real-world environments. Designing such systems requires managing complex architectural concerns spanning software logic, hardware configuration, and spatial deployment, as well as validating non-functional properties such as energy consumption and communication efficiency. Objective - To provide a unified, architecture-centric framework that supports the description, simulation, and automated code generation of IoT applications across software, hardware, and physical-space dimensions. Method - We use Model-Driven Engineering (MDE) approaches to develop CAPS, a framework that uniquely integrates multi-view architectural modeling, energy- and traffic-aware simulation via CupCarbon, and seamless generation of deployable Arduino code from high-level design models. Result - CAPS enables a traceable and cohesive development process from architectural design to physical deployment. Case studies from diverse domains demonstrate its ability to improve modeling expressiveness, maintain transformation fidelity, and reduce development time through automation. Conclusion - CAPS unifies architectural modeling, simulation, and code generation into a novel, end-to-end toolchain, addressing fragmentation in the IoT development lifecycle and enhancing early validation and traceability.
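The Method paragraph describes generating deployable Arduino code from high-level design models. A toy sketch of that model-to-code idea is given below: a tiny component model is rendered into Arduino-flavoured C source text. The model schema (`name`, `pins`, `period_ms`) and the template are invented for illustration and are not CAPS's actual metamodel or transformation.

```python
# Minimal model-to-code sketch in the spirit of CAPS: a high-level component
# model (a plain dict here) is rendered into Arduino-flavoured C source.
# The model schema and template are illustrative assumptions.

def generate_arduino_sketch(model: dict) -> str:
    pin_decls = "\n".join(
        f"const int {name} = {pin};" for name, pin in model["pins"].items()
    )
    reads = "\n  ".join(
        f"int {name}_value = analogRead({name});" for name in model["pins"]
    )
    return (
        f"// generated from model '{model['name']}'\n"
        f"{pin_decls}\n\n"
        "void setup() {\n  Serial.begin(9600);\n}\n\n"
        "void loop() {\n"
        f"  {reads}\n"
        f"  delay({model['period_ms']});\n"
        "}\n"
    )

model = {"name": "TempSensorNode", "pins": {"tempPin": 0}, "period_ms": 1000}
code = generate_arduino_sketch(model)
```

A real MDE pipeline would validate the model against a metamodel and use a template engine rather than string concatenation, but the traceable model-to-deployment flow is the same.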
Citations: 0
Leveraging syntactic dual-graph representations for security patch identification via structural latent alignment
IF 4.1 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-12-10 | DOI: 10.1016/j.jss.2025.112742
Jiajun Tong, Zhixiao Wang, Xiaobin Rui
Security patch identification aims to automatically detect security-relevant commits among a vast number of code diffs. Previous works significantly improve the performance of automatic security patch identification by introducing pre-trained models. However, most existing approaches fail to capture the structural evolution across code diffs, which is essential for understanding how code behavior changes during patching. Furthermore, the changes in code diffs are usually subtle, resulting in sparse representations of structural changes and making it difficult to distinguish the effects of different structural changes. Finally, none of the existing methods can fully capture the syntactic and semantic information in code diffs and commit messages. In this work, we propose DualGraphPatcher, a framework designed to learn well-structured and discriminative features in the latent space by jointly modeling semantic information and structural code evolution. To capture fine-grained code structural changes, instead of relying on a single snapshot of a function, we construct two independent Abstract Syntax Tree (AST) structures for the pre-change and post-change versions from the original code diffs to model the entire graph topology of each version, enabling explicit modeling of structural evolution across code diffs. To alleviate the sparse representation of structural changes, we propose a latent structural alignment module, which performs soft clustering over the representations of pre-change and post-change ASTs and minimizes their distributional divergence in the shared latent space. To jointly learn syntactic and semantic information from code diffs and commit messages, we design a tri-encoder architecture, where CodeBERT and BERT extract semantic embeddings from code diffs and commit messages, and a GCN encodes the syntactic structures of pre-change and post-change ASTs. Experiments on three real-world datasets demonstrate that DualGraphPatcher consistently outperforms state-of-the-art baselines in security patch identification, validating the effectiveness of dual-graph modeling, latent structural alignment, and the tri-encoder design. The code and data are shared at https://github.com/AppleMax1992/DualGraphPatcher.
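The latent structural alignment module described above soft-clusters pre-change and post-change AST representations and minimizes their distributional divergence. The following is a dependency-free numerical sketch of that idea: node embeddings are softly assigned to cluster centers via a softmax over negative squared distances, and the divergence between the two averaged assignment distributions serves as the alignment loss. The 2-D embeddings, distance kernel, and KL choice are illustrative assumptions, not the paper's exact formulation.

```python
import math

# Sketch of latent structural alignment: soft-cluster node embeddings of the
# pre-change and post-change ASTs, then measure the divergence between the
# two averaged cluster-assignment distributions. All shapes and the distance
# kernel are assumptions for illustration.

def soft_assign(embeddings, centers):
    """Softmax over negative squared distances, averaged over all nodes."""
    k = len(centers)
    totals = [0.0] * k
    for e in embeddings:
        logits = [-sum((a - b) ** 2 for a, b in zip(e, c)) for c in centers]
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        s = sum(exps)
        for i in range(k):
            totals[i] += exps[i] / s
    return [t / len(embeddings) for t in totals]

def kl_divergence(p, q, eps=1e-9):
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

centers = [(0.0, 0.0), (1.0, 1.0)]
pre = [(0.1, 0.0), (0.9, 1.0)]    # toy pre-change node embeddings
post = [(0.0, 0.2), (1.0, 0.8)]   # toy post-change node embeddings
alignment_loss = kl_divergence(soft_assign(pre, centers), soft_assign(post, centers))
```

In training, this loss would be minimized jointly with the classification objective so that structurally similar diffs land close together in the shared latent space.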
Citations: 0
HiFlaky: Hierarchy-aware flakiness classification
IF 4.1 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-12-09 | DOI: 10.1016/j.jss.2025.112741
Zheyuan Li, Zhenyu Wu, Yan Lei, Huan Xie, Maojin Li, Jian Hu
Flaky tests present a major challenge in software testing, as they intermittently pass or fail without changes to the source code, leading to developer frustration and wasted resources. Current methods for detecting and classifying flaky tests often overlook the hierarchical dependencies among the root causes of flakiness in test code. Additionally, existing tools fail to handle complex cases involving multiple flaky causes and lack the capability for fine-grained classification.

To this end, we propose HiFlaky, a hierarchy-aware multi-label classification method for flaky tests that identifies multiple co-occurring root causes along different hierarchical paths, such as diagnosing a single test with both “Flaky/NOD/Network” and “Flaky/NOD/Concurrency/Async Wait”. HiFlaky leverages static semantic features from test code together with hierarchy features, without depending on dynamic execution features. To overcome the shortage of multi-root annotated data, we build a new dataset by expanding and labeling existing single-root datasets (IDoFT and FlakeFlagger) and 335 flaky tests from 145 Java projects on GitHub. Empirical evaluations demonstrate the effectiveness of HiFlaky in addressing both single- and multi-root-cause scenarios. In single-root-cause scenarios, HiFlaky exhibits higher prediction accuracy than state-of-the-art methods, achieving a 30% increase in precision and an F1 score of 79%. Furthermore, the fine-grained classification offered by HiFlaky provides useful insights that can facilitate root cause analysis, reduce debugging effort, and contribute to improved software reliability. For complex multi-root-cause scenarios, HiFlaky attains a Micro-F1 score of 0.812 and a Macro-F1 score of 0.414 across 29 categories.
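Hierarchy-aware multi-label classification typically treats a leaf label such as "Flaky/NOD/Concurrency/Async Wait" as implying all of its ancestors, so training targets mark every prefix of the path. The sketch below shows that expansion step; the label strings follow the examples in the abstract, but the expansion scheme itself is a common convention assumed here, not necessarily HiFlaky's exact encoding.

```python
# Expand leaf labels into all implied ancestor paths, as is common in
# hierarchy-aware multi-label classification. Label strings follow the
# abstract's examples; the expansion scheme is an illustrative assumption.

def expand_hierarchy(leaf_labels):
    """Return the set of all path prefixes implied by the given leaf labels."""
    expanded = set()
    for leaf in leaf_labels:
        parts = leaf.split("/")
        for i in range(1, len(parts) + 1):
            expanded.add("/".join(parts[:i]))
    return expanded

targets = expand_hierarchy(["Flaky/NOD/Network", "Flaky/NOD/Concurrency/Async Wait"])
```

A classifier trained against such targets is penalized for predicting a leaf without its ancestors, which encodes the hierarchical dependencies the abstract says flat methods overlook.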
Citations: 0
DEzzer: Efficient Fuzzing Mutation Scheduling Based on Differential Evolution
IF 4.1 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-12-09 | DOI: 10.1016/j.jss.2025.112740
Jinfu Chen, Wenjun Feng, Saihua Cai, Xingquan Mao, Zhixiang Zhang, Jingyi Chen, Yisong Liu
In recent years, mutation-based fuzzing approaches have gained widespread attention in the field of software security research. The effectiveness of such approaches relies on mutation scheduling, where classical strategies apply random mutation operators to test cases to produce a varied range of mutated test cases. To boost these strategies, rules and methods have been proposed to guide the selection of mutation operators, thereby exploring a wider range of the input space. However, test cases generated by these approaches tend to be biased toward specific paths or code regions, making it challenging to effectively explore other areas and identify potential vulnerabilities. In this paper, we propose a novel mutation scheduling approach called DEzzer. Using a customized differential evolution strategy and an effective balance of multiple feedback signals, it optimizes the probability distribution over mutation operators. We evaluated DEzzer on the GNU Binutils suite, five independent real-world programs, and the LAVA-M dataset. The results show that DEzzer outperformed advanced mutation schedulers and the AFL baseline in terms of coverage on FuzzBench, successfully identifying more unique crashes. Furthermore, we analyze these crashes, pinpointing the specific locations of the vulnerabilities.
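The core idea above is evolving a probability distribution over mutation operators with differential evolution. A toy sketch follows: candidate probability vectors are perturbed as a + F·(b - c), clipped, renormalised, and kept only if a fitness score improves. The synthetic fitness function, population size, and constants are placeholders standing in for DEzzer's coverage feedback, not its actual algorithm.

```python
import random

# Toy differential-evolution step over mutation-operator probability vectors.
# The fitness function below is a synthetic stand-in for coverage feedback.

def normalise(v):
    """Clip to positive values and renormalise into a probability vector."""
    v = [max(x, 1e-6) for x in v]
    s = sum(v)
    return [x / s for x in v]

def de_step(population, fitness, f=0.5):
    """One DE generation: trial = a + f*(b - c), with greedy selection."""
    new_pop = []
    for i, target in enumerate(population):
        others = [p for j, p in enumerate(population) if j != i]
        a, b, c = random.sample(others, 3)
        trial = normalise([ai + f * (bi - ci) for ai, bi, ci in zip(a, b, c)])
        new_pop.append(trial if fitness(trial) > fitness(target) else target)
    return new_pop

random.seed(0)
fitness = lambda p: p[2]  # pretend operator 2 yields the most new coverage
pop = [normalise([random.random() for _ in range(4)]) for _ in range(6)]
initial_best = max(fitness(p) for p in pop)
for _ in range(30):
    pop = de_step(pop, fitness)
final_best = max(fitness(p) for p in pop)
```

Because selection is greedy, the best fitness never decreases across generations; in a real fuzzer the fitness would combine multiple feedback signals (coverage, crash counts) rather than a single component.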
Citations: 0
Peeking inside the black box: Training data exposure in code language models
IF 4.1 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-12-09 | DOI: 10.1016/j.jss.2025.112729
Angelica Spina, Marco Russodivito, Simone Scalabrino, Rocco Oliveto
Large Language Models (LLMs) have proven effective at tackling coding tasks, leading to their growing popularity in commercial solutions such as GitHub Copilot and ChatGPT. These models, however, may be trained on proprietary code, raising concerns about potential leaks of intellectual property. A recent study indicates that LLMs can memorize parts of their source-code training data, rendering them vulnerable to extraction attacks. However, it used white-box attacks, which assume that adversaries have partial knowledge of the training set.

In this paper, we present a pioneering effort to conduct a black-box reconstruction attack on an LLM – CodeT5+ – trained to tackle a specific coding task, code summarization. We assume the adversary has no knowledge of the training set. We train an inverse model, i.e., a model that, given a comment, aims to reconstruct the corresponding source code from the training set. Then, we try to understand to what extent such a model can reconstruct the code in the training set. Our results show that the attack through the inverse model does not allow an adversary to fully reconstruct training code instances, except in a minority of cases. On the other hand, an in-depth manual analysis of the reconstructed code reveals that some important information (such as the APIs adopted) can be extracted in several cases, showing the potential vulnerability of such models.
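Measuring how often an inverse model "fully reconstructs" a training instance requires a similarity metric and a threshold. The sketch below quantifies that with `difflib` sequence similarity over toy snippet pairs; the 0.95 threshold and the snippets are illustrative assumptions, and the paper's actual metric may differ.

```python
import difflib

# Illustrative measurement of reconstruction success: compare each snippet
# recovered by the inverse model against the true training snippet and count
# near-exact matches. Threshold and toy data are assumptions.

def reconstruction_rate(pairs, threshold=0.95):
    """Fraction of (original, reconstructed) pairs above a similarity threshold."""
    hits = sum(
        difflib.SequenceMatcher(None, orig, rec).ratio() >= threshold
        for orig, rec in pairs
    )
    return hits / len(pairs)

pairs = [
    # verbatim memorization: an exact hit
    ("def add(a, b):\n    return a + b", "def add(a, b):\n    return a + b"),
    # semantically similar but renamed: not a full reconstruction
    ("def mul(a, b):\n    return a * b", "def multiply(x, y):\n    return x * y"),
]
rate = reconstruction_rate(pairs)
```

The second pair also illustrates the abstract's qualitative finding: even when full reconstruction fails, structural information (here, the arithmetic operation used) still leaks through.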
Citations: 0
FlexInstru: A flexible instrumentation framework for tracing long-running native workloads
IF 4.1 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2025-12-08 | DOI: 10.1016/j.jss.2025.112739
Wenlong Mu, Ning Li, Zimo Ji, Jianmei Guo, Bo Huang
Understanding program runtime characteristics is crucial for tasks such as optimization and workload characterization. For long-running server-side workloads that execute as native binaries, effective profiling is essential to trace their complex runtime behaviors, enabling further optimizations that improve the reliability and efficiency of the delivered services. Widely adopted techniques for profiling these workloads include binary instrumentation and hardware-based profiling. Binary instrumentation is typically accurate but incurs high overhead and lacks flexibility for tracing long-running native workloads. Hardware-based profiling incurs low overhead but requires hardware support. To overcome these limitations, we present FlexInstru, a hardware-independent dynamic instrumentation framework based on a process attachment/detachment mechanism. FlexInstru can flexibly instrument a native application at any time and for any duration while the application is running, and achieves a good balance between instrumentation accuracy and overhead, which makes it particularly effective for tracing long-running native workloads.

FlexInstru provides a process attachment/detachment mechanism on Linux, allowing an instrumentation engine to be attached to a long-running native workload and detached at any time. To mitigate overhead, FlexInstru also enables flexible control of instrumentation through multiple attachments/detachments, allowing the workload to alternate between instrumented and native execution. Moreover, during instrumented execution, FlexInstru supports a sampling mechanism that collects data only during the sampling period, further reducing overhead. We evaluate FlexInstru on AArch64 and X86-64 using real-world workloads. For MySQL’s branch recording tasks, FlexInstru substantially reduces instrumentation overhead, with reductions of 415.60× on AArch64 and 1223.02× on X86-64 compared to traditional dynamic instrumentation, while maintaining sufficient accuracy.
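The attach/detach-plus-sampling design reduces overhead for a simple arithmetic reason: if full instrumentation slows execution by a factor S but the tool is attached for only a duty-cycle fraction d of wall time, the blended slowdown is d·S + (1 - d). A back-of-the-envelope sketch follows; the numbers are illustrative, not FlexInstru's measured figures.

```python
# Back-of-the-envelope model of why attach/detach plus sampling cuts overhead.
# The slowdown factor and duty cycle below are illustrative assumptions.

def effective_slowdown(full_slowdown: float, duty_cycle: float) -> float:
    """Blended slowdown when a fraction `duty_cycle` of execution is instrumented."""
    return duty_cycle * full_slowdown + (1.0 - duty_cycle)

# e.g. a hypothetical 1000x instrumentation slowdown applied 0.1% of the time
blended = effective_slowdown(1000.0, 0.001)
```

Under these assumed numbers the blended slowdown is roughly 2x instead of 1000x, which is the kind of reduction that makes instrumenting long-running services practical, at the cost of observing only a sample of the execution.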
{"title":"FlexInstru: A flexible instrumentation framework for tracing long-running native workloads","authors":"Wenlong Mu,&nbsp;Ning Li,&nbsp;Zimo Ji,&nbsp;Jianmei Guo,&nbsp;Bo Huang","doi":"10.1016/j.jss.2025.112739","DOIUrl":"10.1016/j.jss.2025.112739","url":null,"abstract":"<div><div>Understanding program runtime characteristics is crucial for tasks such as optimization and workload characterization. For long-running server-side workloads that execute as native binaries, effective profiling is essential to trace their complex runtime behaviors, enabling further optimizations to improve the reliability and efficiency of the delivered services. Widely adopted techniques for profiling these workloads include binary instrumentation and hardware-based profiling. Binary instrumentation is typically accurate but incurs high overhead and lacks flexibility for tracing long-running native workloads. Hardware-based profiling brings low overhead while requiring hardware support. To overcome these limitations, we present FlexInstru, a hardware-independent dynamic instrumentation framework based on the process attachment/detachment mechanism. FlexInstru can flexibly instrument a native application at any time and for any duration when the application is running, and achieve a good balance between instrumentation accuracy and overhead, which makes it particularly effective in tracing long-running native workloads.</div><div>FlexInstru provides a process attachment/detachment mechanism on Linux, allowing attaching an instrumentation engine to a long-running native workload and detaching it at any time. To mitigate overhead, FlexInstru also enables flexible control of instrumentation through multiple attachments/detachments, allowing the workload to alternate between instrumented execution and native execution. Moreover, during instrumented execution, FlexInstru supports a sampling mechanism to collect data only during the sampling period, further reducing the overhead. 
We evaluate FlexInstru on AArch64 and X86-64 using real-world workloads. For MySQL’s branch recording tasks, FlexInstru substantially reduces instrumentation overhead, with reductions of 415.60 ×  on AArch64 and 1223.02 ×  on X86-64 compared to traditional dynamic instrumentation, while maintaining sufficient accuracy.</div></div>","PeriodicalId":51099,"journal":{"name":"Journal of Systems and Software","volume":"234 ","pages":"Article 112739"},"PeriodicalIF":4.1,"publicationDate":"2025-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145738082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
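FlexInstru itself instruments native binaries through a Linux process attachment/detachment mechanism, which is not reproduced here. As a loose, language-level analogy to alternating between instrumented and native execution, the hypothetical Python sketch below toggles a profiling hook around phases of a workload and collects data only while the hook is "attached"; all names and the workload are invented:

```python
import sys

call_counts = {}

def profiler(frame, event, arg):
    # Count function entries only while the hook is installed.
    if event == "call":
        name = frame.f_code.co_name
        call_counts[name] = call_counts.get(name, 0) + 1

def step(i):
    return i * i

def workload(n):
    total = 0
    for i in range(n):
        total += step(i)
    return total

# Phase 1: "native" execution -- no hook installed, no profiling overhead.
workload(100)

# Phase 2: "attached" -- install the hook and collect data.
sys.setprofile(profiler)
workload(100)
sys.setprofile(None)   # "detach": back to native execution

# Phase 3: native again -- the collected counts stay frozen.
workload(100)

print(call_counts.get("step", 0))  # only phase-2 calls were observed
```

The counts reflect only the middle phase, mirroring the idea that instrumentation cost is paid solely while the engine is attached.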
EISM: an interactive and collaborative approach for software modularization
IF 4.1 2区 计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date : 2025-12-07 DOI: 10.1016/j.jss.2025.112726
Chenxing Zhong , Chao Li , He Zhang
To address the increasing complexity of modern software systems, modularization techniques are used to restructure a system into meaningful modules, helping relieve the burden of understanding the system (and thus build trust between different modules). However, many current modularization approaches suffer from two limitations due to restricted information. First, they often derive modules solely from structural and semantic dependencies in the source code, without sufficient attention to evolutionary information, so the resulting modules may be difficult to evolve independently. Second, although some previous researchers have explored integrating architect knowledge into the modularization process to address the issue of limited information, they often overlooked the possibility that this integration might compromise the quality of the final solutions.
To bridge these gaps, we propose EISM (Evolutionary dependencies-based Interactive Software Modularization), an approach to modularizing a tangled software system that takes advantage not only of algorithm efficiency but also of developers’ knowledge. Our approach enables effective collaboration between the modularization algorithm and developers: the former is responsible for (re-)optimizing solution quality via structural, semantic, and evolutionary dependencies, while the latter is responsible for adjusting the solutions for reasonability. To evaluate the effectiveness of our approach, we conducted a series of controlled experiments on five diverse open-source projects. The results show that EISM can improve the evolvability of software modules by at least 95 % compared to existing techniques, and effectively helps developers interactively adjust the solutions by proactively and significantly re-optimizing the adjusted solutions, with an average quality improvement of 12.87 %.
Citations: 0
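The evolutionary dependencies EISM mines are described only at a high level in the abstract. As an invented illustration of one common way to derive them, the sketch below computes co-change coupling (files that tend to appear in the same commits) from a toy commit log; all file names and commits are made up:

```python
from collections import Counter
from itertools import combinations

# Toy commit log: each commit lists the files it touched.
commits = [
    {"Order.java", "OrderDao.java"},
    {"Order.java", "OrderDao.java", "Invoice.java"},
    {"Invoice.java", "Pdf.java"},
    {"Order.java", "OrderDao.java"},
]

# Evolutionary dependency: how often a pair of files changes together.
co_change = Counter()
for files in commits:
    for a, b in combinations(sorted(files), 2):
        co_change[(a, b)] += 1

# Pairs with strong co-change are candidates for the same module.
strongest = co_change.most_common(1)[0]
print(strongest)
```

A modularization algorithm could feed such pair weights, alongside structural and semantic dependencies, into its clustering objective.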
Applying generative artificial intelligence for vulnerability fixing in a proprietary software ecosystem
IF 4.1 2区 计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date : 2025-12-07 DOI: 10.1016/j.jss.2025.112723
Luiz Alexandre Costa , Awdren Fontão , Rodrigo Pereira dos Santos , Alexander Serebrenik
Context: Large organizations often operate within proprietary software ecosystems (PSECO), composed of interdependent software artifacts, multiple actors, and centralized governance. In such environments, addressing security vulnerabilities is particularly challenging due to strict quality standards, regulatory constraints, and the risk of cascading failures. These challenges can compromise delivery schedules, increase operational risk, and force teams to interrupt planned work to address urgent issues. Goal: In response to these challenges, recent advances in Generative Artificial Intelligence (GenAI), particularly large language models (LLM), have emerged as a promising avenue to support software engineering tasks such as code generation and automated repair. This study investigates how a GenAI-based approach can support vulnerability remediation in PSECO while preserving delivery cadence in Continuous Integration and Continuous Delivery (CI/CD) pipelines. Methods: We conducted a participative case study in a large global organization to design and evaluate PSECO-SafePatch, an approach composed of: (i) a structured remediation process tailored to enterprise constraints; and (ii) a web-based tool integrating Fortify static analysis with GenAI-based patch generation. The approach includes human-in-the-loop validation and aligns with DevOps practices. Results: PSECO-SafePatch reduced mean remediation time by 84 % with an 89 % patch success rate. It also reduced cognitive overload by guiding developers through structured validation, fostering trust without overreliance. Conclusion: The findings show that GenAI-supported remediation is feasible and effective in PSECO. Human-in-the-loop validation preserved critical thinking, addressing concerns about blind automation and knowledge erosion, and reinforcing the value of responsible automation at scale.
Citations: 0
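PSECO-SafePatch itself is proprietary and its API is not public. The sketch below is a hypothetical skeleton of the human-in-the-loop flow the abstract describes, with a stubbed patch generator standing in for the GenAI call and an explicit reviewer gate before any patch is accepted; all names, the `Finding` fields, and the md5-to-sha256 fix are invented:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Finding:
    file: str
    rule: str      # e.g. a static-analysis rule identifier
    snippet: str   # the flagged code

def generate_patch(finding: Finding) -> str:
    # Stub: a real pipeline would prompt an LLM with the finding here.
    return finding.snippet.replace("md5", "sha256")

def remediate(finding: Finding, approve: Callable[[str], bool]) -> Optional[str]:
    patch = generate_patch(finding)
    # Human-in-the-loop gate: nothing ships without reviewer sign-off.
    return patch if approve(patch) else None

finding = Finding("auth.py", "WeakHash", "digest = hashlib.md5(pw).hexdigest()")
print(remediate(finding, approve=lambda p: "sha256" in p))
```

Keeping the reviewer decision as an explicit callback makes the gate testable and prevents silent auto-application of generated patches.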
Exploring the potential and limitations of large language models for novice program fault localization
IF 4.1 2区 计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date : 2025-12-06 DOI: 10.1016/j.jss.2025.112731
Hexiang Xu , Hengyuan Liu , Yonghao Wu , Xiaolan Kang , Xiang Chen , Yong Liu
Novice programmers often face challenges in fault localization due to their limited experience and understanding of programming syntax and logic. Traditional methods like Spectrum-Based Fault Localization (SBFL) and Mutation-Based Fault Localization (MBFL) help identify faults but often lack the ability to understand code context, making them less effective for beginners. In recent years, Large Language Models (LLMs) have shown promise in overcoming these limitations by utilizing their ability to understand program syntax and semantics. LLM-based fault localization provides more accurate and context-aware results than traditional techniques. This study evaluates six closed-source and seven open-source LLMs using the Codeflaws, Condefects, and BugT datasets, with BugT being a newly constructed dataset specifically designed to mitigate data leakage concerns. Advanced models with reasoning capabilities, such as OpenAI o3 and DeepSeek-R1, achieve superior accuracy with minimal reliance on prompt engineering. In contrast, models without reasoning capabilities, like GPT-4, require carefully designed prompts to maintain performance. While LLMs perform well in simple fault localization, their accuracy decreases as problem difficulty increases, though top models maintain robust performance on the BugT dataset. Over-reasoning is another challenge: some models generate excessive explanations that obscure the localization result. Additionally, the computational cost of deploying LLMs remains a significant barrier to real-time debugging. The LLMs’ explanations prove valuable for assisting novice programmers, with participants who had one year of experience consistently rating them highly. Our findings demonstrate the potential of LLMs to improve debugging efficiency while stressing the need to further refine their reasoning and computational efficiency for practical adoption.
Citations: 0
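The study’s evaluation pipeline is not included in the abstract. Fault-localization results like these are commonly scored with Top-N accuracy, where a program counts as a hit when its true faulty line appears among the first N ranked suspects; the illustrative sketch below computes it over invented data:

```python
def top_n_accuracy(predictions, faults, n):
    """Fraction of programs whose true faulty line appears among the
    first n ranked suspect lines returned for that program."""
    hits = sum(1 for pid, ranked in predictions.items()
               if faults[pid] in ranked[:n])
    return hits / len(predictions)

# Invented example: ranked suspect lines per buggy program.
predictions = {
    "p1": [12, 7, 30],   # fault at line 12 -> Top-1 hit
    "p2": [4, 19, 2],    # fault at line 19 -> Top-3 hit only
    "p3": [8, 1, 5],     # fault at line 40 -> miss
}
faults = {"p1": 12, "p2": 19, "p3": 40}

print(top_n_accuracy(predictions, faults, 1))
print(top_n_accuracy(predictions, faults, 3))
```

Reporting several N values (Top-1, Top-3, Top-5) shows how much of a model’s accuracy depends on letting developers inspect more than one suspect line.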