xPriMES: Explainable reinforcement learning-guided mutation strategy with dual-environment interaction for evading black-box malware detectors
Pub Date: 2026-01-05 | DOI: 10.1016/j.infsof.2026.108019
Phan The Duy, Nguyen Manh Cuong, Ha Trieu Yen Vy, Le Tuan Luong, Nguyen Tran Duc Anh, Nghi Hoang Khoa, Van-Hau Pham
Malware continues to evolve, exposing weaknesses in conventional detectors and motivating realistic adversarial evaluations. Prior RL-based evasion methods often rely on partial model access or feature-level perturbations, limiting realism under strict black-box constraints. We propose xPriMES, a dual-environment reinforcement learning framework that generates functionality-preserving binary mutations for malware evasion in black-box settings. A LightGBM surrogate provides continuous confidence feedback for dense reward shaping, while the real target detector supplies binary feedback — used both for episode termination and for issuing the final reward — ensuring learning remains grounded in real evasion outcomes. The agent employs Thompson sampling and SHAP-guided prioritized replay to focus exploration on feature-relevant mutations and accelerate convergence. Experiments on multiple static detectors (LightGBM, RF+CNN, MalConv, CNN, KNN) demonstrate up to 97.4% evasion success, surpassing PSP-Mal under equivalent conditions. Further tests on VirusTotal confirm the transferability and real-world impact of the adversarial samples. These findings show that integrating explainable guidance with surrogate-assisted RL yields interpretable and effective black-box evasion while preserving functionality. We conclude with implications for defensive hardening and discuss limitations related to surrogate fidelity and the focus on static detection.
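A minimal Python sketch of the dual-environment loop the abstract describes, under stated assumptions: the action names and the helpers mutate, surrogate_score, and target_detects are placeholders (not the paper's API), the surrogate's confidence drop serves as the dense shaping reward, the black-box verdict ends the episode with a final bonus, and the SHAP-guided prioritized replay is omitted for brevity.

```python
import numpy as np

# Illustrative mutation actions; the paper's action set may differ.
ACTIONS = ["append_overlay", "add_section", "pack_section", "rename_section"]
rng = np.random.default_rng(0)
successes = np.zeros(len(ACTIONS))
failures = np.zeros(len(ACTIONS))

def thompson_pick():
    # Draw an evasion-success probability per action from its Beta posterior
    # and play the arm with the highest draw.
    return int(np.argmax(rng.beta(successes + 1, failures + 1)))

def episode(binary, mutate, surrogate_score, target_detects, max_steps=10):
    prev = surrogate_score(binary)            # surrogate confidence in [0, 1]
    for _ in range(max_steps):
        a = thompson_pick()
        binary = mutate(binary, ACTIONS[a])   # functionality-preserving edit
        score = surrogate_score(binary)
        dense = prev - score                  # dense reward: confidence drop
        prev = score
        if not target_detects(binary):        # black-box verdict ends episode
            successes[a] += 1
            return binary, dense + 10.0       # final evasion bonus (assumed)
        failures[a] += 1
    return binary, 0.0
```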
{"title":"xPriMES: Explainable reinforcement learning-guided mutation strategy with dual-environment interaction for evading black-box malware detectors","authors":"Phan The Duy, Nguyen Manh Cuong, Ha Trieu Yen Vy, Le Tuan Luong, Nguyen Tran Duc Anh, Nghi Hoang Khoa, Van-Hau Pham","doi":"10.1016/j.infsof.2026.108019","DOIUrl":"10.1016/j.infsof.2026.108019","url":null,"abstract":"<div><div>Malware continues to evolve, exposing weaknesses in conventional detectors and motivating realistic adversarial evaluations. Prior RL-based evasion methods often rely on partial model access or feature-level perturbations, limiting realism under strict black-box constraints. We propose xPriMES, a dual-environment reinforcement learning framework that generates functionality-preserving binary mutations for malware evasion in black-box settings. A LightGBM surrogate provides continuous confidence feedback for dense reward shaping, while the real target detector supplies binary feedback — used both for episode termination and for issuing the final reward — ensuring learning remains grounded in real evasion outcomes. The agent employs Thompson sampling and SHAP-guided prioritized replay to focus exploration on feature-relevant mutations and accelerate convergence. Experiments on multiple static detectors (LightGBM, RF+CNN, MalConv, CNN, KNN) demonstrate up to 97.4% evasion success, surpassing PSP-Mal under equivalent conditions. Further tests on VirusTotal confirm the transferability and real-world impact of the adversarial samples. These findings show that integrating explainable guidance with surrogate-assisted RL yields interpretable and effective black-box evasion while preserving functionality. We conclude with implications for defensive hardening and discuss limitations related to surrogate fidelity and the focus on static detection.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108019"},"PeriodicalIF":4.3,"publicationDate":"2026-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145928208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Previous studies on Cross-Project Software Vulnerability Detection (CSVD) have shown that leveraging a small number of labeled modules from the target project can enhance CSVD performance. However, how to systematically select representative modules for labeling has received little attention. In addition, program modules can be measured using either expert or semantic metrics, and whether considering both types of metrics simultaneously helps in selecting representative modules remains underexplored.
Objective:
To address these challenges, we introduce a novel approach, CSVD-AES, which fuses expert and semantic metrics and employs active learning to select the most representative modules for labeling.
Methods:
CSVD-AES consists of three phases: the code representation phase, the active learning phase, and the model construction phase. In the code representation phase, a self-attention mechanism is used to fuse the metrics; this fusion helps active learning identify representative modules. In the active learning phase, an uncertainty sampling strategy is employed to select the most representative modules for labeling. Since selecting modules can exacerbate class imbalance among the labeled modules, a sampling balancing strategy is also employed during this phase. In the model construction phase, the weighted cross-entropy (WCE) loss function is applied to address the remaining class imbalance in the labeled modules.
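An illustrative sketch (not the authors' code) of the two ingredients named in the Methods: uncertainty sampling picks the unlabeled modules whose predicted probability is closest to 0.5, and a weighted cross-entropy loss up-weights the rare vulnerable class. All names and the inverse-frequency weight are assumptions for the example.

```python
import numpy as np

def uncertainty_sample(probs, budget):
    """Return indices of the `budget` most uncertain unlabeled modules."""
    margin = np.abs(probs - 0.5)      # distance from the decision boundary
    return np.argsort(margin)[:budget]

def weighted_cross_entropy(probs, labels, pos_weight):
    """WCE: the positive (vulnerable) class is up-weighted by `pos_weight`."""
    eps = 1e-12
    loss = -(pos_weight * labels * np.log(probs + eps)
             + (1 - labels) * np.log(1 - probs + eps))
    return loss.mean()

# Usage: query the 20 most uncertain target-project modules for labeling;
# weight the positive class by (roughly) its inverse frequency.
probs = np.random.default_rng(0).uniform(size=200)  # stand-in model outputs
query = uncertainty_sample(probs, budget=20)
labels = np.zeros(20)
labels[:3] = 1                                      # imbalanced labels
w = labels.size / (2 * labels.sum())
print(weighted_cross_entropy(probs[query], labels, pos_weight=w))
```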
Results:
CSVD-AES is evaluated through a comprehensive study on four real-world projects. The results demonstrate that CSVD-AES outperforms five state-of-the-art baselines, achieving AUC improvements ranging from 4.0% to 24.4%. A series of ablation experiments confirms the soundness of CSVD-AES's component settings.
Conclusion:
CSVD-AES effectively addresses the challenges in the field of CSVD by combining active learning and metric fusion, significantly advancing the development of this field.
{"title":"CSVD-AES: Cross-project software vulnerability detection based on active learning with metric fusion","authors":"Zhidan Yuan , Xiang Chen , Juan Zhang , Weiming Zeng","doi":"10.1016/j.infsof.2026.108015","DOIUrl":"10.1016/j.infsof.2026.108015","url":null,"abstract":"<div><h3>Context:</h3><div>Previous studies on Cross-Project Software Vulnerability Detection (CSVD) have shown that leveraging a small number of labeled modules from the target project can enhance the performance of CSVD. However, how to systematically select representative modules for labeling has not received sufficient attention. In addition, program modules can be measured using either expert or semantic metrics. There has been insufficient attention given to whether considering both metrics simultaneously helps in selecting representative modules.</div></div><div><h3>Objective:</h3><div>To address these challenges, we introduce a novel approach CSVD-AES. This method aims to fuse expert and semantic metrics and employs the active learning to select the most representative modules for labeling.</div></div><div><h3>Methods:</h3><div>CSVD-AES consists of three phases: the code representation phase, the active learning phase, and the model construction phase. In the code representation phase, a self-attention mechanism is used to fuse the metrics. In the active learning phase, an uncertainty sampling strategy is employed to select the most representative modules for labeling. In the model construction phase, the weighted cross-entropy (WCE) loss function is applied to address the class imbalance issue in the labeled modules. The metric fusion helps active learning identify representative modules. Since selecting modules can exacerbate the class imbalance issue in the labeled modules, we employ a sampling balancing strategy during the active learning phase to address this problem.</div></div><div><h3>Results:</h3><div>CSVD-AES is evaluated through a comprehensive study on four real-world projects. The results demonstrate that CSVD-AES outperforms five state-of-the-art baselines, achieving AUC improvements ranging from 4.0% to 24.4%. A series of ablation experiments verify the rationality of the CSVD-AES component settings.</div></div><div><h3>Conclusion:</h3><div>CSVD-AES effectively addresses the challenges in the field of CSVD by combining active learning and metric fusion, significantly advancing the development of this field.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108015"},"PeriodicalIF":4.3,"publicationDate":"2026-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145928338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic multi-language analysis of SOLID compliance via machine learning algorithms
Pub Date: 2026-01-05 | DOI: 10.1016/j.infsof.2026.108013
Caner Balim, Naim Karasekreter, Özkan Aslan
Context
The SOLID design principles are fundamental in object-oriented software development, promoting modularity, maintainability, and scalability. Manual verification of these principles in code is often time-consuming and error-prone, especially in large-scale, multilingual projects. Since adherence to SOLID principles is closely linked to software quality, automating this verification can significantly enhance code reliability.
Objectives
This study proposes a machine learning-based approach for the automatic classification of SOLID principle compliance in object-oriented code. Specifically, we investigate the effectiveness of embedding representations generated by three pretrained transformer models (LongCoder and StarCoder2, both code-oriented, and BigBird, a general-purpose model) in supporting principle-specific classification across Java and Python codebases.
Methods
We compiled a novel multi-label dataset consisting of 1103 real-world multi-class code units in Java and Python, annotated for compliance with five SOLID principles. Feature embeddings were extracted using the three transformer models. These embeddings were input to six different classifiers per principle. We evaluated model performance using stratified 5-fold cross-validation and reported accuracy, precision, recall, and F1 scores.
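A minimal sketch of the evaluation protocol described above: a per-principle binary classifier trained on precomputed transformer embeddings and scored with stratified 5-fold cross-validation. The embedding matrix X is assumed to be extracted beforehand (e.g. pooled StarCoder2 hidden states); the dimensions and MLP settings are illustrative, not the paper's configuration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(42)
X = rng.normal(size=(1103, 768))     # stand-in for precomputed embeddings
y = rng.integers(0, 2, size=1103)    # stand-in labels for one principle

scores = []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for train, test in cv.split(X, y):
    clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300,
                        random_state=42)
    clf.fit(X[train], y[train])
    scores.append(f1_score(y[test], clf.predict(X[test])))
print(f"mean F1 over 5 folds: {np.mean(scores):.3f}")
```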
Results
Principles with well-defined structural characteristics, such as Interface Segregation (ISP) and Dependency Inversion (DIP), achieved high F1 scores (>90%). Semantically complex principles like Single Responsibility (SRP) and Liskov Substitution (LSP) yielded lower F1 scores (∼70–75%). Among the models, StarCoder2 combined with Multi-Layer Perceptron (MLP) consistently outperformed others across both Java and Python datasets. Statistical analyses confirmed that these performance differences are significant. Furthermore, comparisons with open-source large language models (DeepSeek-Coder-V2 and CodeLlama) demonstrated that the approach yields more stable and interpretable results across all principles.
Conclusion
Machine learning models leveraging code-specific embeddings can accurately identify structurally explicit SOLID principles. Code-oriented transformers such as StarCoder2 and LongCoder outperformed the general-purpose model BigBird, especially for principles requiring nuanced semantic understanding. Beyond its experimental contributions, the study provides practical value by enabling automated design-principle assessment in large codebases, reducing manual inspection effort, and offering a foundation for integration into software quality assurance tools and continuous integration pipelines.
{"title":"Automatic multi-language analysis of SOLID compliance via machine learning algorithms","authors":"Caner Balim , Naim Karasekreter , Özkan Aslan","doi":"10.1016/j.infsof.2026.108013","DOIUrl":"10.1016/j.infsof.2026.108013","url":null,"abstract":"<div><h3>Context</h3><div>The SOLID design principles are fundamental in object-oriented software development, promoting modularity, maintainability, and scalability. Manual verification of these principles in code is often time-consuming and error-prone, especially in large-scale, multilingual projects. Since adherence to SOLID principles is closely linked to software quality, automating this verification can significantly enhance code reliability.</div></div><div><h3>Objectives</h3><div>This study proposes a machine learning-based approach for the automatic classification of SOLID principle compliance in object-oriented code. Specifically, we investigate the effectiveness of embedding representations generated by three pretrained transformer models: LongCoder and StarCoder2, which are both code-oriented, and BigBird, a general-purpose model, in supporting principle-specific classification across Java and Python codebases.</div></div><div><h3>Methods</h3><div>We compiled a novel multi-label dataset consisting of 1103 real-world multi-class code units in Java and Python, annotated for compliance with five SOLID principles. Feature embeddings were extracted using the three transformer models. These embeddings were input to six different classifiers per principle. We evaluated model performance using stratified 5-fold cross-validation and reported accuracy, precision, recall, and F1 scores.</div></div><div><h3>Results</h3><div>Principles with well-defined structural characteristics, such as Interface Segregation (ISP) and Dependency Inversion (DIP), achieved high F1 scores (>90%). Semantically complex principles like Single Responsibility (SRP) and Liskov Substitution (LSP) yielded lower F1 scores (∼70–75%). Among the models, StarCoder2 combined with Multi-Layer Perceptron (MLP) consistently outperformed others across both Java and Python datasets. Statistical analyses confirmed that these performance differences are significant. Furthermore, comparisons with open-source large language models (DeepSeek-Coder-V2 and CodeLlama) demonstrated that the approach yields more stable and interpretable results across all principles.</div></div><div><h3>Conclusion</h3><div>Machine learning models leveraging code-specific embeddings can accurately identify structurally explicit SOLID principles. Code-oriented transformers such as StarCoder2 and LongCoder outperformed the general-purpose model BigBird, especially for principles requiring nuanced semantic understanding. 
Beyond its experimental contributions, the study provides practical value by enabling automated design-principle assessment in large codebases, reducing manual inspection effort, and offering a foundation for integration into software quality assurance tools and continuous integration pipelines.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108013"},"PeriodicalIF":4.3,"publicationDate":"2026-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145928206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A study on functionality validation for Windows malware mutating using reinforcement learning
Pub Date: 2026-01-02 | DOI: 10.1016/j.infsof.2025.108008
Do Thi Thu Hien, Le Viet Tai Man, Le Trong Nhan, Phan Ngoc Yen Nhi, Hoang Thanh Lam, Nguyen Tan Cam, Van-Hau Pham
Context:
To keep pace with the rapid advancements in both the quality and complexity of malware, recent research has extensively employed machine learning (ML) and deep learning (DL) models to detect malicious software, particularly in the widely used Windows system. Despite demonstrating promising accuracy in identifying malware, these models remain vulnerable to adversarial attacks, where carefully modified malware samples can bypass detection. Consequently, there is a growing need to generate mutated malware by altering existing samples to comprehensively assess the robustness of ML/DL-based detectors. Unlike in the field of computer vision, functionality validation plays a crucial role in evaluating the effectiveness of these modified malware samples. Even if they achieve high evasion rates, any corruption in file format or execution can make them ineffective.
Objective:
To address this, we treat functionality validation as essential when creating adversarial malware samples, and we design validators that can be used in reinforcement learning-based Windows malware mutation. Our focus is on workable, useful adversarial samples rather than sheer quantity.
Method:
Two functionality validation methods are proposed, leveraging the static and dynamic analysis of PE files to capture representations of their behavior and verify that the designed functionality is preserved. These validators are then integrated into the RL framework to help the agent recognize actions that produce broken samples.
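A hedged sketch of how such validators can gate the RL reward: a mutated PE earns the evasion reward only if it still passes validation, and a sample-breaking action is penalized so the agent learns to avoid it. The validator internals are placeholders standing in for the paper's static and dynamic analyses.

```python
def static_validator(sample) -> bool:
    # Placeholder: e.g. re-parse the PE header and section table.
    return sample.get("parses", True)

def dynamic_validator(sample) -> bool:
    # Placeholder: e.g. execute in a sandbox and compare behavior traces.
    return sample.get("runs", True)

def shaped_reward(sample, evaded: bool) -> float:
    # Broken sample: penalize regardless of the detector's verdict.
    if not (static_validator(sample) and dynamic_validator(sample)):
        return -5.0
    # Otherwise reward evasion; small step cost while still detected.
    return 10.0 if evaded else -0.1

# Usage: shaped_reward({"parses": True, "runs": False}, evaded=True) -> -5.0
```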
Results:
Whether static or dynamic analysis is employed for validation, the experimental results confirm that the proposed methods maintain the original behavior of the malware while enhancing its ability to evade ML-based detectors. Compared to other approaches, although the number of generated adversarial samples drops under stricter validation, a higher proportion of them are confirmed to preserve functionality.
Conclusions:
Functionality validation is an essential task in creating Windows malware mutants to ensure their reliability and usability in further assessment scenarios or real-life attacks.
{"title":"A study on functionality validation for windows malware mutating using reinforcement learning","authors":"Do Thi Thu Hien , Le Viet Tai Man , Le Trong Nhan , Phan Ngoc Yen Nhi , Hoang Thanh Lam , Nguyen Tan Cam , Van-Hau Pham","doi":"10.1016/j.infsof.2025.108008","DOIUrl":"10.1016/j.infsof.2025.108008","url":null,"abstract":"<div><h3>Context:</h3><div>To keep pace with the rapid advancements in both the quality and complexity of malware, recent research has extensively employed machine learning (ML) and deep learning (DL) models to detect malicious software, particularly in the widely used Windows system. Despite demonstrating promising accuracy in identifying malware, these models remain vulnerable to adversarial attacks, where carefully modified malware samples can bypass detection. Consequently, there is a growing need to generate mutated malware by altering existing samples to comprehensively assess the robustness of ML/DL-based detectors. Unlike in the field of computer vision, functionality validation plays a crucial role in evaluating the effectiveness of these modified malware samples. Even if they achieve high evasion rates, any corruption in file format or execution can make them ineffective.</div></div><div><h3>Objective:</h3><div>To address this, we consider the essentials of functionality validation in creating malware samples by designing validators that can be used in reinforcement learning-based Windows malware mutation. Our focus is on workable and useful adversarial samples rather than the quantity.</div></div><div><h3>Method:</h3><div>Two different functionality validation methods are proposed, leveraging the static and dynamic analysis processes of PE files to capture the representation of their behaviors to verify the preservation of designed functionalities. They are then integrated into the RL framework to support the agent in recognizing actions that can cause broken samples.</div></div><div><h3>Results:</h3><div>Whether employing static or dynamic analysis for validation, the experimental results confirm that the proposed methods successfully maintain the original behavior of malware while enhancing its ability to evade ML-based detectors. Compared to other approaches, although the number of created adversarial malware drops due to stricter validation, a higher ratio of them are confirmed functionality-preserved.</div></div><div><h3>Conclusions:</h3><div>Functionality validation is an essential task in creating Windows malware mutants to ensure their reliability and usability in further assessment scenarios or real-life attacks.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108008"},"PeriodicalIF":4.3,"publicationDate":"2026-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145928335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gated transformer network for multivariate security patch identification with mixture-of-experts
Pub Date: 2025-12-31 | DOI: 10.1016/j.infsof.2025.108006
Jiajun Tong, Zhixiao Wang, Xiaobin Rui
Context:
Security patch identification is an important task in continuous integration and deployment, helping software developers detect security issues and code vulnerabilities. Recent studies have confirmed that using both commit message and code diff information is beneficial to identification performance. However, existing works still suffer from poor model representation ability and low robustness, both of which degrade the quality of commit representations and, in turn, identification performance.
Objective:
We propose a gated transformer network for multivariate security patch identification with mixture-of-experts.
Method:
To improve the representation capability of the model and the quality of the commit representations, we provide a bi-encoder that uses prior knowledge to enhance distinctive features of the commit message and the code diff, respectively. To improve the robustness of the model and further raise the quality of the commit representations, we design a gated layer that learns the weight of each expert and dynamically assigns weights to different features.
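A minimal PyTorch sketch of the gating idea (an assumed architecture for illustration, not the released code): two experts embed the commit message and the code diff, and a learned gate assigns each expert a weight before fusing them for classification.

```python
import torch
import torch.nn as nn

class GatedBiEncoder(nn.Module):
    def __init__(self, dim=768):
        super().__init__()
        # Linear layers stand in for the two pretrained expert encoders.
        self.msg_expert = nn.Linear(dim, dim)
        self.diff_expert = nn.Linear(dim, dim)
        self.gate = nn.Linear(2 * dim, 2)      # one weight per expert
        self.classifier = nn.Linear(dim, 2)    # patch / non-patch

    def forward(self, msg_emb, diff_emb):
        m = torch.tanh(self.msg_expert(msg_emb))
        d = torch.tanh(self.diff_expert(diff_emb))
        # Gate sees both experts and outputs a softmax weighting over them.
        w = torch.softmax(self.gate(torch.cat([m, d], dim=-1)), dim=-1)
        fused = w[..., :1] * m + w[..., 1:] * d  # weighted expert mixture
        return self.classifier(fused)

# Usage: logits = GatedBiEncoder()(torch.randn(8, 768), torch.randn(8, 768))
```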
Results:
Extensive experiments show that our framework effectively improves model representation ability and robustness, yielding high-quality commit representations and achieving state-of-the-art performance.
Conclusion:
Our approach uses a bi-encoder to obtain an embedding of each feature from two experts and then exploits the differences between them by assigning weights through the gated layer. This improves both the representation ability and the robustness of the model, giving it favorable applicability in real-world scenarios. The code and data are shared at https://github.com/AppleMax1992/ensemble_commit.
{"title":"Gated transformer network for multivariate security patch identification with mixture-of-experts","authors":"Jiajun Tong , Zhixiao Wang , Xiaobin Rui","doi":"10.1016/j.infsof.2025.108006","DOIUrl":"10.1016/j.infsof.2025.108006","url":null,"abstract":"<div><h3>Context:</h3><div>Security patch identification is an important task in continuous integration and deployment, which helps software developers detect security issues and code vulnerabilities. Recent studies have confirmed that using both commit message and code diff information are beneficial to identification performance. However, existing works still face the problems of poor model representation ability and low model robustness, both of which affect the quality of commit representation, resulting in bad identification performance.</div></div><div><h3>Objective:</h3><div>We propose a gated transformer network for multivariate security patch identification with mixture-of-experts.</div></div><div><h3>Method:</h3><div>To improve the representation capability of the model and the quality of the commit representations, we provided a bi-encoder to utilize prior knowledge to enhance distinctive features for commit message and code diff respectively. To improve the robustness of the model and further improve the quality of commit representations, we designed a gated layer to learn the weight of each expert, and dynamically assign weights to different features.</div></div><div><h3>Results:</h3><div>Extensive experiments show that our framework has effectively improved the model representation ability, and the robustness of the model, providing high-quality commit representations, and achieves the state-of-the-art performance.</div></div><div><h3>Conclusion:</h3><div>Our approach provides a bi-encoder to obtain the embedding of each feature by two experts, and then explore the difference between them, by setting different weights through the gated layer. It not only improves the model representation ability but also improves the robustness of the model, thus having favorable applicability in real-world scenarios. The code and data are shared in <span><span>https://github.com/AppleMax1992/ensemble_commit</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108006"},"PeriodicalIF":4.3,"publicationDate":"2025-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145891160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
VulSEG: Enhanced graph-based vulnerability detection system with advanced text embedding
Pub Date: 2025-12-31 | DOI: 10.1016/j.infsof.2025.108007
Wenjing Cai, Xin Liu, Lipeng Gao
In the field of software security, the detection of vulnerabilities in source code has become increasingly important. Traditional methods based on feature engineering and statistical models are inefficient when dealing with complex code structures and large-scale data, while deep learning approaches have shown significant potential. Many detection methods involve converting source code into images for analysis. Although scalable, convolutional neural networks often fail to fully comprehend the complex structure and semantic relationships in the code, resulting in inadequate capture of high-level semantic features, which affects detection accuracy. This study introduces an innovative vulnerability detection framework, VulSEG, which significantly improves detection accuracy while maintaining high scalability. We combine the Program Dependence Graph (PDG), Control Flow Graph (CFG), and Context Dependency Graph (CDG) to create a context-enhanced graph representation. Additionally, we develop a composite feature encoding strategy that integrates Abstract Syntax Tree (AST) encoding with deep semantic security coding (Word2Vec + Complexity- and Security-Weighted TF-IDF, CSW-TF-IDF) to enhance the understanding of code complexity and the accuracy of predicting potential vulnerabilities. By incorporating the Text Convolutional Neural Network (TextCNN) and Bidirectional Long Short-Term Memory (BiLSTM) models, we further enhance feature extraction and long-sequence dependency handling capabilities. The experimental results show that, compared to state-of-the-art methods, our approach improves accuracy by 11.8%.
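A rough sketch of the security-weighted TF-IDF idea: standard TF-IDF token weights are rescaled by hand-assigned complexity/security weights for risky tokens. The weight table and the multiplicative combination are assumptions for illustration; the paper's exact CSW-TF-IDF formula may differ.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Assumed per-token security weights; unlisted tokens default to 1.0.
SECURITY_WEIGHTS = {"strcpy": 3.0, "memcpy": 2.5, "malloc": 2.0}

code_units = ["strcpy ( dst , src )",
              "memcpy ( dst , src , n )",
              "x = x + 1"]
vec = TfidfVectorizer(token_pattern=r"\w+")
X = vec.fit_transform(code_units).toarray()

# Rescale each TF-IDF column by its token's security weight.
w = np.array([SECURITY_WEIGHTS.get(t, 1.0)
              for t in vec.get_feature_names_out()])
X_csw = X * w
print(X_csw.shape)   # (3 code units, vocabulary size)
```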
{"title":"VulSEG: Enhanced graph-based vulnerability detection system with advanced text embedding","authors":"Wenjing Cai , Xin Liu , Lipeng Gao","doi":"10.1016/j.infsof.2025.108007","DOIUrl":"10.1016/j.infsof.2025.108007","url":null,"abstract":"<div><div>In the field of software security, the detection of vulnerabilities in source code has become increasingly important. Traditional methods based on feature engineering and statistical models are inefficient when dealing with complex code structures and large-scale data, while deep learning approaches have shown significant potential. Many detection methods involve converting source code into images for analysis. Although scalable, convolutional neural networks often fail to fully comprehend the complex structure and semantic relationships in the code, resulting in inadequate capture of high-level semantic features, which affects the accuracy of detection. This study introduces an innovative vulnerability detection framework, <em>VulSEG</em>, which significantly improves detection accuracy while maintaining high scalability. We combine the <em>Program Dependence Graph (PDG)</em>, <em>Control Flow Graph (CFG)</em>, and <em>Context Dependency Graph (CDG)</em> to create a context-enhanced graph representation. Additionally, we develop a composite feature encoding strategy that integrates <em>Syntax Tree (AST)</em> encoding with deep semantic security coding <em>(Word2Vec + Complexity- and Security-Weighted TF-IDF, CSW-TF-IDF)</em> to enhance the understanding of code complexity and the accuracy of predicting potential vulnerabilities. By incorporating the <em>Text Convolutional Neural Network (TextCNN)</em> and <em>Bidirectional Long Short-Term Memory (BiLSTM)</em> models, we further enhance feature extraction and long-sequence dependency handling capabilities. The experimental results show that, compared to state-of-the-art methods, our approach improves accuracy by 11.8%.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"192 ","pages":"Article 108007"},"PeriodicalIF":4.3,"publicationDate":"2025-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145928209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Compositional security analysis of dynamic component-based systems
Pub Date: 2025-12-26 | DOI: 10.1016/j.infsof.2025.108002
Charilaos Skandylas, Narges Khakpour
Context:
To reason about and enforce security in dynamic software systems, automated analysis and verification approaches are required. However, such approaches often encounter scalability issues, particularly when employed for runtime analysis, which is necessary in software systems with dynamically changing architectures, such as self-adaptive systems.
Objective:
In this work, we propose an automated formal approach for security analysis of component-based systems with dynamic architectures.
Methods:
This approach leverages formal abstraction and incremental analysis techniques to reduce the complexity of runtime analysis. We have implemented and evaluated our approach against ZNN, a widely known self-adaptive system exemplar.
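A hypothetical sketch of the incremental strategy: cache per-component verification verdicts and, after a runtime architectural change, re-verify only the changed components and their direct neighbors. This illustrates the general caching idea only, not the paper's formalism.

```python
cache = {}  # component name -> cached verification verdict

def verify(component, architecture):
    # Placeholder for the expensive per-component formal analysis.
    return True

def incremental_check(architecture, changed):
    """architecture: dict mapping component -> tuple of connected components."""
    # Invalidate the changed components and their direct neighbors.
    dirty = set(changed)
    for c in changed:
        dirty |= set(architecture.get(c, ()))
    for c in dirty:
        cache.pop(c, None)
    # Re-verify only components with no cached verdict.
    for c in architecture:
        if c not in cache:
            cache[c] = verify(c, architecture)
    return all(cache.values())

# Usage: incremental_check({"lb": ("srv1",), "srv1": (), "srv2": ()}, ["srv1"])
```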
Results:
Compared to the state of the art, our results demonstrate an improvement both in the size of the systems that can be analyzed and in the time required to complete the analysis. In particular, our incremental analysis is well suited to systems that alter their architectures at runtime.
Conclusion:
Therefore, this approach is suitable for analyzing the security of dynamic component-based systems both statically and at runtime.
{"title":"Compositional security analysis of dynamic component-based systems","authors":"Charilaos Skandylas , Narges Khakpour","doi":"10.1016/j.infsof.2025.108002","DOIUrl":"10.1016/j.infsof.2025.108002","url":null,"abstract":"<div><h3>Context:</h3><div>To reason about and enforce security in dynamic software systems, automated analysis and verification approaches are required. However, such approaches often encounter scalability issues, particularly when employed for runtime analysis, which is necessary in software systems with dynamically changing architectures, such as self-adaptive systems.</div></div><div><h3>Objective:</h3><div>In this work, we propose an automated formal approach for security analysis of component-based systems with dynamic architectures.</div></div><div><h3>Methods:</h3><div>This approach leverages formal abstraction and incremental analysis techniques to reduce the complexity of runtime analysis. We have implemented and evaluated our approach against ZNN, a widely known self-adaptive system exemplar.</div></div><div><h3>Results:</h3><div>Compared to the state of the art, our results demonstrate an improvement both in the size of systems that can be analyzed and at the time required to complete the analysis. In particular, our incremental analysis is well suited for systems that alter their architectures at runtime.</div></div><div><h3>Conclusion:</h3><div>Therefore, this approach is suitable for analyzing the security dynamic component based both statically and at runtime.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"191 ","pages":"Article 108002"},"PeriodicalIF":4.3,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145884035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maximizing quantum hardware utilization via multiprogramming circuits and shot-wise distribution
Pub Date: 2025-12-20 | DOI: 10.1016/j.infsof.2025.108005
Giuseppe Bisicchia, Jaime Alvarado-Valiente, Javier Romero-Álvarez, Jose Garcia-Alonso, Juan M. Murillo, Antonio Brogi
Context:
Quantum computing is rapidly evolving, offering new opportunities for solving problems in optimization, cryptography, and simulation. However, the limited availability of quantum resources makes efficient utilization of quantum hardware a current challenge. Today’s paradigms often lead to under-utilization of qubits, increased costs, and execution delays, especially in the NISQ era.
Objective:
This work aims to improve the utilization of quantum hardware by introducing an execution model that integrates multiprogramming at the circuit level with quantum shot-wise distribution in a single policy-driven pipeline.
Methods:
An architecture has been implemented that combines circuit scheduling and shot distribution techniques to aggregate multiple circuits and distribute their shots across heterogeneous QPUs. The approach was empirically validated on actual IBM Quantum devices using a diverse set of reference circuits.
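A minimal sketch of the shot-wise distribution step: a policy splits a circuit's shots across several QPUs and the per-backend measurement counts are merged afterwards. The proportional policy and the example weights are assumptions for illustration, not the paper's scheduler.

```python
from collections import Counter

def split_shots(total_shots, weights):
    """Allocate shots proportionally to per-QPU weights; remainder to the first."""
    alloc = [int(total_shots * w / sum(weights)) for w in weights]
    alloc[0] += total_shots - sum(alloc)   # keep the total exact
    return alloc

def merge_counts(per_backend_counts):
    """Merge bitstring-count dictionaries returned by each backend."""
    merged = Counter()
    for counts in per_backend_counts:
        merged.update(counts)
    return dict(merged)

# Usage: 1000 shots over three QPUs weighted by, say, availability.
print(split_shots(1000, [0.5, 0.3, 0.2]))             # -> [500, 300, 200]
print(merge_counts([{"00": 260, "11": 240},
                    {"00": 150, "11": 150}]))          # aggregated histogram
```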
Results:
The proposal achieved a 95% reduction in cost and a 92% reduction in tasks. Moreover, the fidelity analysis showed an increase in noise of approximately 20% on average, measured using several statistical distances.
Conclusions:
This research provides a usable and extensible solution that increases the efficiency, cost-effectiveness, and resilience of quantum workload execution in heterogeneous and dynamic cloud environments. The results suggest that users should weigh the fidelity implications against the cost (and time) savings according to their application requirements and goals.
{"title":"Maximizing quantum hardware utilization via multiprogramming circuits and shot-wise distribution","authors":"Giuseppe Bisicchia , Jaime Alvarado-Valiente , Javier Romero-Álvarez , Jose Garcia-Alonso , Juan M. Murillo , Antonio Brogi","doi":"10.1016/j.infsof.2025.108005","DOIUrl":"10.1016/j.infsof.2025.108005","url":null,"abstract":"<div><h3>Context:</h3><div>Quantum computing is rapidly evolving, offering new opportunities for solving problems in optimization, cryptography, and simulation. However, the limited availability of quantum resources makes efficient utilization of quantum hardware a current challenge. Today’s paradigms often lead to under-utilization of qubits, increased costs, and execution delays, especially in the NISQ era.</div></div><div><h3>Objective:</h3><div>This work aims to improve the utilization of quantum hardware by introducing an execution model that integrates multiprogramming at circuit level with quantum shot-wise distribution in a single policy-driven pipeline.</div></div><div><h3>Methods:</h3><div>An architecture has been implemented that combines circuit scheduling and shot distribution techniques to aggregate multiple circuits and distribute their shots across heterogeneous QPUs. The approach was empirically validated on actual IBM Quantum devices using a diverse set of reference circuits.</div></div><div><h3>Results:</h3><div>The proposal achieved a reduction in cost of 95% and a reduction in tasks 92%. Moreover, the fidelity analysis of the results showed an increase in noise, with an average increase of approximately 20% using different statistical distances.</div></div><div><h3>Conclusions:</h3><div>This research provides a usable and extensible solution to increase the efficiency, cost effectiveness, and resilience of quantum workload execution in heterogeneous and dynamic cloud environments. These results obtained suggest that users should weigh the implications of fidelity versus cost (and time) savings based on the application requirements and their goals.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"191 ","pages":"Article 108005"},"PeriodicalIF":4.3,"publicationDate":"2025-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145840127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A model-driven approach to streamline the development of prescriptive services for digital twins
Pub Date: 2025-12-20 | DOI: 10.1016/j.infsof.2025.108001
Arturo Barriga, José A. Barriga, Pablo A. Portillo, Adolfo Lozano-Tello, Pedro J. Clemente
Context:
Digital twins are dynamic virtual replicas of physical systems that offer significant benefits in terms of efficiency and productivity. In particular, prescriptive digital twins are able to provide specific recommendations to help stakeholders optimize physical system performance, reduce risks, and proactively solve problems. However, despite the high value of prescriptive services, most current digital twin implementations remain focused on monitoring and descriptive analytics, lacking the advanced capabilities required to provide actionable, prescriptive insights.
Objective:
This paper aims to streamline the development of prescriptive services for digital twin systems, thus fostering their adoption and unlocking their full potential.
Methods:
To this end, a Model-Driven Development (MDD) approach specifically designed for prescriptive digital twin services is proposed.
Results:
With the proposed Domain-Specific Language (DSL), developers can focus on designing their prescriptive services from a high-level perspective. Then, Model-to-Text (M2T) transformations generate the required code, configuration files, and deployment artifacts.
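A toy illustration of the Model-to-Text (M2T) step: a high-level model of a prescriptive service (here just a dict standing in for a DSL instance) is expanded through a template into generated code. The template, fields, and rule shape are invented for the example, not the paper's DSL.

```python
from string import Template

# Template for a generated prescriptive service (illustrative only).
SERVICE_TMPL = Template('''\
def ${name}(twin_state):
    """Auto-generated prescriptive service."""
    if twin_state["${metric}"] > ${threshold}:
        return "${recommendation}"
    return None
''')

# A hypothetical DSL instance: recommend an action when a metric is too high.
model = {"name": "overheat_advisor",
         "metric": "temperature",
         "threshold": 80,
         "recommendation": "reduce spindle speed"}

print(SERVICE_TMPL.substitute(model))   # emits the service's source code
```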
Conclusion:
Thus, this approach not only reduces the development time and cost of these services, but also reduces the need for technical expertise. In addition, the applicability of the proposal is validated through two digital twin use cases in the agriculture and manufacturing domains.
{"title":"A model-driven approach to streamline the development of prescriptive services for digital twins","authors":"Arturo Barriga, José A. Barriga, Pablo A. Portillo, Adolfo Lozano-Tello, Pedro J. Clemente","doi":"10.1016/j.infsof.2025.108001","DOIUrl":"10.1016/j.infsof.2025.108001","url":null,"abstract":"<div><h3>Context:</h3><div>Digital twins are dynamic virtual replicas of physical systems that offer significant benefits in terms of efficiency and productivity. In particular, prescriptive digital twins are able to provide specific recommendations to help stakeholders optimize physical system performance, reduce risks, and proactively solve problems. However, despite the high value of prescriptive services, most current digital twin implementations remain focused on monitoring and descriptive analytics, lacking the advanced capabilities required to provide actionable, prescriptive insights.</div></div><div><h3>Objective:</h3><div>This paper aims to streamline the development of prescriptive services for digital twin systems, thus fostering their adoption and unlocking their full potential.</div></div><div><h3>Methods:</h3><div>To this end, a Model-Driven Development (MDD) approach specifically designed for prescriptive digital twin services is proposed.</div></div><div><h3>Results:</h3><div>With the proposed Domain-Specific Language (DSL), developers can focus on designing their prescriptive services from a high-level perspective. Then, Model-to-Text (M2T) transformations generate the required code, configuration files, and deployment artifacts.</div></div><div><h3>Conclusion:</h3><div>Thus, this approach not only reduces the development time and cost of these services, but also reduces the need for technical expertise. In addition, the applicability of the proposal is validated through two digital twin use cases in the agriculture and manufacturing domains.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"191 ","pages":"Article 108001"},"PeriodicalIF":4.3,"publicationDate":"2025-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145840299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the use of extended reality to support software development activities: A systematic literature review
Pub Date: 2025-12-19 | DOI: 10.1016/j.infsof.2025.107999
Tiara Rojas-Stambuk, Juan Pablo Sandoval Alcocer, Leonel Merino, Andres Neyem
Context:
Extended Reality (XR) technologies, including virtual, augmented, and mixed reality, offer novel ways to support software development through immersive and spatial representations of complex software artifacts. Although many XR-based tools have been introduced, their coverage of development activities, types of visualized software data, and evaluation quality remain unclear.
Objectives:
This paper aims to systematically review the use of XR in software development, focusing on the tasks supported, the types of data visualized, the visualization and interaction techniques, the evaluation methods, and the limitations reported.
Methods:
We conducted a systematic literature review of 77 primary studies published between 1995 and February 2025. Each study was analyzed and classified according to the supported development tasks, the types of visualized software data, the visualization techniques, the XR technologies employed, the evaluation strategies, and the reported limitations.
Results:
Our findings show that most XR tools target software comprehension, primarily through structural visualizations. City metaphors and other metaphor-based techniques are the most common. However, XR remains underexplored in activities such as testing, performance analysis, and requirements engineering. Evaluation approaches are heterogeneous, often lacking methodological rigor, sufficient sample sizes, and standardized metrics.
Conclusion:
Although XR holds promise for improving software development, its current use is concentrated in a narrow set of activities and is hampered by limited evaluation quality. Challenges remain in tool integration, interaction design, and practical adoption. We identify key gaps and provide recommendations to guide future research toward broader and more effective use of XR in software engineering.
{"title":"On the use of extended reality to support software development activities: A systematic literature review","authors":"Tiara Rojas-Stambuk , Juan Pablo Sandoval Alcocer , Leonel Merino , Andres Neyem","doi":"10.1016/j.infsof.2025.107999","DOIUrl":"10.1016/j.infsof.2025.107999","url":null,"abstract":"<div><h3>Context:</h3><div>Extended Reality (XR) technologies, including virtual, augmented, and mixed reality, offer novel ways to support software development through immersive and spatial representations of complex software artifacts. Although many XR-based tools have been introduced, their coverage of development activities, types of visualized software data, and evaluation quality remain unclear.</div></div><div><h3>Objectives:</h3><div>This paper aims to systematically review the use of XR in software development, focusing on the tasks supported, the types of data visualized, the visualization and interaction techniques, the evaluation methods, and the limitations reported.</div></div><div><h3>Methods:</h3><div>We conducted a systematic review of the literature of 77 primary studies published between 1995 and February 2025. Each study was analyzed and classified according to the supported development tasks, the types of visualized software data, the visualization techniques used, the XR technologies used, the evaluation strategies, and the limitations.</div></div><div><h3>Results:</h3><div>Our findings show that most XR tools target software comprehension, primarily through structural visualizations. City metaphors and other metaphor-based techniques are the most common. However, XR remains underexplored in activities such as testing, performance analysis, and requirements engineering. Evaluation approaches are heterogeneous, often lacking methodological rigor, sufficient sample sizes, and standardized metrics.</div></div><div><h3>Conclusion:</h3><div>Although XR holds promise for improving software development, its current use is concentrated in a narrow set of activities and is hampered by limited evaluation quality. The challenges remain in tool integration, interaction design, and practical adoption. We identify key gaps and provide recommendations to guide future research toward broader and more effective use of XR in software engineering.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"191 ","pages":"Article 107999"},"PeriodicalIF":4.3,"publicationDate":"2025-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145840129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}