Science of Computer Programming: latest articles, page 10

Synthesizing LTL contracts from component libraries using rich counterexamples

IF 1.3 Zone 4, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Science of Computer Programming

Pub Date : 2024-04-05 DOI: 10.1016/j.scico.2024.103116

Antonio Iannopollo , Inigo Incer , Alberto L. Sangiovanni-Vincentelli

We provide a method to synthesize an LTL Assume/Guarantee (A/G) specification, or contract, as an interconnection of elements from a library, each of which is also represented by an LTL A/G contract. Our approach, based on counterexample-guided inductive synthesis, leverages an off-the-shelf model checker to reason about infinite-length counterexamples and guarantee correctness. To increase scalability, we also introduce a novel concept of specification decomposition, based on contract projections; we show how it can be used to break down our synthesis problem into several simpler tasks, without reducing the size of the solution space. We test our technique on three industry-relevant case studies.
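The counterexample-guided inductive synthesis loop mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' method: the library of integer functions, the `spec` predicate, and the brute-force `verify` function are all hypothetical stand-ins, and a finite-domain check replaces the LTL model checker. The sketch only shows the general shape of the loop: propose a composition of library elements, verify it, and reuse each counterexample to prune later candidates.

```python
from itertools import product

# Hypothetical "library" of components: named unary integer functions.
LIBRARY = {
    "inc": lambda x: x + 1,
    "double": lambda x: 2 * x,
    "neg": lambda x: -x,
}

def spec(x, y):
    """Hypothetical target specification: the output must equal 2x + 1."""
    return y == 2 * x + 1

def run(pipeline, x):
    """Apply the named library components in sequence."""
    for name in pipeline:
        x = LIBRARY[name](x)
    return x

def verify(pipeline, domain):
    """Stand-in verifier: return an input that violates the spec, or None."""
    for x in domain:
        if not spec(x, run(pipeline, x)):
            return x
    return None

def cegis(max_len=3, domain=range(-10, 11)):
    counterexamples = []
    for length in range(1, max_len + 1):
        for pipeline in product(LIBRARY, repeat=length):
            # Inductive step: skip candidates that already fail on a
            # previously collected counterexample.
            if any(not spec(x, run(pipeline, x)) for x in counterexamples):
                continue
            cex = verify(pipeline, domain)
            if cex is None:
                return pipeline, counterexamples
            counterexamples.append(cex)
    return None, counterexamples

solution, cexs = cegis()
print(solution, cexs)
```

In this toy setting a single counterexample is enough to reject every other candidate before the verifier is called again, which is the effect the loop is designed to achieve.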


Citations: 0

Evaluating the effectiveness of size-limited execution trace with near-omniscient debugging

IF 1.3 Zone 4, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Science of Computer Programming

Pub Date : 2024-04-05 DOI: 10.1016/j.scico.2024.103117

Kazumasa Shimari , Takashi Ishio , Tetsuya Kanda , Katsuro Inoue

Debugging is an important task for identifying defects in software, and logging is an important feature of a software system for recording runtime information. Detailed logging allows developers to collect runtime information when they cannot use an interactive debugger, for example in continuous integration or on web application servers. However, extensive logging leads to large execution traces, because a few instructions may be repeated many times. In our previous work, to record detailed program behavior within limited storage space, we proposed near-omniscient debugging, a methodology that records and visualizes an execution trace using a fixed-size buffer for each observed instruction. In this paper, we evaluate the effectiveness of near-omniscient debugging in recording infected states while reducing the size of execution traces. We conduct experiments on the Defects4J dataset and evaluate effectiveness in terms of completeness, trace size, and runtime overhead. The results show that near-omniscient debugging can completely record infected states for nearly 80 percent of bugs (with a buffer size of 1024 events). The size of execution traces can be reduced by a factor of one thousand for large repetitive executions.
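The idea of a fixed-size buffer per observed instruction can be sketched as follows. This is a minimal illustration, not the authors' tool: the class name `BoundedTraceRecorder`, the instruction identifiers, and the event layout are assumptions made for the example. Each instruction gets its own ring buffer, so a hot loop can only ever occupy `buffer_size` slots.

```python
from collections import defaultdict, deque

class BoundedTraceRecorder:
    """Keeps only the most recent `buffer_size` events per instruction."""

    def __init__(self, buffer_size=1024):
        self.buffer_size = buffer_size
        # One fixed-size ring buffer per observed instruction id.
        self.buffers = defaultdict(lambda: deque(maxlen=buffer_size))
        self.total_events = 0  # all events seen, including overwritten ones

    def record(self, instruction_id, value):
        """Store (sequence number, value); the oldest entry is dropped when full."""
        self.total_events += 1
        self.buffers[instruction_id].append((self.total_events, value))

    def stored_events(self):
        """Number of events currently retained across all buffers."""
        return sum(len(buf) for buf in self.buffers.values())


recorder = BoundedTraceRecorder(buffer_size=4)
for i in range(1000):
    recorder.record("loop_body:line12", i)  # hypothetical hot instruction
recorder.record("init:line3", "start")      # hypothetical one-off instruction

print(recorder.total_events, recorder.stored_events())
```

Here 1001 events are observed but only five are retained, which mirrors how trace size shrinks for large repetitive executions while rarely executed instructions keep their full history.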


Citations: 0

Out of step: Code clone detection for mobile apps across different language codebases

IF 1.3 Zone 4, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Science of Computer Programming

Pub Date : 2024-04-03 DOI: 10.1016/j.scico.2024.103112

Stephannie Jimenez , Gordana Rakić , Silvia Takahashi , Nicolás Cardozo

Clone detection provides insight into replicated fragments in a code base. With the rise of multi-language code bases, new techniques addressing cross-language code clone detection enable the analysis of polyglot systems. Such techniques have not yet been applied to the domain of mobile apps, which are naturally polyglot. Native mobile app developers must synchronize their code base in at least two different programming languages. App synchronization is a difficult and time-consuming maintenance task, as features can rapidly diverge between platforms, and feature identification must be performed manually. The end goal of this work is to provide an analysis framework to reduce the impact of app synchronization. A first step in this direction is a structural algorithm for cross-language clone detection, called Out of Step, which exploits the idea behind enriched concrete syntax trees. Such trees are used as a common intermediate representation, built from programming languages' grammars, to detect similarities between app code bases. Our technique finds code similarities of over 80% in the evaluation of language features, where Type 1-3 clones are manually injected for the analysis of both single- and cross-language cases for Kotlin and Dart. We validate the feasibility and correctness of our approach through the evaluation of the main language constructs for Kotlin and Dart. To validate the effectiveness, we use a first case study detecting clones between 12 sorting algorithms across Kotlin and Dart, identifying clone similarities with a precision between 67% and 95%. Finally, we use a corpus of 144 mobile apps implemented in Kotlin and Dart, correctly identifying code similarities for the full application logic.
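Structural comparison over a shared tree representation can be illustrated with a small sketch. This is not the Out of Step algorithm: the nested-tuple tree encoding, the node type names, and the Dice-style score are assumptions chosen for brevity. It only shows how two fragments from different languages can be compared once both are mapped to the same node vocabulary.

```python
from collections import Counter

def shapes(tree, out=None):
    """Collect a multiset of (node type, child node types) shapes.

    A tree is a nested tuple: (node_type, child_1, child_2, ...).
    """
    if out is None:
        out = Counter()
    node_type, children = tree[0], tree[1:]
    out[(node_type, tuple(child[0] for child in children))] += 1
    for child in children:
        shapes(child, out)
    return out

def similarity(a, b):
    """Dice coefficient over the two multisets of node shapes (0.0 to 1.0)."""
    sa, sb = shapes(a), shapes(b)
    shared = sum((sa & sb).values())  # multiset intersection
    total = sum(sa.values()) + sum(sb.values())
    return 2 * shared / total if total else 1.0

# Hypothetical language-neutral trees for a loop written in two languages;
# the second version has one extra call in the loop body.
kotlin_loop = ("for", ("var",), ("range",), ("block", ("call",)))
dart_loop = ("for", ("var",), ("range",), ("block", ("call",), ("call",)))

print(round(similarity(kotlin_loop, dart_loop), 3))
```

A threshold on such a score would then decide whether two fragments are reported as clones; where to set it is a design choice this sketch leaves open.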


Citations: 0

Local deadlock analysis of Simulink models based on timed behavioural patterns and theorem proving

IF 1.3 Zone 4, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Science of Computer Programming

Pub Date : 2024-04-03 DOI: 10.1016/j.scico.2024.103113

Joabe Jesus, Augusto Sampaio

Compositional deadlock analysis of process networks is a well-known challenge. We propose a compositional deadlock analysis strategy for timed process networks, more specifically, those obtained from Simulink multi-rate block diagrams. We handle models with both acyclic and cyclic communication graphs. Particularly, the latter naturally happens in Simulink models with feedback, among other kinds of cycles. Since there is no general solution to analyse cyclic models in a compositional way, we explore the use of behavioural patterns that allow the verification to be carried out in a compositional fashion. We represent process networks in tock-CSP, a dialect of CSP that allows modelling time aspects using a special tock event. The verification approach is implemented as a new package in CSP-Prover, a theorem prover for CSP which is itself implemented in Isabelle/HOL. To illustrate the overall approach and, particularly, how it can scale, we consider several variations of an actuation system with increasing complexity. We show that the examples are instances of the client/server and the asynchronous dynamic timed behaviour patterns. These patterns and all verification steps are formalised using CSP-Prover.


Citations: 0

Exploring issues of story-based effort estimation in Agile Software Development (ASD)

IF 1.3 Zone 4, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Science of Computer Programming

Pub Date : 2024-03-31 DOI: 10.1016/j.scico.2024.103114

Muhammad Iqbal , Muhammad Ijaz , Tehseen Mazhar , Tariq Shahzad , Qamar Abbas , YazeedYasin Ghadi , Wasim Ahmad , Habib Hamam

Context

Effort estimation based on user stories plays a pivotal role in agile software development, where accurate predictions of project efforts are vital for success. While various supervised ML tools attempt to estimate effort, the prevalence of estimation errors presents significant challenges, as evidenced by the CHAOS report by the Standish Group, which highlights incorrect estimations contributing to a substantial percentage of failed agile projects.

Objectives

This research delves into the domain of user story-based effort estimation in agile software development, aiming to explore the issues arising from inaccurate estimations. The primary goal is to uncover these issues comprehensively and propose potential solutions, thus enhancing the efficacy of the user story-based estimation method.

Methods

To achieve the research objectives, a systematic literature review (SLR) is conducted, surveying a wide range of sources to gather insights into issues surrounding user story-based effort estimation. The review encompasses diverse estimation methods, user story attributes, and the array of challenges that can result from inaccurate estimations.

Results

The SLR reveals a spectrum of issues undermining the accuracy of user story-based effort estimation. It identifies internal factors like communication, team expertise, and composition as crucial determinants of estimation reliability. Consistency in user stories, technical complexities, and task engineering practices also emerge as significant contributors to estimation inaccuracies. The study underscores the interconnectedness of these issues, emphasizing the need for a standardized protocol to minimize inaccuracies and enhance estimation precision.

Conclusion

In light of the findings, it becomes evident that addressing the multi-dimensional factors influencing user story-based effort estimation is imperative for successful agile software development. The study underscores the interplay of various aspects, such as team dynamics, task complexity, and requirement engineering, in achieving accurate estimations. By recognizing these challenges and implementing recommended solutions, software development processes can avoid failures and enhance their prospects of success in the agile paradigm.


Citations: 0

Actionable code smell identification with fusion learning of metrics and semantics

IF 1.3 Zone 4, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Science of Computer Programming

Pub Date : 2024-03-27 DOI: 10.1016/j.scico.2024.103110

Dongjin Yu, Quanxin Yang, Xin Chen, Jie Chen, Sixuan Wang, Yihang Xu

Code smell detection is one of the essential tasks in the field of software engineering. Identifying whether a code snippet has a code smell is subjective and varies by programming language, developer, and development method. Moreover, developers tend to focus on code smells that have a real impact on development and ignore insignificant ones. However, existing static code analysis tools and code smell detection approaches exhibit a high false positive rate in detecting code smells, which makes insignificant smells drown out those smells that developers value. Therefore, accurately reporting those actionable code smells that developers tend to spend energy on refactoring can prevent developers from getting lost in the sea of smells and improve refactoring efficiency. In this paper, we aim to detect actionable code smells that developers tend to refactor. Specifically, we first collect actionable and non-actionable code smells from projects with numerous historical versions to construct our datasets. Then, we propose a dual-stream model for fusion learning of code metrics and code semantics to detect actionable code smells. On the one hand, code metrics quantify the code's structure and even some rules or patterns, providing fundamental information for detecting code smells. On the other hand, code semantics encompass information about developers' refactoring tendencies, which prove valuable in detecting actionable code smells. Extensive experiments show that our approach can detect actionable code smells more accurately compared to existing approaches.


Citations: 0

Implementing an environment for hybrid software evaluation

IF 1.3 Zone 4, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Science of Computer Programming

Pub Date : 2024-03-24 DOI: 10.1016/j.scico.2024.103109

Ana Díaz-Muñoz , Moisés Rodríguez , Mario Piattini

Quantum computing is a revolutionary paradigm in computer science based on the principles of quantum mechanics. It has the potential to solve problems that are currently unsolvable for classical computing. Applications of quantum computing already span a variety of sectors.

Ongoing enhancements to the integrated programming and development environment simplify the creation and optimization of quantum algorithms. Ultimately, the focus on supporting tools represents the starting point towards achieving quantum computing maturity, facilitating its transition from an experimental domain to a practical industry.

As quantum software gains ground and relevance in various domains, it is essential to address the evaluation of hybrid systems that combine classical and quantum elements to ensure diverse quality characteristics. However, in the realm of quantum software, models, metrics, and tools are still to be established.

The primary contribution of this paper is to present the first technological environment for measuring and evaluating the analyzability of hybrid software.

Real-world examples of hybrid software are provided to showcase the functionality of the different tools in the environment, yielding readable and representative results for the evaluator.


Citations: 0

“Will I be replaced?” Assessing ChatGPT's effect on software development and programmer perceptions of AI tools

IF 1.3 Zone 4, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Science of Computer Programming

Pub Date : 2024-03-22 DOI: 10.1016/j.scico.2024.103111

Mohammad Amin Kuhail , Sujith Samuel Mathew , Ashraf Khalil , Jose Berengueres , Syed Jawad Hussain Shah

ChatGPT is a language model with artificial intelligence (AI) capabilities that has found utility across various sectors. Given its impact, we conducted two empirical studies to assess the potential and limitations of ChatGPT and other AI tools in software development. In the first study, we evaluated ChatGPT 3.5's effectiveness in generating code for 180 coding problems from LeetCode, an online coding interview preparation platform. Our findings suggest that ChatGPT 3.5 is more effective in solving easy and medium coding problems but less reliable for harder problems. Further, ChatGPT 3.5 is somewhat more effective at coding problems with higher popularity scores. In the second study, we administered a questionnaire (N = 99) to programmers to gain insights into their views on ChatGPT and other AI tools. Our findings indicate that programmers use AI tools for various tasks, such as generating boilerplate code, explaining complex code, and conducting research. AI tools also help programmers to become more productive by creating better-performing, shorter, and more readable code, among other benefits. However, AI tools can sometimes misunderstand requirements and generate erroneous code. While most programmers are not currently concerned about AI tools replacing them, they are apprehensive about what the future may hold. Our research has also revealed associations between AI tool usage, trust, perceived productivity, and job security threats caused by the tools.

Volume 235, Article 103111

Citations: 0

PX-MBT: A framework for model-based player experience testing

IF 1.3 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Science of Computer Programming

Pub Date : 2024-03-20 DOI: 10.1016/j.scico.2024.103108

Saba Gholizadeh Ansari , I.S.W.B. Prasetya , Mehdi Dastani , Gabriele Keller , Davide Prandi , Fitsum Meshesha Kifetew , Frank Dignum

As video games become more complex and widespread, player experience (PX) testing becomes crucial in the game industry. Attracting and retaining players are key elements to guarantee the success of a game in the highly competitive market. Although a number of techniques have been introduced to measure the emotional aspect of the experience, automated testing of player experience still needs to be explored. This paper presents PX-MBT, a framework for automated player experience testing with emotion pattern verification. PX-MBT (1) utilizes a model-based testing approach for test suite generation, (2) employs a computational model of emotions, developed based on a psychological theory of emotions, to model players' emotions during gameplay with an intelligent agent, and (3) verifies emotion patterns given by game designers on executed test suites to identify PX issues. We explain the PX-MBT architecture and provide an example along with its results: emotion pattern verification, which asserts the evolution of emotions over time, and heat maps that showcase the spatial distribution of emotions on the game map.
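The two outputs named in the abstract, emotion pattern verification over time and a spatial heat map, can be illustrated with a minimal Python sketch. It is not PX-MBT's implementation: the trace format, the `TRACE` data, the 0.5 activation threshold, and the function names `occurs_in_order` and `heat_map` are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical trace: one record per time step holding the agent's map
# position and emotion intensities in [0, 1]. Invented sample data only.
TRACE = [
    {"pos": (0, 0), "hope": 0.2, "fear": 0.0, "joy": 0.0},
    {"pos": (1, 0), "hope": 0.5, "fear": 0.3, "joy": 0.0},
    {"pos": (1, 1), "hope": 0.4, "fear": 0.6, "joy": 0.0},
    {"pos": (1, 1), "hope": 0.7, "fear": 0.1, "joy": 0.8},
]


def active(record, emotion, threshold=0.5):
    """An emotion counts as active when its intensity reaches the threshold."""
    return record.get(emotion, 0.0) >= threshold


def occurs_in_order(trace, emotions, threshold=0.5):
    """Check that the listed emotions become active one after another,
    each at a later time step than the previous one."""
    matched = 0
    for record in trace:
        if matched < len(emotions) and active(record, emotions[matched], threshold):
            matched += 1
    return matched == len(emotions)


def heat_map(trace, emotion):
    """Aggregate the peak intensity of one emotion per visited map cell."""
    cells = defaultdict(list)
    for record in trace:
        cells[record["pos"]].append(record.get(emotion, 0.0))
    return {pos: max(values) for pos, values in cells.items()}


if __name__ == "__main__":
    print(occurs_in_order(TRACE, ["hope", "fear", "joy"]))
    print(heat_map(TRACE, "fear"))
```

A designer-supplied pattern such as "hope, then fear, then joy" would flag a PX issue whenever `occurs_in_order` returns `False` for an executed test; richer temporal operators would need a proper temporal-logic checker rather than this subsequence match.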

Volume 236, Article 103108. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S0167642324000315/pdfft?md5=3feb08ed6c236db63ae3355a5f46a72f&pid=1-s2.0-S0167642324000315-main.pdf

Citations: 0

A method to identify overfitting program repair patches based on expression tree

IF 1.3 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Science of Computer Programming

Pub Date : 2024-03-05 DOI: 10.1016/j.scico.2024.103105

Yukun Dong, Xiaotong Cheng, Yufei Yang, Lulu Zhang, Shuqi Wang, Lingjie Kong

The primary aim of Automatic Program Repair (APR) is to automatically repair defective programs, with the intention of reducing the amount of effort required by developers. However, APR techniques may produce overfitting patches, which allow the program to pass all test cases without truly repairing it. This paper provides a comprehensive review of the overfitting problem and adds to the existing research on overfitting in conditional statements. Our proposed method, ETPAT (Expression Tree-based Patch Assessment Technique), uses expression trees and targeted coverage criteria to identify differences between the original and the patched program. We utilize ETPAT to verify test case adequacy. In parallel, ETPAT also guides the generation of corresponding test cases via equivalence class information; these may be added to the original test suite, making it more robust while also preventing the repair technique from generating comparable overfitting patches. On the patch set of the BuggyJavaJML benchmark, ETPAT recognized 77 of the 82 overfitting patches (93.9%) among 120 patches related to conditional constraints, displaying a higher accuracy rate and requiring fewer test cases than the original repair tool.
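The idea of comparing an original and a patched condition to find inputs that tell them apart can be sketched in Python. This is a brute-force stand-in, not ETPAT itself: the function names `variables` and `distinguishing_inputs`, the small integer domain, and the example conditions are assumptions for illustration. The sketch parses each condition into an expression tree with the standard `ast` module, collects the variables, and enumerates inputs on which the two conditions disagree.

```python
import ast
from itertools import product


def variables(expr):
    """Collect the variable names that appear in a condition's expression tree."""
    tree = ast.parse(expr, mode="eval")
    return sorted({node.id for node in ast.walk(tree) if isinstance(node, ast.Name)})


def distinguishing_inputs(original, patched, domain=range(-2, 3)):
    """Enumerate assignments over a small domain on which the two conditions
    evaluate differently. Each such assignment is a candidate test input that
    a test suite would need in order to separate the two versions."""
    names = sorted(set(variables(original)) | set(variables(patched)))
    code_a = compile(ast.parse(original, mode="eval"), "<original>", "eval")
    code_b = compile(ast.parse(patched, mode="eval"), "<patched>", "eval")
    no_builtins = {"__builtins__": {}}  # conditions may only use their variables
    diffs = []
    for values in product(domain, repeat=len(names)):
        env = dict(zip(names, values))
        if bool(eval(code_a, no_builtins, env)) != bool(eval(code_b, no_builtins, env)):
            diffs.append(env)
    return diffs


if __name__ == "__main__":
    # Hypothetical patch that drops one conjunct from the original condition.
    for env in distinguishing_inputs("x > 0 and y > 0", "x > 0"):
        print(env)
```

If the existing test suite exercises none of the reported inputs, it cannot tell the two conditions apart, which is the situation in which an overfitting patch can slip through. Exhaustive enumeration only scales to tiny domains; it is used here purely to keep the sketch self-contained.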

Volume 235, Article 103105

Citations: 0