Journal of Computer Languages: Latest Articles

Combining type inference techniques for semi-automatic UML generation from Pharo code
IF 1.7, Zone 3 (Computer Science), Q3 (Computer Science, Software Engineering), Pub Date: 2024-11-14, DOI: 10.1016/j.cola.2024.101300
Jan Blizničenko, Robert Pergl
This paper explores how to reconstruct UML diagrams from dynamically typed languages such as Smalltalk, which do not use explicit type information. This lack of information makes traditional methods for extracting associations difficult. It addresses the need for automated techniques, particularly in legacy software systems, to facilitate their transformation into modern technologies, focusing on Smalltalk as a case study due to its extensive industrial legacy and modern adaptations like Pharo. We propose a way to create UML diagrams from Smalltalk code, focusing on using type inference to determine UML associations. For optimal outcomes for large-scale software systems, we recommend combining different type inference methods in an automatic or semi-automatic way.
Journal of Computer Languages, vol. 82, Article 101300.
Citations: 0
An efficient instance selection algorithm for fast training of support vector machine for cross-project software defect prediction pairs
IF 1.7, Zone 3 (Computer Science), Q3 (Computer Science, Software Engineering), Pub Date: 2024-10-23, DOI: 10.1016/j.cola.2024.101301
Manpreet Singh, Jitender Kumar Chhabra
SVM sees limited use in cross-project software defect prediction because its training process is very slow. This article therefore proposes a new instance selection (IS) algorithm, boundary detection among classes (BDAC), which reduces the training dataset size so that SVM trains faster without degrading prediction performance. The proposed algorithm is evaluated against six existing IS algorithms on accuracy, running time, data reduction rate, and other measures, using 23 general datasets, 18 software defect prediction datasets, and two shape-based datasets. The results show that BDAC outperforms the selected algorithms in a collective comparison.
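To make the idea of instance selection concrete, here is a minimal sketch of a generic boundary-oriented heuristic: keep only the training points that have a differently labelled point among their k nearest neighbours. This is not the BDAC algorithm, whose details the abstract does not give; the function name `select_boundary_instances` and the parameter `k` are assumptions made for illustration.

```python
import math


def select_boundary_instances(
    points: list[tuple[float, float]], labels: list[int], k: int = 2
) -> list[int]:
    """Return indices of points with an other-class point among their k nearest neighbours.

    Generic k-NN heuristic for illustration only (not BDAC).
    """
    kept = []
    for i, point in enumerate(points):
        # Sort every other point by Euclidean distance to the current one.
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(point, points[j]),
        )
        # A point is treated as "near the class boundary" when a close neighbour disagrees.
        if any(labels[j] != labels[i] for j in neighbours[:k]):
            kept.append(i)
    return kept


# Two classes laid out along a line; only the two points facing each other survive.
points = [(float(x), 0.0) for x in range(8)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
selected = select_boundary_instances(points, labels, k=2)
```

The reduced set `selected` could then be handed to any SVM trainer in place of the full dataset; whether prediction quality is preserved depends on the data and on the selection rule.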
Journal of Computer Languages, vol. 81, Article 101301.
Citations: 0
Detection and treatment of string events in the limit
IF 1.7, Zone 3 (Computer Science), Q3 (Computer Science, Software Engineering), Pub Date: 2024-10-21, DOI: 10.1016/j.cola.2024.101299
Alex Holmquist, Vitor Emanuel, Fernando C. Alves, Fernando Magno Quintão Pereira
A string event is a pattern that occurs in a stream of characters. The need to detect and handle string events in infinite texts emerges in many scenarios, including online treatment of logs, web crawling, and syntax highlighting. This paper describes a technique to specify and treat string events. Users determine patterns of interest via a markup language. From such examples, tokens are generalized via a semi-lattice of regular expressions. Such tokens are combined into a context-free language that recognizes patterns in the text stream. These techniques are implemented in a text processing system called Lushu, which runs on the Java Virtual Machine (JVM). Lushu intercepts strings emitted by the JVM. Once patterns are detected, it invokes a user-specified action handler. As a proof of concept, this paper shows that Lushu outperforms state-of-the-art parsers and parser generators, such as Comby, BeautifulSoup4 and ZheFuscator, in terms of memory consumption and running time.
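As a rough illustration of generalizing example tokens through a semi-lattice of regular expressions, the sketch below uses a tiny chain of character classes (digits, then alphanumerics, then any non-space character). It is not Lushu's actual lattice or API, which the abstract does not describe; `LATTICE`, `classify`, and `generalize` are hypothetical names.

```python
import re

# Hypothetical chain-shaped semi-lattice, ordered from most to least specific.
# Each entry is (name, regex for a single character of that class).
LATTICE = [
    ("digits", r"[0-9]"),
    ("alnum", r"[0-9A-Za-z]"),
    ("any", r"\S"),
]


def classify(token: str) -> int:
    """Return the index of the most specific lattice element matching the token."""
    for index, (_, char_class) in enumerate(LATTICE):
        if re.fullmatch(f"{char_class}+", token):
            return index
    raise ValueError(f"token is empty or contains whitespace: {token!r}")


def generalize(examples: list[str]) -> str:
    """Join the examples: the most specific regex that covers all of them."""
    # In a chain, the join of several elements is the one with the largest index.
    top = max(classify(example) for example in examples)
    return f"{LATTICE[top][1]}+"


digits_only = generalize(["2024", "17"])
mixed = generalize(["2024", "abc123"])
fallback = generalize(["abc", "a-b"])
```

A real lattice would be richer than a chain (for example, letters and digits as incomparable elements), but the join operation plays the same role: each new example can only move the token upward toward a more general expression.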
Journal of Computer Languages, vol. 81, Article 101299.
Citations: 0
ClangOz: Parallel constant evaluation of C++ map and reduce operations
IF 1.7, Zone 3 (Computer Science), Q3 (Computer Science, Software Engineering), Pub Date: 2024-10-10, DOI: 10.1016/j.cola.2024.101298
Paul Keir, Andrew Gozillon
Interest in metaprogramming, reflection, and compile-time evaluation continues to inspire and foster innovation among the users and designers of the C++ programming language. Regrettably, the impact on compile-times of such features can be significant; and outside of build systems, multi-core parallelism is unable to bring down compilation times of individual translation units. We present ClangOz, a novel Clang-based research compiler that addresses this issue by evaluating annotated constant expressions in parallel, thereby reducing compilation times. Prior benchmarks analyzed parallel map operations, but were unable to consider reduction operations. Thus we also introduce parallel reduction functionality, alongside two additional benchmark programs.
Journal of Computer Languages, vol. 81, Article 101298.
Citations: 0
MoTion: A new declarative object matching approach in Pharo
IF 1.7, Zone 3 (Computer Science), Q3 (Computer Science, Software Engineering), Pub Date: 2024-08-23, DOI: 10.1016/j.cola.2024.101290
Aless Hosry, Vincent Aranega, Nicolas Anquetil

Pattern matching is an expressive way of matching data and extracting pieces of information from it. The recent inclusion of pattern matching in the Java and Python languages shows that developers increasingly adopt this facility for everyday development. Other mainstream programming languages (Rust, Scala, Haskell, and OCaml) also offer pattern matching as part of the language, with different degrees of expressivity in what can be matched. Meanwhile, pattern matching on graphs takes a slightly different turn: it enhances the expressivity of the patterns that can be defined. Smalltalk currently offers little capability for finding specific objects inside a large graph of objects using a declarative pattern. In Pharo, the closest existing library to classical pattern matching is the RBParseTreeSearcher, which allows expressing specialized patterns over a Pharo Abstract Syntax Tree to find some inner node. The question arises of what features a flexible pattern matching language should have. In this paper, we review the features found in different existing pattern matching languages, both in general-purpose languages (like Java) and in declarative graph pattern matching languages. We then describe MoTion, a new pattern matching engine for Pharo Smalltalk that combines all these features. We discuss some aspects of MoTion's implementation and illustrate its use with real case examples.

Journal of Computer Languages, vol. 81, Article 101290.
Citations: 0
An empirical study on divergence of differently-sourced LLVM IRs
IF 1.7, Zone 3 (Computer Science), Q3 (Computer Science, Software Engineering), Pub Date: 2024-08-05, DOI: 10.1016/j.cola.2024.101289
Zhenzhou Tian, Yuchen Gong, Chenhao Chang, Jiaze Sun, Yanping Chen, Lingwei Chen

In solving binary code similarity detection, many approaches choose to operate on certain unified intermediate representations (IRs), such as Low Level Virtual Machine (LLVM) IR, to overcome the cross-architecture analysis challenge induced by the significant morphological and syntactic gaps across the diverse instruction set architectures (ISAs). However, the LLVM IRs of the same program can be affected by diverse factors, such as the acquisition source, i.e., compiled from source code or disassembled and lifted from binary code. While the impact of compilation settings on binary code has been explored, the specific differences between LLVM IRs from varied sources remain underexamined. To this end, we pioneer an in-depth empirical study to assess the discrepancies in LLVM IRs derived from different sources. Correspondingly, an extensive dataset containing nearly 98 million LLVM IR instructions distributed in 808,431 functions is curated with respect to these potential IR-influential factors. On this basis, three types of code metrics detailing the syntactic, structural, and semantic aspects of the IR samples are devised and leveraged to assess the divergence of the IRs across different origins. The findings offer insights into how and to what extent the various factors affect the IRs, providing valuable guidance for assembling a training corpus aimed at developing robust LLVM IR-oriented pre-training models, as well as facilitating relevant program analysis studies that operate on the LLVM IRs.

Journal of Computer Languages, vol. 81, Article 101289.
Citations: 0
Fault localization by abstract interpretation and its applications
IF 1.7, Zone 3 (Computer Science), Q3 (Computer Science, Software Engineering), Pub Date: 2024-08-01, DOI: 10.1016/j.cola.2024.101288
Aleksandar S. Dimovski

Fault localization aims to automatically identify the cause of an error in a program by localizing the error to a relatively small part of the program. In this paper, we present a novel technique for automated fault localization via error invariants inferred by abstract interpretation. An error invariant for a location in an error program over-approximates the reachable states at the given location that may produce the error, if the execution of the program is continued from that location. Error invariants can be used for statement-wise semantic slicing of error programs and for obtaining concise error explanations. We use an iterative refinement sequence of backward–forward static analyses by abstract interpretation to compute error invariants, which are designed to explain why an error program violates a particular assertion.

Furthermore, we present a practical application of the fault localization technique for automatic repair of programs. Given an erroneous program, we first use the fault localization to automatically identify statements relevant for the error, and then repeatedly mutate the expressions in those relevant statements until a correct program that satisfies all assertions is found. All other statements classified by the fault localization as irrelevant for the error are not mutated in the program repair process. This way, we significantly reduce the search space of mutated programs without losing any potentially correct program, and so locate a repaired program much faster than a program repair without fault localization.

We have developed a prototype tool for automatic fault localization and repair of C programs. We demonstrate the effectiveness of our approach in localizing errors in realistic C programs and subsequently repairing them. Moreover, we show that our approach, which combines fault localization and code mutations, is significantly faster than the previous program repair approach without fault localization.
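The repair loop described above (mutate only the statements flagged as relevant until all assertions hold) can be sketched on a toy language of three-address assignments. This is an illustrative sketch, not the authors' tool: the relevant statement indices are supplied by hand rather than computed from error invariants, and the names `APPLY`, `run`, and `repair` are hypothetical.

```python
from itertools import product

# A statement is (target, left operand, operator, right operand).
Statement = tuple[str, str, str, str]

# Hypothetical mutation set: every operator may be replaced by any of these.
APPLY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def value(operand: str, env: dict[str, int]) -> int:
    """Look up a variable, or parse the operand as an integer literal."""
    return env[operand] if operand in env else int(operand)


def run(program: list[Statement], inputs: dict[str, int]) -> dict[str, int]:
    """Execute the assignments in order and return the final variable state."""
    env = dict(inputs)
    for target, left, op, right in program:
        env[target] = APPLY[op](value(left, env), value(right, env))
    return env


def repair(program, relevant, tests, check):
    """Try operator mutations only at the indices in `relevant`."""
    for ops in product(APPLY, repeat=len(relevant)):
        candidate = list(program)
        for index, op in zip(relevant, ops):
            target, left, _, right = candidate[index]
            candidate[index] = (target, left, op, right)
        # Accept the first candidate whose final state passes the assertion on all tests.
        if all(check(run(candidate, inputs)) for inputs in tests):
            return candidate
    return None
```

With three statements and three operators, mutating everything would explore 27 candidates, while restricting mutation to one relevant statement explores only 3. That is the search-space reduction the abstract attributes to fault localization, shown here at toy scale.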

Journal of Computer Languages, vol. 80, Article 101288.
Citations: 0
Screening articles for systematic reviews with ChatGPT
IF 1.7, Zone 3 (Computer Science), Q3 (Computer Science, Software Engineering), Pub Date: 2024-07-08, DOI: 10.1016/j.cola.2024.101287
Eugene Syriani, Istvan David, Gauransh Kumar

Systematic reviews (SRs) provide valuable evidence for guiding new research directions. However, the manual effort involved in selecting articles for inclusion in an SR is error-prone and time-consuming. While screening articles has traditionally been considered challenging to automate, the advent of large language models offers new possibilities. In this paper, we discuss the effect of using ChatGPT on the SR process. In particular, we investigate the effectiveness of different prompt strategies for automating the article screening process using five real SR datasets. Our results show that ChatGPT can reach up to 82% accuracy. The best performing prompts specify exclusion criteria and avoid negative shots. However, prompts should be adapted to different corpus characteristics.
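As a sketch of what a screening prompt with explicit exclusion criteria and an accuracy check might look like, consider the helper functions below. The prompt wording and the names `build_prompt` and `accuracy` are assumptions for illustration, not the prompts used in the paper, and no model API is called: the predicted decisions are passed in as plain labels.

```python
def build_prompt(title: str, abstract: str, exclusion_criteria: list[str]) -> str:
    """Assemble a zero-shot screening prompt that lists the exclusion criteria."""
    criteria = "\n".join(f"- {criterion}" for criterion in exclusion_criteria)
    return (
        "You are screening articles for a systematic review.\n"
        f"Exclude the article if any of these apply:\n{criteria}\n\n"
        f"Title: {title}\nAbstract: {abstract}\n"
        "Answer with exactly one word: INCLUDE or EXCLUDE."
    )


def accuracy(predicted: list[str], expected: list[str]) -> float:
    """Fraction of screening decisions that agree with the human labels."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)
```

The answers returned by whichever model is used would be collected into `predicted` and compared against the reviewers' decisions in `expected`; because screening corpora are often imbalanced, precision and recall may be worth reporting alongside accuracy.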

Journal of Computer Languages, vol. 80, Article 101287. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2590118424000303/pdfft?md5=88fb1aa235050a4011046d39a856044b&pid=1-s2.0-S2590118424000303-main.pdf
Citations: 0
Ask or tell: An empirical study on modeling challenges from LabVIEW community
IF 1.7, Zone 3 (Computer Science), Q3 (Computer Science, Software Engineering), Pub Date: 2024-06-24, DOI: 10.1016/j.cola.2024.101284
Xin Zhao, Gurshan Rai, Saheed Popoola

Systems engineering is fundamental to the performance, functionality, and value of modern systems and products. Systems model development and the usage of modeling tools in systems engineering can be complex. Systems engineering is concerned not only with the modeling process but also with many other aspects of development, such as hardware, data, and modeling tools. Because systems modeling practitioners are versatile and have expertise in specifying, building, maintaining, and supporting technical infrastructure, their modeling practices and challenges vary. We conducted an empirical study from a systems modeler's perspective to better understand the challenges that systems engineers in the LabVIEW community face. Our inspection consists of two investigations. First, with the help of machine learning techniques, we mined online discussion forum posts to identify what questions engineers ask regarding modeling challenges. The result shows that the most challenging part relates to coding practice during development. Inspired by this discovery, we conducted a second empirical study: we surveyed systems engineers in the LabVIEW community with an online questionnaire to learn what they tell us about coding challenges in their development practice. Our paper also provides observations and concrete suggestions to both systems engineers and tool developers on how to improve system model quality and facilitate the modeling process when using LabVIEW.

系统工程对于现代系统和产品的性能、功能和价值至关重要。系统工程中的系统模型开发和建模工具的使用可能非常复杂。系统工程不仅涉及建模过程,还涉及开发的许多其他方面,如硬件、数据和建模工具。由于系统建模从业人员是多面手,在指定、构建、维护和支持技术基础设施方面具有专长,因此建模实践和挑战也各不相同。我们从系统建模人员的角度开展了一项实证研究,以更好地了解 LabVIEW 社区的系统工程师所面临的挑战。我们的检查包括两项调查。首先,在机器学习技术的帮助下,我们挖掘了在线论坛帖子,以确定工程师们提出了哪些有关建模挑战的问题。结果显示,最具挑战性的部分与开发过程中的编码实践有关。受这一发现的启发,我们进行了另一项实证研究。我们通过在线问卷调查了 LabVIEW 社区的系统工程师,以了解系统工程师在开发实践中遇到的编码挑战。我们的论文还就如何在使用 LabVIEW 时提高系统模型质量和促进建模过程向系统工程师和工具开发人员提出了一些意见和具体建议。
Journal of Computer Languages, volume 80, Article 101284. Published 2024-06-24. DOI: https://doi.org/10.1016/j.cola.2024.101284
Citations: 0
Live application programming in the defense industry with the Molecule component framework
IF 1.7 Zone 3 Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-06-24 DOI: 10.1016/j.cola.2024.101286
Pierre Laborde, Yann Le Goff, Éric Le Pors, Alain Plantec, Steven Costiou

At Thales Defense Mission Systems (DMS), software products first go through an industrial prototyping phase. Prototypes are serious applications that we evaluate with our end-users during demonstrations. End-users have a central role in the design process of our products. They often ask for software modifications during demonstrations to experiment new ideas or to focus the existing design on their needs.

In this paper, we present how we combined Smalltalk’s live-programming capabilities with software component models to obtain flexible and modular software designs in our context of live prototyping. We present Molecule, an open-source implementation of the Lightweight CORBA Component Model in Pharo. We use Molecule to build HMI systems prototypes, and we benefit from the dynamic run-time modification capabilities of Pharo during demonstrations with our end-users where we explore software designs in a lively way.

Molecule is an industrial contribution to Smalltalk, as it capitalizes 20 years of usage and maturation in our prototyping activity. The Molecule framework and tools are now mature, and we started building end-user software used in production at Thales DMS. We present two such end-user software and analyze their component architecture, that are representative of how we (learnt to) build HMI prototypes. Finally, we analyze our technological decisions with regards to the benefits we sought for our industrial activity.

Journal of Computer Languages, volume 80, Article 101286. Published 2024-06-24. DOI: https://doi.org/10.1016/j.cola.2024.101286
Citations: 0