
Journal of Computer Languages: Latest Articles

A comprehensive meta-analysis of efficiency and effectiveness in the detection community
IF 1.7; Zone 3 (Computer Science); Q3 (Computer Science, Software Engineering); Pub Date: 2024-12-24; DOI: 10.1016/j.cola.2024.101314
Mohamed Amine Daoud, Sid Ahmed Mokhtar Mostefaoui, Abdelkader Ouared, Hadj Madani Meghazi, Bendaoud Mebarek, Abdelkader Bouguessa, Hasan Ahmed
Creating an intrusion detection system (IDS) is a prominent area of research that continuously draws attention from both scholars and practitioners who tirelessly innovate new solutions. The complexity of IDS naturally escalates alongside technological advancements, whether they are manually implemented within security infrastructures or elaborated upon in academic literature. However, accessing and comparing these IDS solutions requires sifting through a multitude of hypotheses presented in research papers, which is a laborious and error-prone endeavor. Consequently, many researchers encounter difficulties in replicating results or reanalyzing published IDSs. This challenge primarily arises due to the absence of a standardized process for elucidating IDS methodologies. In response, this paper advocates for a framework aimed at enhancing the reproducibility of IDS outcomes, thereby enabling their seamless reuse across diverse cybersecurity contexts, benefiting both end-users and experts alike. The proposed framework introduces a descriptive language for the precise specification of IDS descriptions. Additionally, a model repository facilitates the sharing and reusability of IDS configurations. Lastly, through a case study, we showcase the effectiveness of our framework in addressing challenges associated with data acquisition and knowledge organization and sharing. Our results demonstrate satisfactory prediction accuracy for configuration reuse and precise identification of reusable components.
Journal of Computer Languages, vol. 82, Article 101314
Citations: 0
MTable: Visual query interface for browsing and navigation in NoSQL data stores
IF 1.7; Zone 3 (Computer Science); Q3 (Computer Science, Software Engineering); Pub Date: 2024-12-05; DOI: 10.1016/j.cola.2024.101312
Kanika Soni, Shelly Sachdeva
Almost all human endeavors in the era of the digital revolution, from commercial and industrial processes to scientific and medical research, depend on ever-increasing amounts of data. However, the sheer volume and complexity of this data make exploration and querying challenging even for experts, which makes the demand for easy access to data, even for naive users, all the more evident. Considering this, the database community has tilted toward NoSQL data stores. While query formulation assistance for NoSQL data stores has been studied extensively, many users still want help when specifying complex queries (such as aggregation pipeline queries), which require an in-depth understanding of the data storage architecture of a specific NoSQL data store. To help users browse and navigate NoSQL data stores (MongoDB) interactively, this paper proposes a novel, simple, and user-friendly interface, MTable, that provides a presentation-level interactive view. In contrast to MongoDB's find operation, which always returns the main document, this view compactly presents query results from multiple embedded documents in a single tabular format. Certain cells of the MTable contain clickable hyperlinks that let users interact directly with the data persisted in the document store. This helps users incrementally construct complex queries and navigate the document store without the tedious task of writing complex queries by hand. In a user study, participants performed various querying tasks faster with MTable than with the traditional querying mechanism. MTable also received positive subjective feedback.
Journal of Computer Languages, vol. 82, Article 101312
Citations: 0
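The MTable abstract above describes presenting embedded documents in a single tabular view. The following is only a minimal sketch of that flattening idea, not the authors' implementation: the helper names `flatten` and `to_table`, the dotted column naming, and the sample documents are all assumptions made for illustration, and no MongoDB driver is involved.

```python
from typing import Any


def flatten(doc: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into dotted column names; other values stay as cells."""
    row: dict[str, Any] = {}
    for key, value in doc.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            # An embedded document contributes its own columns.
            row.update(flatten(value, prefix=f"{column}."))
        else:
            row[column] = value
    return row


def to_table(docs: list[dict[str, Any]]) -> tuple[list[str], list[list[Any]]]:
    """Return (header, rows); fields missing from a document become None."""
    flat = [flatten(d) for d in docs]
    header: list[str] = []
    for row in flat:
        for column in row:
            if column not in header:
                header.append(column)
    rows = [[row.get(column) for column in header] for row in flat]
    return header, rows


# Hypothetical sample documents, not taken from the paper.
docs = [
    {"_id": 1, "name": "Ada", "address": {"city": "London", "zip": "N1"}},
    {"_id": 2, "name": "Linus", "address": {"city": "Helsinki"}},
]
header, rows = to_table(docs)
```

Here `header` lists one column per leaf field, so a nested `address` document appears as `address.city` and `address.zip` next to the top-level fields.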
Mental stress analysis by measuring heart rate variability during learning programming: Comparison of visual- and text-based languages
IF 1.7; Zone 3 (Computer Science); Q3 (Computer Science, Software Engineering); Pub Date: 2024-12-03; DOI: 10.1016/j.cola.2024.101311
Katsuyuki Umezawa, Takumi Koshikawa, Makoto Nakazawa, Shigeichi Hirasawa
Visual-based programming languages that facilitate block-based coding have gained popularity as introductory methods for learning programming. Conversely, programming experts typically use text-based programming languages like C and Java. Nevertheless, a seamless method for transitioning from a visual- to a text-based language has yet to be developed. Therefore, our research project aims to develop a methodology that facilitates this transition by bridging the gap between the two languages and verifying the variations in the biometric information of learners of both languages. In this study, we measured the participants’ heart rate variability (HRV) and evaluated variations in mental stress experienced while learning visual- and text-based languages. The experimental results confirmed that participants proficient in text-based languages experienced lower HRV (indicating higher stress levels) when learning visual-based languages. Conversely, those less proficient in text-based languages exhibited higher HRV (indicating more favorable stress levels) while learning text-based languages. This study successfully observed differences in stress levels while learning both language types using experimental methods. These findings serve as a preliminary step toward clarifying how stress experienced during learning affects learning outcomes and identifying the factors that constitute beneficial stress. This study establishes a foundation for an intermediate language that can enhance transitions between the two types of languages.
Journal of Computer Languages, vol. 82, Article 101311
Citations: 0
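The heart rate variability abstract above does not say which HRV measure was used. As a rough illustration only, two common time-domain measures, SDNN and RMSSD, can be computed from successive RR intervals as sketched below; the choice of measures and the sample intervals are assumptions, not details from the paper.

```python
import math
import statistics


def sdnn(rr_ms: list[float]) -> float:
    """Sample standard deviation of the RR intervals, in milliseconds."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms: list[float]) -> float:
    """Root mean square of successive RR-interval differences, in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical RR intervals in milliseconds, not measured data.
rr = [800.0, 810.0, 790.0, 805.0, 795.0]
hrv_sdnn = sdnn(rr)
hrv_rmssd = rmssd(rr)
```

Both functions need at least two intervals. Larger values mean more beat-to-beat variation, which the abstract associates with lower stress.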
Combining type inference techniques for semi-automatic UML generation from Pharo code
IF 1.7; Zone 3 (Computer Science); Q3 (Computer Science, Software Engineering); Pub Date: 2024-11-14; DOI: 10.1016/j.cola.2024.101300
Jan Blizničenko, Robert Pergl
This paper explores how to reconstruct UML diagrams from dynamically typed languages such as Smalltalk, which do not use explicit type information. This lack of information makes traditional methods for extracting associations difficult. It addresses the need for automated techniques, particularly in legacy software systems, to facilitate their transformation into modern technologies, focusing on Smalltalk as a case study due to its extensive industrial legacy and modern adaptations like Pharo. We propose a way to create UML diagrams from Smalltalk code, focusing on using type inference to determine UML associations. For optimal outcomes for large-scale software systems, we recommend combining different type inference methods in an automatic or semi-automatic way.
Journal of Computer Languages, vol. 82, Article 101300
Citations: 0
An efficient instance selection algorithm for fast training of support vector machine for cross-project software defect prediction pairs
IF 1.7; Zone 3 (Computer Science); Q3 (Computer Science, Software Engineering); Pub Date: 2024-10-23; DOI: 10.1016/j.cola.2024.101301
Manpreet Singh, Jitender Kumar Chhabra
The use of SVM for cross-project software defect prediction is limited by its very slow training process. This research article therefore proposes a new instance selection (IS) algorithm, boundary detection among classes (BDAC), which reduces the training dataset size so that SVM trains faster without degrading prediction performance. The proposed algorithm is evaluated against six existing IS algorithms on accuracy, running time, data reduction rate, and other criteria, using 23 general datasets, 18 software defect prediction datasets, and two shape-based datasets. The results indicate that BDAC outperforms the selected algorithms in a collective comparison.
Journal of Computer Languages, vol. 81, Article 101301
Citations: 0
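The instance selection abstract above does not spell out how BDAC works, so the sketch below is not BDAC. It shows a generic boundary heuristic in the same spirit, under the assumption that points near a class boundary matter most for an SVM: a training point is kept only if at least one of its `k` nearest neighbors carries a different label. The function name, the parameter `k`, and the sample data are hypothetical.

```python
import math


def boundary_instances(points: list[tuple[float, ...]], labels: list[int], k: int = 3) -> list[int]:
    """Return indices of points with a differently labeled point among their k nearest neighbors."""
    keep = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Brute-force neighbor search; fine for a sketch, quadratic in the data size.
        others.sort(key=lambda j: math.dist(p, points[j]))
        if any(labels[j] != labels[i] for j in others[:k]):
            keep.append(i)
    return keep


# Hypothetical toy data: two classes laid out along a line.
points = [(float(x), 0.0) for x in range(8)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
selected = boundary_instances(points, labels, k=2)
```

On this toy data only the two points that face each other across the class boundary are kept, so a classifier would train on 2 of the 8 instances.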
Detection and treatment of string events in the limit
IF 1.7; Zone 3 (Computer Science); Q3 (Computer Science, Software Engineering); Pub Date: 2024-10-21; DOI: 10.1016/j.cola.2024.101299
Alex Holmquist, Vitor Emanuel, Fernando C. Alves, Fernando Magno Quintão Pereira
A string event is a pattern that occurs in a stream of characters. The need to detect and handle string events in infinite texts emerges in many scenarios, including online treatment of logs, web crawling, and syntax highlighting. This paper describes a technique to specify and treat string events. Users determine patterns of interest via a markup language. From such examples, tokens are generalized via a semi-lattice of regular expressions. Such tokens are combined into a context-free language that recognizes patterns in the text stream. These techniques are implemented in a text processing system called Lushu, which runs on the Java Virtual Machine (JVM). Lushu intercepts strings emitted by the JVM. Once patterns are detected, it invokes a user-specified action handler. As a proof of concept, this paper shows that Lushu outperforms state-of-the-art parsers and parser generators, such as Comby, BeautifulSoup4 and ZheFuscator, in terms of memory consumption and running time.
Journal of Computer Languages, vol. 81, Article 101299
Citations: 0
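The string-event abstract above describes detecting a pattern in a stream and invoking a user-specified action handler. A minimal sketch of that detect-then-handle loop follows. It does not reproduce Lushu, its markup language, or the lattice-based generalization of tokens: the pattern is a hand-written regular expression, and the names `treat_events` and `IPV4` and the sample log lines are assumptions for illustration.

```python
import re
from typing import Callable, Iterable, Iterator


def treat_events(
    chunks: Iterable[str],
    pattern: re.Pattern[str],
    handler: Callable[[str], str],
) -> Iterator[str]:
    """Lazily yield each chunk with every match of `pattern` replaced by handler(match)."""
    for chunk in chunks:
        yield pattern.sub(lambda m: handler(m.group(0)), chunk)


# Hypothetical event: anything shaped like a dotted IPv4 address.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

log_lines = [
    "login from 192.168.0.7 ok",
    "no address here",
    "ping 10.0.0.1 and 10.0.0.2",
]
redacted = list(treat_events(log_lines, IPV4, lambda text: "<ip>"))
```

Because `treat_events` is a generator, it can consume an unbounded stream one chunk at a time. A match that straddles two chunks would be missed by this sketch, which is one reason a real system needs more careful buffering.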
ClangOz: Parallel constant evaluation of C++ map and reduce operations
IF 1.7; Zone 3 (Computer Science); Q3 (Computer Science, Software Engineering); Pub Date: 2024-10-10; DOI: 10.1016/j.cola.2024.101298
Paul Keir, Andrew Gozillon
Interest in metaprogramming, reflection, and compile-time evaluation continues to inspire and foster innovation among the users and designers of the C++ programming language. Regrettably, the impact on compile-times of such features can be significant; and outside of build systems, multi-core parallelism is unable to bring down compilation times of individual translation units. We present ClangOz, a novel Clang-based research compiler that addresses this issue by evaluating annotated constant expressions in parallel, thereby reducing compilation times. Prior benchmarks analyzed parallel map operations, but were unable to consider reduction operations. Thus we also introduce parallel reduction functionality, alongside two additional benchmark programs.
Journal of Computer Languages, vol. 81, Article 101298
Citations: 0
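The ClangOz abstract above concerns parallel map and reduce operations evaluated at compile time in C++. The sketch below only illustrates the general chunking scheme behind a parallel map and reduce, at run time and in Python, so it says nothing about how ClangOz itself is built. The helper names are hypothetical, the input must be non-empty, and the combining operator is assumed to be associative so that partial results can be folded in any grouping.

```python
import operator
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def chunked(items: list, n_chunks: int) -> list[list]:
    """Split items into at most n_chunks contiguous slices of near-equal size."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_map_reduce(fn, op, items: list, workers: int = 4):
    """Map fn over items and fold the results with the associative operator op."""

    def work(chunk: list):
        # Each task maps and reduces its own slice independently.
        return reduce(op, map(fn, chunk))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(work, chunked(items, workers)))
    # Partial results are combined in chunk order.
    return reduce(op, partials)


total = parallel_map_reduce(lambda x: x * x, operator.add, list(range(1, 101)))
```

The map step parallelizes trivially because elements are independent. The reduce step needs the extra pass over `partials`, which matches the abstract's point that reduction requires functionality beyond parallel map.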
MoTion: A new declarative object matching approach in Pharo
IF 1.7; Zone 3 (Computer Science); Q3 (Computer Science, Software Engineering); Pub Date: 2024-08-23; DOI: 10.1016/j.cola.2024.101290
Aless Hosry, Vincent Aranega, Nicolas Anquetil

Pattern matching is an expressive way of matching data and extracting pieces of information from it. The recent inclusion of pattern matching in the Java and Python languages highlights that developers increasingly adopt such a facility for everyday development. Other mainstream programming languages (Rust, Scala, Haskell, and OCaml) also offer pattern matching as part of the language, with different degrees of expressivity in what can be matched. In the meantime, in graphs, pattern matching takes a slightly different turn; it enhances the expressivity of the patterns that can be defined. Smalltalk currently offers little pattern matching capability to find specific objects inside a large graph of objects using a declarative pattern. In Pharo, the closest existing library to classical pattern matching is the RBParseTreeSearcher, which allows expressing specialized patterns over a Pharo Abstract Syntax Tree to find some inner node. The question arises of what features a flexible pattern matching language should have. In this paper, we review the features found in different existing pattern matching languages, both in General Purpose Languages (like Java) and in declarative graph pattern matching languages. We then describe MoTion, a new pattern matching engine for Pharo Smalltalk, combining all these features. We discuss some aspects of MoTion’s implementation and illustrate its use with real case examples.

Journal of Computer Languages, vol. 81, Article 101290
Citations: 0
An empirical study on divergence of differently-sourced LLVM IRs
IF 1.7; Zone 3 (Computer Science); Q3 (Computer Science, Software Engineering); Pub Date: 2024-08-05; DOI: 10.1016/j.cola.2024.101289
Zhenzhou Tian, Yuchen Gong, Chenhao Chang, Jiaze Sun, Yanping Chen, Lingwei Chen

In solving binary code similarity detection, many approaches choose to operate on certain unified intermediate representations (IRs), such as Low Level Virtual Machine (LLVM) IR, to overcome the cross-architecture analysis challenge induced by the significant morphological and syntactic gaps across the diverse instruction set architectures (ISAs). However, the LLVM IRs of the same program can be affected by diverse factors, such as the acquisition source, i.e., compiled from source code or disassembled and lifted from binary code. While the impact of compilation settings on binary code has been explored, the specific differences between LLVM IRs from varied sources remain underexamined. To this end, we pioneer an in-depth empirical study to assess the discrepancies in LLVM IRs derived from different sources. Correspondingly, an extensive dataset containing nearly 98 million LLVM IR instructions distributed in 808,431 functions is curated with respect to these potential IR-influential factors. On this basis, three types of code metrics detailing the syntactic, structural, and semantic aspects of the IR samples are devised and leveraged to assess the divergence of the IRs across different origins. The findings offer insights into how and to what extent the various factors affect the IRs, providing valuable guidance for assembling a training corpus aimed at developing robust LLVM IR-oriented pre-training models, as well as facilitating relevant program analysis studies that operate on the LLVM IRs.

Citations: 0
Fault localization by abstract interpretation and its applications
IF 1.7 Zone 3 Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-08-01 DOI: 10.1016/j.cola.2024.101288
Aleksandar S. Dimovski

Fault localization aims to automatically identify the cause of an error in a program by localizing the error to a relatively small part of the program. In this paper, we present a novel technique for automated fault localization via error invariants inferred by abstract interpretation. An error invariant for a location in an error program over-approximates the reachable states at the given location that may produce the error, if the execution of the program is continued from that location. Error invariants can be used for statement-wise semantic slicing of error programs and for obtaining concise error explanations. We use an iterative refinement sequence of backward–forward static analyses by abstract interpretation to compute error invariants, which are designed to explain why an error program violates a particular assertion.

Furthermore, we present a practical application of the fault localization technique for automatic repair of programs. Given an erroneous program, we first use the fault localization to automatically identify statements relevant for the error, and then repeatedly mutate the expressions in those relevant statements until a correct program that satisfies all assertions is found. All other statements classified by the fault localization as irrelevant for the error are not mutated in the program repair process. This way, we significantly reduce the search space of mutated programs without losing any potentially correct program, and so locate a repaired program much faster than a program repair without fault localization.

We have developed a prototype tool for automatic fault localization and repair of C programs. We demonstrate the effectiveness of our approach to localize errors in realistic C programs, and to subsequently repair them. Moreover, we show that our approach based on combining fault localization and code mutations is significantly faster than the previous program repair approach without fault localization.
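
The repair loop described in this abstract (mutate only the statements that fault localization marks as relevant, and stop at the first program that satisfies the assertion) can be illustrated with a small Python sketch. This is not the authors' tool. It assumes a toy straight-line language of `(target, (op, left, right))` assignments, takes the set of relevant statement indices as given rather than computing it from error invariants, and checks the assertion on a few concrete inputs instead of verifying it statically. The names `run`, `mutants`, and `repair` are hypothetical.

```python
# Toy mutation-based repair restricted to "relevant" statements.
# Assumption: `relevant` stands in for the output of fault localization.

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def run(program, env):
    """Execute a list of (target, (op, left, right)) assignments on a copy of env."""
    env = dict(env)
    for target, (op, left, right) in program:
        a = env[left] if isinstance(left, str) else left
        b = env[right] if isinstance(right, str) else right
        env[target] = OPS[op](a, b)
    return env

def mutants(expr):
    """Yield single-edit variants of an expression: swap the operator or nudge a constant."""
    op, left, right = expr
    for other in OPS:
        if other != op:
            yield (other, left, right)
    for delta in (-1, 1):
        if isinstance(left, int):
            yield (op, left + delta, right)
        if isinstance(right, int):
            yield (op, left, right + delta)

def repair(program, relevant, inputs, assertion):
    """Mutate one statement at a time, only at indices in `relevant`.

    Returns (repaired_program, candidates_tried), or (None, candidates_tried)
    if no single mutation satisfies the assertion on every input.
    """
    tried = 0
    for i in sorted(relevant):
        target, expr = program[i]
        for m in mutants(expr):
            tried += 1
            candidate = program[:i] + [(target, m)] + program[i + 1:]
            if all(assertion(env, run(candidate, env)) for env in inputs):
                return candidate, tried
    return None, tried

# Intended behaviour: y == 2 * x + 1. Statement 2 has the bug ("-" instead of "+");
# statement 1 does not influence y, so it is treated as irrelevant.
program = [
    ("t", ("*", "x", 2)),
    ("u", ("+", "x", 0)),
    ("y", ("-", "t", 1)),
]
inputs = [{"x": 0}, {"x": 1}, {"x": 5}]
spec = lambda before, after: after["y"] == 2 * before["x"] + 1

fixed, tried = repair(program, {0, 2}, inputs, spec)
print(fixed[2], tried)
```

Passing `{0, 1, 2}` instead of `{0, 2}` finds the same repair but evaluates more candidates, which is the search-space reduction the abstract refers to, shown here only at toy scale.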

Citations: 0