
Journal of Computer Languages — Latest Publications

Near-Pruned single assignment transformation of programs
IF 1.7 CAS Tier 3 (Computer Science) Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-02-05 DOI: 10.1016/j.cola.2025.101324
Akshay M. Fajge, Raju Halder
This paper introduces Near-Pruned SSA, a novel variant of the SSA form that attains precision close to the Pruned version while prioritizing its efficient generation without the need for costly data flow analysis. This is realized by leveraging variables’ usage information within the program’s augmented CFG. Furthermore, we propose a direct method for generating DSA form of programs that bypasses the traditional process of ϕ-node destruction into its immediate predecessor-blocks, thereby streamlining the process. Experimental evaluation on a range of Solidity programs, including real-world smart contracts deployed on the Ethereum mainnet, demonstrates that our method outperforms existing SSA variants, except for the Pruned version, by minimizing the number of introduced ϕ-statements compared to state-of-the-art techniques. In particular, the proposed Near-Pruned variant demonstrates a computational cost that is approximately one-third of that of the Pruned variant while achieving a nearly 92% reduction in the introduction of additional statements compared to the Semi-Pruned variant.
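The trade-off the abstract describes, inserting ϕ-functions only where a cheap usage test says the variable is still wanted, can be sketched roughly as follows. This is an illustrative approximation: the CFG encoding, helper names, and gating rule are assumptions, not the paper's actual algorithm.

```python
def phi_minimal(defs_by_var, join):
    """Minimal SSA: a phi at the join for every variable defined on
    more than one incoming path."""
    return {v for v, defs in defs_by_var.items() if len(defs) > 1}

def phi_usage_gated(defs_by_var, uses_by_var, join, reach):
    """Usage-gated placement: additionally require that the variable is
    used in or after the join, judged from cheap usage information
    rather than a full data-flow (liveness) analysis."""
    def used_onward(v):
        return any(u == join or u in reach[join] for u in uses_by_var[v])
    return {v for v in phi_minimal(defs_by_var, join) if used_onward(v)}

# Diamond CFG: entry -> {then, else} -> join -> exit
reach = {"join": {"exit"}}
defs_by_var = {"x": {"then", "else"}, "t": {"then", "else"}}
uses_by_var = {"x": {"exit"}, "t": {"then", "else"}}  # t is block-local

print(sorted(phi_minimal(defs_by_var, "join")))                          # ['t', 'x']
print(sorted(phi_usage_gated(defs_by_var, uses_by_var, "join", reach)))  # ['x']
```

The gated version drops the ϕ for `t`, which is never used at or beyond the join; this is the kind of statement reduction the abstract quantifies.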
Citations: 0
MLAPW: A framework to assess the impact of feature selection and sampling techniques on anti-pattern prediction using WSDL metrics
IF 1.7 CAS Tier 3 (Computer Science) Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-02-01 DOI: 10.1016/j.cola.2025.101322
Lov Kumar , Vikram Singh , Lalita Bhanu Murthy , Aneesh Krishna , Sanjay Misra
Context: Frequent changes can degrade the quality and design of Service-Based Systems, introducing design flaws known as Anti-patterns. The presence of these Anti-patterns strongly affects the overall maintainability of Service-Based Systems, so early detection of them, together with their co-located modifications, becomes essential. However, finding these anti-patterns manually is not easy.

Objective: The objective of this work is to explore the role of WSDL (Web Services Description Language) metrics for anti-pattern prediction using a Machine Learning (ML) based framework (MLAPW). This framework encompasses different variants of feature selection techniques, data sampling techniques, and a wide range of ML algorithms. This work empirically investigates the predictive ability of anti-pattern prediction models developed using different sets of WSDL metrics. Our major focus is to investigate 'how accurately these metrics predict the different types of Anti-patterns present in the WSDL file'.

Methods: To achieve this objective, different sets of WSDL metrics, such as Structural Quality Metrics, Procedural Quality Metrics, Data Quality Metrics, Quality Metrics, and Complexity Metrics, are used as input for the anti-pattern prediction models. Since these models use WSDL metrics as input, feature selection methods are also used to find the best sets of WSDL metrics. The models are trained using various machine-learning techniques, and their performance when trained on balanced data obtained with data sampling techniques is also reported. Finally, the techniques are investigated empirically using accuracy and the area under the ROC (receiver operating characteristic) curve (AUC), with hypothesis testing.

Results: The empirical study is based on 226 WSDL files from domains such as finance, tourism, health, and education. The assessment shows that models trained using WSDL metrics have a mean AUC of 0.79 and a median AUC of 0.90. Models trained on features selected with classifier feature subset selection (CFS) do better, with a mean AUC of 0.80 and a median AUC of 0.97. The experimental results also confirm that models trained on up-sampled data (UPSAM) achieve a mean AUC of 0.79 and a median AUC of 0.91 with a low Friedman rank of 2.40. Finally, models trained using the least-squares support vector machine (LSSVM) achieved a median AUC of 1, a mean AUC of 0.99, and a low Friedman rank of 1.30.

Conclusion: The experimental results show that the AUC values of models trained using Data and Procedural Quality Metrics are high compared to the other sets of metrics. However, the models improved significantly in prediction performance after employing feature selection techniques. The experimental result
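The summary statistics quoted in the results (mean/median AUC and Friedman ranks) can be reproduced for any technique-by-dataset AUC table with stdlib Python. The numbers below are invented placeholders, not the paper's data.

```python
from statistics import mean, median

def friedman_ranks(auc):
    """auc: {technique: [AUC per dataset]}; lower average rank = better.
    Tied techniques receive the average of the tied rank positions."""
    techs = list(auc)
    n_datasets = len(next(iter(auc.values())))
    totals = {t: 0.0 for t in techs}
    for i in range(n_datasets):
        ordered = sorted(techs, key=lambda t: -auc[t][i])  # best first
        j = 0
        while j < len(ordered):
            k = j
            while k + 1 < len(ordered) and auc[ordered[k + 1]][i] == auc[ordered[j]][i]:
                k += 1
            avg_rank = (j + k) / 2 + 1  # ranks are 1-based
            for t in ordered[j:k + 1]:
                totals[t] += avg_rank
            j = k + 1
    return {t: totals[t] / n_datasets for t in techs}

scores = {"LSSVM": [0.99, 1.00, 0.98], "BASE": [0.80, 0.90, 0.79]}
print(mean(scores["LSSVM"]), median(scores["LSSVM"]))
print(friedman_ranks(scores))  # {'LSSVM': 1.0, 'BASE': 2.0}
```

A technique that wins on every dataset gets Friedman rank 1.0, which is how a value like 1.30 signals near-uniform dominance.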
Citations: 0
Code histories: Documenting development by recording code influences and changes in code
IF 1.7 CAS Tier 3 (Computer Science) Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-12-25 DOI: 10.1016/j.cola.2024.101313
Vo Thien Tri Pham, Caitlin Kelleher
Developers frequently encounter challenges when working with large code bases found in modern software applications, from navigating through files to more complex tasks like understanding code histories, dependencies, and evolutions. While many applications use Version Control Systems (VCSs) to archive present-day programs and provide a historical perspective on code development, the level of detail they offer is often insufficient for in-depth analyses. As a result, it becomes difficult to fully explore the potential benefits of historical data in software development. We introduce an enhanced recording framework that integrates both the Visual Studio Code (VS Code) development environment and the Google Chrome web browser to capture more detailed development activities. Our framework is designed to offer additional recording options, thereby providing researchers with more opportunities to study how different historical resources can be utilized. Through an observational study, we demonstrate the utility of our framework in capturing the complex dynamics of code change activities, highlighting its potential value in both academic and practical contexts.
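A minimal sketch of the recording idea: timestamped events from multiple sources, serialized so later analyses can replay a session. The event schema and source names are my assumptions, not the framework's actual format.

```python
import json
import time
from dataclasses import dataclass, field, asdict

@dataclass
class EditEvent:
    source: str            # e.g. "vscode" or "chrome" (hypothetical labels)
    kind: str              # e.g. "insert", "delete", "page-visit"
    payload: str
    timestamp: float = field(default_factory=time.time)

class Recorder:
    """Collects development events and serializes the session for replay."""
    def __init__(self):
        self.events = []

    def record(self, source, kind, payload):
        self.events.append(EditEvent(source, kind, payload))

    def dump(self):
        return json.dumps([asdict(e) for e in self.events])

rec = Recorder()
rec.record("vscode", "insert", "def f(): ...")
rec.record("chrome", "page-visit", "https://docs.python.org")
print(len(json.loads(rec.dump())))  # 2
```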
Citations: 0
A comprehensive meta-analysis of efficiency and effectiveness in the detection community
IF 1.7 CAS Tier 3 (Computer Science) Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-12-24 DOI: 10.1016/j.cola.2024.101314
Mohamed Amine Daoud , Sid Ahmed Mokhtar Mostefaoui , Abdelkader Ouared , Hadj Madani Meghazi , Bendaoud Mebarek , Abdelkader Bouguessa , Hasan Ahmed
Creating an intrusion detection system (IDS) is a prominent area of research that continuously draws attention from both scholars and practitioners who tirelessly innovate new solutions. The complexity of IDS naturally escalates alongside technological advancements, whether they are manually implemented within security infrastructures or elaborated upon in academic literature. However, accessing and comparing these IDS solutions requires sifting through a multitude of hypotheses presented in research papers, which is a laborious and error-prone endeavor. Consequently, many researchers encounter difficulties in replicating results or reanalyzing published IDSs. This challenge primarily arises due to the absence of a standardized process for elucidating IDS methodologies. In response, this paper advocates for a framework aimed at enhancing the reproducibility of IDS outcomes, thereby enabling their seamless reuse across diverse cybersecurity contexts, benefiting both end-users and experts alike. The proposed framework introduces a descriptive language for the precise specification of IDS descriptions. Additionally, a model repository facilitates the sharing and reusability of IDS configurations. Lastly, through a case study, we showcase the effectiveness of our framework in addressing challenges associated with data acquisition and knowledge organization and sharing. Our results demonstrate satisfactory prediction accuracy for configuration reuse and precise identification of reusable components.
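The "descriptive language for the precise specification of IDS descriptions" can be sketched as structured specs plus a validator; the schema fields below are hypothetical illustrations, not the paper's language.

```python
# Required fields and their types for a (hypothetical) IDS description.
REQUIRED = {"name": str, "detection_method": str, "dataset": str, "metrics": dict}

def validate(spec):
    """Return a list of problems; an empty list means the spec is usable
    (and hence shareable/reusable from a model repository)."""
    errors = []
    for key, typ in REQUIRED.items():
        if key not in spec:
            errors.append(f"missing field: {key}")
        elif not isinstance(spec[key], typ):
            errors.append(f"{key}: expected {typ.__name__}")
    return errors

spec = {"name": "ids-cnn-1", "detection_method": "anomaly",
        "dataset": "NSL-KDD", "metrics": {"f1": 0.93}}
print(validate(spec))  # []
```

Machine-checkable descriptions like this are what make replication and reanalysis of published IDSs tractable, which is the gap the paper targets.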
Citations: 0
MTable: Visual query interface for browsing and navigation in NoSQL data stores
IF 1.7 CAS Tier 3 (Computer Science) Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-12-05 DOI: 10.1016/j.cola.2024.101312
Kanika Soni, Shelly Sachdeva
Almost all human endeavors in the era of the digital revolution, from commercial and industrial processes to scientific and medical research, depend on the use of ever-increasing amounts of data. However, the enormous volume and complexity of this data make exploration and querying challenging even for experts, which makes the demand for easy access to data, even for naive users, all the more evident. Considering this, the database community has shifted toward NoSQL data stores. While there has been much study on query formulation assistance for NoSQL data stores, many users still want help when specifying complex queries (such as aggregation pipeline queries), which require an in-depth understanding of the data storage architecture of a specific NoSQL data store. To help users perform interactive browsing and navigation in NoSQL data stores (MongoDB), this paper proposes a novel, simple, and user-friendly interface, MTable, that provides users with a presentation-level interactive view. This view compactly presents the query results from multiple embedded documents within a single tabular format, whereas MongoDB's find operation always returns the main document. Certain cells of the MTable contain clickable hyperlinks that let users interact directly with the data persisted in the document stores. This helps users incrementally construct complex queries and navigate the document stores without the tedious task of writing complex queries. In a user study, participants performed various querying tasks faster with MTable than with the traditional querying mechanism.
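The presentation-level idea can be approximated by a document flattener that renders embedded documents as dotted table columns and arrays as link-like cells. This is a simplification of the concept for illustration, not MTable's code.

```python
def flatten(doc, prefix=""):
    """Flatten a nested document into one tabular row: embedded documents
    become dotted column names; lists become placeholder cells that a UI
    could render as clickable hyperlinks into the embedded documents."""
    row = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, path + "."))
        elif isinstance(value, list):
            row[path] = f"[{len(value)} embedded docs]"
        else:
            row[path] = value
    return row

order = {"_id": 7, "customer": {"name": "Ada", "city": "Delhi"},
         "items": [{"sku": "a"}, {"sku": "b"}]}
print(flatten(order))
# {'_id': 7, 'customer.name': 'Ada', 'customer.city': 'Delhi',
#  'items': '[2 embedded docs]'}
```

Contrast with MongoDB's find(), which would return the whole nested main document and leave the tabular presentation to the caller.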
Citations: 0
Mental stress analysis by measuring heart rate variability during learning programming: Comparison of visual- and text-based languages
IF 1.7 CAS Tier 3 (Computer Science) Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-12-03 DOI: 10.1016/j.cola.2024.101311
Katsuyuki Umezawa , Takumi Koshikawa , Makoto Nakazawa , Shigeichi Hirasawa
Visual-based programming languages that facilitate block-based coding have gained popularity as introductory methods for learning programming. Conversely, programming experts typically use text-based programming languages like C and Java. Nevertheless, a seamless method for transitioning from a visual- to a text-based language has yet to be developed. Therefore, our research project aims to develop a methodology that facilitates this transition by bridging the gap between the two languages and verifying the variations in the biometric information of learners of both languages. In this study, we measured the participants' heart rate variability (HRV) and evaluated variations in mental stress experienced while learning visual- and text-based languages. The experimental results confirmed that participants proficient in text-based languages experienced lower HRV (indicating higher stress levels) when learning visual-based languages. Conversely, those with low proficiency in text-based languages exhibited higher HRV (indicating more favorable stress levels) while learning text-based languages. This study successfully observed differences in stress levels while learning both language types using experimental methods. These findings serve as a preliminary step toward clarifying the impact of stress experienced during learning on learning outcomes and identifying the factors that constitute beneficial stress. This study establishes a foundation for an intermediate language that can ease transitions between the two types of languages.
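Two standard time-domain HRV indices of the kind such studies report, SDNN and RMSSD, can be computed from RR intervals with stdlib math. The sample intervals are invented, and the abstract does not state which specific indices the authors used.

```python
import math

def sdnn(rr_ms):
    """Sample standard deviation of RR intervals, in ms (SDNN)."""
    m = sum(rr_ms) / len(rr_ms)
    return math.sqrt(sum((x - m) ** 2 for x in rr_ms) / (len(rr_ms) - 1))

def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

rr = [812, 790, 805, 830, 798]  # invented RR intervals in milliseconds
print(round(sdnn(rr), 1), round(rmssd(rr), 1))
```

Lower values of such indices correspond to the "lower HRV (higher stress)" readings the abstract describes.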
Citations: 0
Combining type inference techniques for semi-automatic UML generation from Pharo code
IF 1.7 CAS Tier 3 (Computer Science) Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-11-14 DOI: 10.1016/j.cola.2024.101300
Jan Blizničenko, Robert Pergl
This paper explores how to reconstruct UML diagrams from dynamically typed languages such as Smalltalk, which do not use explicit type information. This lack of information makes traditional methods for extracting associations difficult. It addresses the need for automated techniques, particularly in legacy software systems, to facilitate their transformation into modern technologies, focusing on Smalltalk as a case study due to its extensive industrial legacy and modern adaptations like Pharo. We propose a way to create UML diagrams from Smalltalk code, focusing on using type inference to determine UML associations. For optimal outcomes for large-scale software systems, we recommend combining different type inference methods in an automatic or semi-automatic way.
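The combination idea can be sketched as voting across the candidates produced by different inference heuristics; the heuristic names and Smalltalk class names below are illustrative, not from the paper.

```python
from collections import Counter

def combine(candidates_per_technique):
    """candidates_per_technique: list of {variable: inferred class} dicts,
    one per type-inference heuristic. Returns the majority vote per
    variable, from which UML associations could then be drawn."""
    votes = {}
    for inferred in candidates_per_technique:
        for var, cls in inferred.items():
            votes.setdefault(var, Counter())[cls] += 1
    return {var: c.most_common(1)[0][0] for var, c in votes.items()}

# Hypothetical outputs of two heuristics for an instance variable:
by_assignment = {"orders": "OrderedCollection", "owner": "Person"}
by_message_sends = {"orders": "OrderedCollection", "owner": "Company"}
print(combine([by_assignment, by_message_sends])["orders"])  # OrderedCollection
```

Where heuristics disagree (as for `owner` above), a semi-automatic workflow would surface the conflict to the user instead of deciding silently.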
Citations: 0
An efficient instance selection algorithm for fast training of support vector machine for cross-project software defect prediction pairs
IF 1.7 CAS Tier 3 (Computer Science) Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2024-10-23 DOI: 10.1016/j.cola.2024.101301
Manpreet Singh, Jitender Kumar Chhabra
SVM is limited in its use for cross-project software defect prediction because of its very slow training process. This research article therefore proposes a new instance selection (IS) algorithm, called boundary detection among classes (BDAC), to reduce the training dataset size for faster training of SVM without degrading prediction performance. The proposed algorithm is evaluated against six existing IS algorithms on accuracy, running time, data reduction rate, etc., using 23 general datasets, 18 software defect prediction datasets, and two shape-based datasets, and the results show that BDAC is better than the selected algorithms under collective comparison.
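A rough approximation of the boundary-detection idea (my sketch, not the paper's BDAC algorithm): keep only instances whose nearest neighbour belongs to a different class, i.e. points near the decision boundary, which is the region an SVM's support vectors come from; interior points can be dropped to shrink the training set.

```python
import math

def nearest(i, X):
    """Index of the nearest other point to X[i] (Euclidean distance)."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: math.dist(X[i], X[j]))

def select_boundary(X, y):
    """Keep indices whose nearest neighbour has a different label."""
    return [i for i in range(len(X)) if y[nearest(i, X)] != y[i]]

X = [(0, 0), (0, 1), (1, 0),      # class-0 cluster interior
     (5, 5), (5, 6), (4, 5),      # class-1 cluster interior
     (2.4, 2.4), (2.6, 2.6)]      # two points near the boundary
y = [0, 0, 0, 1, 1, 1, 0, 1]
kept = select_boundary(X, y)
print(kept)  # [6, 7]
```

Here the 8-point training set shrinks to the 2 boundary points, illustrating the data-reduction-rate metric the evaluation uses.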
Citations: 0
Detection and treatment of string events in the limit
IF 1.7 CAS Tier 3, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date : 2024-10-21 DOI: 10.1016/j.cola.2024.101299
Alex Holmquist, Vitor Emanuel, Fernando C. Alves, Fernando Magno Quintão Pereira
A string event is a pattern that occurs in a stream of characters. The need to detect and handle string events in infinite texts emerges in many scenarios, including online treatment of logs, web crawling, and syntax highlighting. This paper describes a technique to specify and treat string events. Users determine patterns of interest via a markup language. From such examples, tokens are generalized via a semi-lattice of regular expressions. Such tokens are combined into a context-free language that recognizes patterns in the text stream. These techniques are implemented in a text processing system called Lushu, which runs on the Java Virtual Machine (JVM). Lushu intercepts strings emitted by the JVM. Once patterns are detected, it invokes a user-specified action handler. As a proof of concept, this paper shows that Lushu outperforms state-of-the-art parsers and parser generators, such as Comby, BeautifulSoup4 and ZheFuscator, in terms of memory consumption and running time.
Citations: 0
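The token generalization via a semi-lattice of regular expressions can be illustrated with a toy join operation. The lattice below (digits and lowercase letters joining up to `\w`, then to `.`) and the per-character generalization of two equal-length example tokens are assumptions for illustration, not Lushu's actual implementation:

```python
import re

# Toy semi-lattice of character classes: each class lists its ancestors
# from most to least specific. join(a, b) is the least upper bound.
ANCESTORS = {
    r"\d":    [r"\d", r"\w", "."],
    r"[a-z]": [r"[a-z]", r"\w", "."],
    r"\w":    [r"\w", "."],
    ".":      ["."],
}

def cls(ch):
    """Most specific class of a single character."""
    if ch.isdigit():  return r"\d"
    if ch.islower():  return r"[a-z]"
    return r"\w" if (ch.isalnum() or ch == "_") else "."

def join(a, b):
    # First ancestor of a that is also an ancestor of b.
    return next(c for c in ANCESTORS[a] if c in ANCESTORS[b])

def generalize(s, t):
    """Generalize two equal-length example tokens into one regex:
    keep characters that agree, join the classes of those that differ."""
    assert len(s) == len(t)  # sketch: same-length tokens only
    return "".join(re.escape(x) if x == y else join(cls(x), cls(y))
                   for x, y in zip(s, t))

pat = generalize("ERR-042", "ERR-913")  # matches any "ERR-" followed by three digits
```

Here the fixed prefix `ERR-` survives literally while the differing digit positions are generalized to `\d`, so the learned pattern accepts unseen tokens such as `ERR-555` but still rejects `ERR-abc`.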
ClangOz: Parallel constant evaluation of C++ map and reduce operations
IF 1.7 CAS Tier 3, Computer Science Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date : 2024-10-10 DOI: 10.1016/j.cola.2024.101298
Paul Keir, Andrew Gozillon
Interest in metaprogramming, reflection, and compile-time evaluation continues to inspire and foster innovation among the users and designers of the C++ programming language. Regrettably, the impact on compile-times of such features can be significant; and outside of build systems, multi-core parallelism is unable to bring down compilation times of individual translation units. We present ClangOz, a novel Clang-based research compiler that addresses this issue by evaluating annotated constant expressions in parallel, thereby reducing compilation times. Prior benchmarks analyzed parallel map operations, but were unable to consider reduction operations. Thus we also introduce parallel reduction functionality, alongside two additional benchmark programs.
Citations: 0
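ClangOz itself works inside the C++ compiler on annotated constant expressions, but the map-reduce pattern it parallelizes can be sketched in a language-neutral way. The snippet below is only an analogy (run-time Python, not compile-time C++): split the input into chunks, map and reduce each chunk in a worker, then combine the partial results of an associative operator:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator

def parallel_map_reduce(fn, op, data, identity, workers=4):
    """Run-time analogue of the pattern ClangOz evaluates at C++ compile
    time: map fn over chunks in parallel, reduce each chunk, then fold
    the partial results. `op` must be associative for this to be valid."""
    chunk = max(1, len(data) // workers)
    chunks = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    def work(part):
        return reduce(op, map(fn, part), identity)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(work, chunks)  # each chunk reduced concurrently
    return reduce(op, partials, identity)

# Sum of squares of 0..9, evaluated chunk-wise: 285.
total = parallel_map_reduce(lambda x: x * x, operator.add, list(range(10)), 0)
```

The chunk-then-combine step is exactly why associativity is required: the final fold reorders the grouping of `op` applications relative to a sequential reduction.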
Journal: Journal of Computer Languages