Pub Date : 2024-01-04 DOI: 10.1016/j.cola.2023.101257
Hamzeh Eyal Salman
Issues are highly prevalent on GitHub due to the increasing scale of its software repositories. They are submitted to the issue tracking system for several reasons: reporting a bug, asking a question, or requesting other maintenance activities. Popular repositories on GitHub receive a large number of issues daily. Assigning similar issues individually to different developers for validation and fixing slows the fixing process and introduces inconsistencies, because the developers work on them independently and asynchronously. Grouping similar issues into clusters and assigning each cluster to a single appropriate developer or team, by contrast, speeds up the fixing process. This paper proposes a machine-learning-based approach that supports issue management on GitHub by grouping similar issues together. To validate it, the approach was applied to 13 software components from different large repositories. The findings show that the approach identifies clusters of similar issues with promising results under the evaluation measures widely used in this area: Precision, Recall, and F-measure.
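The abstract does not say how Precision, Recall, and F-measure are computed for clusters. One common convention, assumed here purely for illustration, is to count pairs of issues: a pair is a true positive when both the predicted and the reference clustering put the two issues together. The function name `pairwise_scores` and the toy labels are hypothetical.

```python
from itertools import combinations


def pairwise_scores(predicted, reference):
    """Pairwise precision, recall and F-measure of a clustering.

    predicted / reference: dict mapping issue id -> cluster label.
    Assumption: both dicts cover the same issue ids.
    """
    tp = fp = fn = 0
    for a, b in combinations(sorted(predicted), 2):
        same_pred = predicted[a] == predicted[b]
        same_ref = reference[a] == reference[b]
        if same_pred and same_ref:
            tp += 1          # grouped together, and should be
        elif same_pred:
            fp += 1          # grouped together, but should not be
        elif same_ref:
            fn += 1          # should be together, but were split
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy example: four issues, predicted clusters vs. a hand-made reference.
predicted = {1: "a", 2: "a", 3: "a", 4: "b"}
reference = {1: "x", 2: "x", 3: "y", 4: "y"}
precision, recall, f_measure = pairwise_scores(predicted, reference)
```

In the toy example only the pair (1, 2) is correctly grouped, two pairs are wrongly merged, and one pair is wrongly split, so precision is 1/3 and recall is 1/2.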
{"title":"AI-based clustering of similar issues in GitHub’s repositories","authors":"Hamzeh Eyal Salman","doi":"10.1016/j.cola.2023.101257","DOIUrl":"10.1016/j.cola.2023.101257","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"78","pages":"Article 101257"},"PeriodicalIF":2.2,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139095633","RegionNum":3,"RegionCategory":"Computer Science"}
Pub Date : 2023-12-18 DOI: 10.1016/j.cola.2023.101256
Claudio Di Sipio, Juri Di Rocco, Davide Di Ruscio, Phuong T. Nguyen
To facilitate the development of recommender systems for software engineering (RSSEs), this paper introduces LEV4REC, a model-driven approach supporting all RSSE development stages, from design to deployment. It enables parameter fine-tuning, enhancing the developer and user experience by using a dedicated feature model for early configuration. We evaluated LEV4REC by applying it to two existing RSSEs based on different algorithms.
Results demonstrate its ability to recreate suitable recommendations and outperform a state-of-the-art approach. Qualitative findings from a focus group study further validate LEV4REC’s effectiveness, while indicating the need for extension points to support additional systems.
{"title":"LEV4REC: A feature-based approach to engineering RSSEs","authors":"Claudio Di Sipio, Juri Di Rocco, Davide Di Ruscio, Phuong T. Nguyen","doi":"10.1016/j.cola.2023.101256","DOIUrl":"10.1016/j.cola.2023.101256","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"78","pages":"Article 101256"},"PeriodicalIF":2.2,"publicationDate":"2023-12-18","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"138741160","RegionNum":3,"RegionCategory":"Computer Science"}
Test amplification exploits the knowledge embedded in an existing test suite to strengthen it. A typical test amplification technique transforms the initial tests into additional test methods that increase the mutation coverage. Although past research demonstrated the benefits, additional steps need to be taken to incorporate test amplifiers in the everyday workflow of developers. This paper describes a proof-of-concept bot integrating Small-Amp with GitHub-Actions. The bot decides for itself which tests to amplify and does so within a limited time budget. To integrate the bot into the GitHub-Actions workflow, we incorporate three special-purpose features: (i) prioritization (to fit the process within a given time budget), (ii) sharding (to split lengthy tests into smaller chunks), and (iii) sandboxing (to make the amplifier crash-resilient). We evaluate our approach by installing the proof-of-concept extension of Small-Amp on five open-source projects deployed on GitHub. Our results show that a test amplification bot is feasible at a project level by integrating it into the build system. Moreover, we quantify the impact of prioritization, sharding, and sandboxing so that other test amplifiers may benefit from these special-purpose features. Our proof-of-concept demonstrates that the entry barrier for adopting test amplification can be significantly lowered.
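The three special-purpose features can be pictured with a small scheduling sketch. It is not the Small-Amp bot's implementation: the function names, the per-method cost estimate, and the use of `try`/`except` as a stand-in for sandboxing are all assumptions made for illustration (a real sandbox would presumably isolate crashes of the whole process, which an exception handler cannot do).

```python
def shard(methods, size):
    """Sharding: split a long list of test methods into chunks of at most `size`."""
    return [methods[i:i + size] for i in range(0, len(methods), size)]


def plan(test_classes, budget_s, shard_size, cost_per_method_s):
    """Prioritization: schedule shards, highest priority first, within the budget.

    test_classes: list of (class_name, priority, [method names]).
    cost_per_method_s is a hypothetical flat estimate of amplification time.
    """
    ordered = sorted(test_classes, key=lambda t: t[1], reverse=True)
    jobs, remaining = [], budget_s
    for name, _priority, methods in ordered:
        for chunk in shard(methods, shard_size):
            cost = len(chunk) * cost_per_method_s
            if cost > remaining:
                return jobs      # budget exhausted: stop scheduling
            jobs.append((name, chunk))
            remaining -= cost
    return jobs


def run_sandboxed(jobs, amplify):
    """Sandboxing (simplified): one failing shard must not abort the others."""
    results = []
    for name, chunk in jobs:
        try:
            results.append((name, chunk, amplify(name, chunk)))
        except Exception as exc:
            results.append((name, chunk, f"crashed: {exc}"))
    return results


classes = [("ATest", 1, ["a1", "a2", "a3"]), ("BTest", 5, ["b1", "b2", "b3"])]
jobs = plan(classes, budget_s=50, shard_size=2, cost_per_method_s=10)
```

With a 50-second budget, both shards of the higher-priority `BTest` and the first shard of `ATest` are scheduled; the last `ATest` shard no longer fits.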
{"title":"A test amplification bot for Pharo/Smalltalk","authors":"Mehrdad Abdi, Henrique Rocha, Alexandre Bergel, Serge Demeyer","doi":"10.1016/j.cola.2023.101255","DOIUrl":"10.1016/j.cola.2023.101255","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"78","pages":"Article 101255"},"PeriodicalIF":2.2,"publicationDate":"2023-12-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"138565824","RegionNum":3,"RegionCategory":"Computer Science"}
Pub Date : 2023-12-02 DOI: 10.1016/j.cola.2023.101253
Manpreet Singh, Jitender Kumar Chhabra
Many code metrics exist for bug prediction. However, these metrics are based on trivial counts of code properties and are not sufficient. This research article proposes three new code metrics based on class complexity, coupling, and cohesion to fill the gap. The complexity, coupling, and cohesion metrics of the Promise repository metrics suite are replaced by the proposed metrics to produce a new metrics suite. Experiments show that the proposed suite gives more than a 2% improvement in AUC and precision and approximately 1.5% in F1-score and recall, with fewer code metrics than the existing suite.
{"title":"Improved software fault prediction using new code metrics and machine learning algorithms","authors":"Manpreet Singh, Jitender Kumar Chhabra","doi":"10.1016/j.cola.2023.101253","DOIUrl":"10.1016/j.cola.2023.101253","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"78","pages":"Article 101253"},"PeriodicalIF":2.2,"publicationDate":"2023-12-02","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"138508597","RegionNum":3,"RegionCategory":"Computer Science"}
Pub Date : 2023-11-30 DOI: 10.1016/j.cola.2023.101254
Felicien Ihirwe, Davide Di Ruscio, Simone Gianfranceschi, Alfonso Pierantonio
Context:
The current technology revolution, which places the highest value on people’s welfare, is frequently seen as being mainly supported by Internet of Things (IoT) technologies. IoT is regarded as a powerful multi-layered network of systems that integrates several heterogeneous, independently networked (sub-)systems working together to achieve a shared purpose.
Objective:
In this article, we present CHESSIoT, a model-driven engineering environment that integrates high-level visual design languages, software development, safety analysis, and deployment approaches for engineering multi-layered IoT systems. With CHESSIoT, users may conduct different engineering tasks on system and software models under development to enable earlier decision-making and take prospective measures, all supported by a unique environment.
Methodology:
This is achieved through multi-staged designs, most notably the physical, functional, and deployment architectures. The physical model specification is used to perform both qualitative and quantitative safety analysis by employing logical Fault-Tree (FT) models. The functional model specifies the system’s functional behavior and is later used to generate platform-specific code that can be deployed on low-level IoT device nodes. Additionally, the framework supports modeling the system’s deployment plan and run-time service provisioning, which are ultimately transformed into deployment configuration artifacts ready for execution on remote servers.
Results:
To showcase the effectiveness of our proposed approach, as well as the capability of the supporting tool, a multi-layered Home Automation System (HAS) scenario has been developed, covering all its design, development, analysis, and deployment aspects. Furthermore, we present the results of different evaluation mechanisms, which include a comparative analysis and a qualitative assessment. The evaluation mechanisms mainly target the completeness of CHESSIoT by addressing specific research questions.
{"title":"CHESSIoT: A model-driven approach for engineering multi-layered IoT systems","authors":"Felicien Ihirwe, Davide Di Ruscio, Simone Gianfranceschi, Alfonso Pierantonio","doi":"10.1016/j.cola.2023.101254","DOIUrl":"10.1016/j.cola.2023.101254","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"78","pages":"Article 101254"},"PeriodicalIF":2.2,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"138508567","RegionNum":3,"RegionCategory":"Computer Science"}
Pub Date : 2023-11-23 DOI: 10.1016/j.cola.2023.101242
Xu Zhu, Miguel A. Nacenta, Özgür Akgün, Daniel Zenkovitch
Discrete constraint problems surface often in everyday life. Teachers might group students with complex considerations and hospital administrators need to produce staff rosters. Constraint programming (CP) provides techniques to efficiently find solutions. However, there remains a key challenge: these techniques are still largely inaccessible because expressing constraint problems requires sophisticated programming and logic skills. In this work we contribute a language and tool that leverage knowledge of how non-experts conceptualize problems to facilitate the expression of constraint models. Additionally, we report the results of a study surveying the advantages and remaining challenges towards making CP accessible to the wider public.
{"title":"Solvi: A visual constraint modeling tool","authors":"Xu Zhu, Miguel A. Nacenta, Özgür Akgün, Daniel Zenkovitch","doi":"10.1016/j.cola.2023.101242","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101242","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"78","pages":"Article 101242"},"PeriodicalIF":2.2,"publicationDate":"2023-11-23","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590118423000527/pdfft?md5=42dcd60e8822ed624ec930252ba9fd7e&pid=1-s2.0-S2590118423000527-main.pdf","platform":"Semanticscholar","paperid":"138435915","RegionNum":3,"RegionCategory":"Computer Science","PMCID":"OA"}
Pub Date : 2023-11-23 DOI: 10.1016/j.cola.2023.101251
Paul Boutot, Mirza Rehenuma Tabassum, Abdul Abedin, Sadaf Mustafiz
The engineering of IoT (Internet of Things) systems brings about various challenges due to the inherent complexities associated with such adaptive systems. Addressing the adaptive nature of IoT systems in the early stages of the development life cycle is essential for developing a complete and precise system specification. In this paper, we propose a use case-based modelling language, UCM4IoT, to support requirements elicitation and specification of IoT systems. UCM4IoT takes into account the heterogeneity of IoT systems and provides domain-specific language constructs to model the different facets of IoT systems. The language also incorporates the notion of exceptional situations and adaptive system behaviour. Our language is supported by a textual modelling environment that assists modellers in writing use cases. The environment supports syntax-directed editing, validation of use case models, and requirements analysis. The proposed language and tool are demonstrated and evaluated with two case studies: a smart store system and a smart fire alarm system.
{"title":"Requirements development for IoT systems with UCM4IoT","authors":"Paul Boutot, Mirza Rehenuma Tabassum, Abdul Abedin, Sadaf Mustafiz","doi":"10.1016/j.cola.2023.101251","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101251","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"78","pages":"Article 101251"},"PeriodicalIF":2.2,"publicationDate":"2023-11-23","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"138489706","RegionNum":3,"RegionCategory":"Computer Science"}
Pub Date : 2023-11-22 DOI: 10.1016/j.cola.2023.101243
Felicien Ihirwe, Davide Di Ruscio, Katia Di Blasio, Simone Gianfranceschi, Alfonso Pierantonio
Dependability is regarded as the ability of the system to provide services that can be trusted within a specific period. As the complexity and heterogeneity of Internet of Things (IoT) systems rise, so does the possibility of errors and failures. Early safety analysis not only reduces the cost of late failure but also makes it easier to trace and determine the source of a failure beforehand in case something goes wrong. In this paper, we present an early safety analysis approach based on Failure-Logic Analysis (FLA) and Fault-Tree Analysis (FTA) for safety-critical IoT systems. The safety analysis infrastructure, supported by the CHESSIoT tool, takes the system-level physical architecture model, annotated with the components’ failure logic properties, and performs different kinds of automated failure analyses. In addition to generating the system Fault-Trees (FTs), the new FTA approach automatically performs qualitative and quantitative analyses, which include the elimination of redundant events and unnecessary failure paths as well as the automatic probabilistic calculation of the undesired events. To assess the effectiveness of the approach, we conducted a comparative study between our proposed approach and 19 existing approaches from both academia and industry, showcasing its contribution to the state of the art. Finally, a Patient Monitoring System (PMS) use case has been developed to demonstrate the capabilities of the supporting CHESSIoT tool, and the results are thoroughly presented.
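The quantitative side of a fault-tree analysis can be illustrated with the textbook gate formulas. The sketch below is not CHESSIoT's algorithm: it assumes statistically independent basic events, no basic event shared between branches, and a hypothetical tuple encoding of the tree. Under those assumptions an AND gate multiplies the child probabilities and an OR gate yields one minus the product of the complements.

```python
from math import prod


def probability(node, basic):
    """Probability of a fault-tree node, assuming independent basic events.

    node is ("event", name), ("and", [children]) or ("or", [children]);
    basic maps each basic-event name to its failure probability.
    """
    kind = node[0]
    if kind == "event":
        return basic[node[1]]
    child_probs = [probability(child, basic) for child in node[1]]
    if kind == "and":
        # All children must fail.
        return prod(child_probs)
    if kind == "or":
        # At least one child fails.
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unknown gate: {kind}")


# Hypothetical tree: monitoring is lost if the sensor fails,
# or if both the main power and the backup power fail.
tree = ("or", [
    ("event", "sensor"),
    ("and", [("event", "power"), ("event", "backup")]),
])
rates = {"sensor": 0.1, "power": 0.2, "backup": 0.5}
top_event = probability(tree, rates)
```

Here the AND gate gives 0.2 × 0.5 = 0.1 and the top event gives 1 − 0.9 × 0.9 = 0.19. When a basic event appears in several branches, these formulas no longer hold and an analysis based on minimal cut sets is needed instead.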
{"title":"Supporting model-based safety analysis for safety-critical IoT systems","authors":"Felicien Ihirwe, Davide Di Ruscio, Katia Di Blasio, Simone Gianfranceschi, Alfonso Pierantonio","doi":"10.1016/j.cola.2023.101243","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101243","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"78","pages":"Article 101243"},"PeriodicalIF":2.2,"publicationDate":"2023-11-22","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"138435916","RegionNum":3,"RegionCategory":"Computer Science"}
Pub Date : 2023-11-19 DOI: 10.1016/j.cola.2023.101252
Hannes Sochor, Flavio Ferrarotti, Daniela Kaufmann
To be effective, a fuzzer needs to generate inputs that are well formed, so that they are not outright rejected by the Software Under Test (SUT) and can thus detect meaningful bugs. Grammar-based fuzzers solve this problem, but they require a grammar of the input language accepted by the SUT. Often such a grammar is unknown. Therefore, different black- and white-box algorithms have been proposed for learning grammars from SUTs. Black-box algorithms rely only on membership queries, but need access to carefully crafted, well-formed inputs in order to obtain good results. White-box algorithms require access to the source code and generally produce grammars with higher precision and recall, but at the expense of working only for specific programming languages and libraries. We propose a new algorithm and show through extensive experimentation that it can learn grammars from recursive descent parsers with consistently high levels of both recall and precision. Notably, this result was obtained starting with a couple of arbitrary seed inputs and includes evaluations with sophisticated languages such as JavaScript Object Notation (JSON). Unlike other state-of-the-art white-box approaches, our method does not require sophisticated program analysis techniques such as dynamic tainting or symbolic execution. In fact, the experiments confirm that our method performs extremely well with just a (standard) generic Abstract Syntax Tree (AST) of the parsing program as input. The core of our method uses fuzzing techniques combined with fundamental theoretical results on grammar learning. Compared to other white-box approaches, ours is not tied to specific programming languages and tools, and thus can be easily ported. Regarding performance, we have shown that our algorithm works well in practice and that, under reasonable assumptions, its worst-case complexity is polynomial (with low exponents) w.r.t. time and space requirements.
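Two notions from the abstract, grammar-based input generation and membership queries, can be shown in a few lines. The sketch below is not the paper's learning algorithm: the toy grammar, the `generate` and `accepted` helpers, and the use of Python's standard `json` parser as a stand-in SUT are assumptions chosen for illustration.

```python
import json
import random

# A hand-written toy grammar for a small subset of JSON (illustrative only).
# The first alternative of every rule is non-recursive, so that generation
# can always terminate once the depth bound is reached.
GRAMMAR = {
    "<value>": [["<number>"], ["<string>"], ["true"], ["false"], ["null"],
                ["<array>"], ["<object>"]],
    "<array>": [["[", "]"], ["[", "<value>", "]"],
                ["[", "<value>", ",", "<value>", "]"]],
    "<object>": [["{", "}"], ["{", "<string>", ":", "<value>", "}"]],
    "<number>": [["0"], ["7"], ["42"], ["-3"]],
    "<string>": [['"a"'], ['"key"'], ['""']],
}


def generate(symbol, rng, depth=4):
    """Expand `symbol` into a string by randomly choosing grammar alternatives."""
    if symbol not in GRAMMAR:
        return symbol                      # terminal: emit as-is
    alternatives = GRAMMAR[symbol]
    chosen = alternatives[0] if depth <= 0 else rng.choice(alternatives)
    return "".join(generate(part, rng, depth - 1) for part in chosen)


def accepted(text):
    """Membership query: does the stand-in SUT (json.loads) accept the input?"""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


rng = random.Random(0)
samples = [generate("<value>", rng) for _ in range(5)]
```

Every string produced from this grammar is accepted by the parser, whereas a blindly truncated input such as `[1,` is rejected outright, which is the problem grammar-based fuzzers are meant to avoid.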
{"title":"Fuzzing-based grammar learning from a minimal set of seed inputs","authors":"Hannes Sochor, Flavio Ferrarotti, Daniela Kaufmann","doi":"10.1016/j.cola.2023.101252","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101252","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"78","pages":"Article 101252"},"PeriodicalIF":2.2,"publicationDate":"2023-11-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"138430607","RegionNum":3,"RegionCategory":"Computer Science"}
Capella/Arcadia helps engineers design complex system models, but as models grew in complexity, simulation and verification became necessary. An automatic model-to-model transformation approach was proposed to interpret the dynamic behavior of the semi-formal Capella models. Custom domain-specific languages were introduced to assess the syntax of these models. The approach was applied to the Adaptive Exterior Light system, transforming Capella models into Event-B models for safety verification. The paper provides traceability between Capella and Event-B meta-models to aid interpretation of verification results.
{"title":"A transformation methodology for Capella to Event-B models with DSL verification","authors":"Khaoula Bouba, Abderrahim Ait Wakrime, Yassine Ouhammou, Redouane Benaini","doi":"10.1016/j.cola.2023.101241","DOIUrl":"https://doi.org/10.1016/j.cola.2023.101241","PeriodicalId":48552,"journal":{"name":"Journal of Computer Languages","volume":"77","pages":"Article 101241"},"PeriodicalIF":2.2,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"92025575","RegionNum":3,"RegionCategory":"Computer Science"}