Pub Date : 2024-03-25 DOI: 10.1016/j.simpa.2024.100632
Mostofa Kamal Rasel, Mohammad Rezwanul Huq, Mohammad Arifuzzaman
Many graph mining algorithms process large graphs in several passes and suffer from high I/O cost. GraphIdx, an open-source C library, provides memory-efficient indexing of large graphs to reduce that cost. GraphIdx indexes a block of graph data for a set of nodes based on an empirical evaluation of edges. With the indexed graph, graph mining algorithms can access and process only the relevant nodes and their edges instead of scanning the entire graph. As a result, the number of I/Os is significantly reduced. Moreover, GraphIdx-enabled algorithms can process graphs in parallel thanks to the indexed data.
Title : GraphIdx: An efficient indexing technique for accelerating graph data mining. Journal : Software Impacts, vol. 20, Article 100632.
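The idea of reading only the relevant nodes instead of scanning the whole graph can be illustrated with a minimal sketch. It is not GraphIdx's API or its block layout (the library is written in C and its on-disk format is not described here); the `node: neighbours` line format and the function names `build_index` and `neighbors` are assumptions made for illustration only.

```python
import io

def build_index(adj_file):
    """Scan the file once and record the byte offset of each node's adjacency line."""
    index = {}
    adj_file.seek(0)
    while True:
        offset = adj_file.tell()
        line = adj_file.readline()
        if not line:
            break
        node = int(line.split(b":", 1)[0])
        index[node] = offset
    return index

def neighbors(adj_file, index, node):
    """Seek directly to one node's line instead of scanning the whole file."""
    adj_file.seek(index[node])
    line = adj_file.readline()
    _, rest = line.split(b":", 1)
    return [int(x) for x in rest.split()]

# Toy graph stored as "node: neighbour neighbour ..." lines (hypothetical format).
graph_file = io.BytesIO(b"0: 1 2\n1: 0 3\n2: 0\n3: 1\n")
index = build_index(graph_file)
print(neighbors(graph_file, index, 1))  # prints [0, 3]
```

After the single indexing pass, each lookup costs one seek and one line read, which is the kind of I/O saving the abstract refers to. Independent lookups of this sort could also be issued from several workers, each with its own file handle.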
Pub Date : 2024-03-24 DOI: 10.1016/j.simpa.2024.100633
Javier Falgueras-Cano, Juan-Antonio Falgueras-Cano, Andrés Moya
We present a computer program called Evolutionary Cellular Automaton (ECA), written in Python, which simulates in silico, in the simplest form we have found, all the known processes and mechanisms underlying natural selection. Mathematical and statistical functions govern the dynamics of real populations through variables that take a specific value in each habitat and in each organism. In ECA, we have simplified these variables by working with mean and standard values and by simplifying the interactions between species, so that the mechanisms underlying natural selection also operate in ECA, but in a digital environment under controlled and reproducible conditions.
Title : ECA, a Python tool to study the evolution of life. Journal : Software Impacts, vol. 20, Article 100633.
Pub Date : 2024-03-01 DOI: 10.1016/j.simpa.2024.100628
Jefferson da Silva Coelho, Marcela Rodrigues Machado, Amanda Aryda S.R. de Sousa
PyMLDA (Machine Learning for Damage Assessment) is open-source software for damage pattern recognition, detection, and quantification that uses the system’s vibration signatures as input. The software automatically evaluates the integrity of a structure or system, detecting and assessing structural damage by combining supervised, unsupervised, and regression Machine Learning (ML) algorithms. It employs different damage index techniques based on the system’s dynamic response, such as the natural frequency or the frequency response, to normalise the input dataset. The classification ML route identifies and categorises the damage even when the integrity condition of the structure is unknown. The regression algorithm quantifies the damage level and accounts for uncertainty in the estimate. PyMLDA employs a range of validation and cross-validation metrics to evaluate the effectiveness and accuracy of these ML algorithms in detecting and diagnosing structural damage.
Title : PyMLDA: A Python open-source code for Machine Learning Damage Assessment. Journal : Software Impacts, vol. 19, Article 100628.
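A damage index derived from natural frequencies, as mentioned in the abstract above, can be sketched as a relative frequency shift against a healthy baseline. This is one common illustrative choice and is not confirmed to be PyMLDA's formulation; the function name `frequency_shift_index` and the sample frequencies are hypothetical.

```python
def frequency_shift_index(healthy_hz, measured_hz):
    """Relative drop of each natural frequency with respect to the healthy baseline."""
    if len(healthy_hz) != len(measured_hz):
        raise ValueError("both states need the same number of modes")
    return [(h - m) / h for h, m in zip(healthy_hz, measured_hz)]

# Hypothetical natural frequencies (Hz) of the first three modes.
healthy = [10.0, 40.0, 100.0]
damaged = [9.5, 39.0, 99.0]

damage_index = frequency_shift_index(healthy, damaged)
print(damage_index)
```

Because each entry is a dimensionless ratio, modes with very different absolute frequencies end up on a comparable scale, which is the normalising role a damage index plays before the data reach a classifier or regressor.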
Pub Date : 2024-03-01 DOI: 10.1016/j.simpa.2024.100629
Rafael Fresno-Aranda, Juan Sebastian Ojeda-Perez, Pablo Fernandez, Antonio Ruiz-Cortes
Governify is a service governance framework designed to enhance service operation by providing automated audit capabilities. It enables the creation of customized microservice architectures to fit various domains. The framework has been applied in real scenarios in both industry and academia, where it has served researchers and practitioners in service governance as both a visual analytics tool and a test bed for experiments. Governify has proved its ability to gather insights into potential risks tied to noncompliance and to design and monitor best practices in the form of agreements.
Title : Governify. An agreement-based service governance framework. Journal : Software Impacts, vol. 19, Article 100629.
Pub Date : 2024-02-23 DOI: 10.1016/j.simpa.2024.100631
Taseef Hasan Farook, Tashreque Mohammed Haq, James Dudley
Dental Loop Signals (DLS) offers a unique approach to biomedical signal-processing, employing deep learning to convert archived images of mandibular muscle activity during dynamic functions into signal data. DLS, processed through unsupervised learning, introduces a cluster-centric signal processing method, enhancing data normalisation for broad applicability. The modular design of the software facilitates customisable use in Temporomandibular Joint (TMJ) and orthopaedic clinics for long-term patient follow-ups and retrospective research. The software’s robustness increases with a larger dataset of electromyographic muscle activities, promising versatility across devices, clinics, and timeframes.
Title : Dental loop signals: Image-to-signal processing for mandibular electromyography. Journal : Software Impacts, vol. 19, Article 100631.
Pub Date : 2024-02-23 DOI: 10.1016/j.simpa.2024.100630
Abdullahi Abubakar Mas’ud, Ahmed T. Salawudeen, Abubakar A. Umar, Yusuf A. Shaaban, Firdaus Muhammad-Sukki, Umar Musa, Saud J. Alshammari
The quasi-oppositional smell agent optimization (QOBL-SAO) and its Lévy flight variant (LFQOBL-SAO) are two cutting-edge software tools for optimizing PV/wind/battery power systems. They can also be used to solve real-world CEC2020 optimization problems and are as good as top-performing software such as IUDE, εMAgES and iLSHADε. The QOBL-SAO exploits the random mode’s weakness and then adds a number to the initial population. The LFQOBL-SAO, on the other hand, improves the random mode’s weakness in order to solve this problem. The LFQOBL-SAO improves performance and search space by using Lévy flight instead of random code.
Title : A QOBL-SAO and its variant: An open source software for optimizing PV/wind/battery system and CEC2020 real world problems. Journal : Software Impacts, vol. 19, Article 100630.
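The two ingredients named in the abstract above can be sketched generically. The sketch assumes that QOBL refers to quasi-opposition-based learning applied at initialisation and that Lévy steps are drawn with Mantegna's algorithm; neither is confirmed as the authors' implementation, the smell agent optimizer itself is not reproduced, and all function names and the `sphere` test objective are illustrative.

```python
import math
import random

def quasi_opposite(x, low, high, rng):
    """Sample uniformly between the interval centre and the opposite point low + high - x."""
    centre = (low + high) / 2.0
    opposite = low + high - x
    return rng.uniform(min(centre, opposite), max(centre, opposite))

def qobl_init(objective, low, high, dim, size, rng):
    """Random population plus quasi-opposite counterparts; keep the `size` best (minimisation)."""
    population = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(size)]
    mirrored = [[quasi_opposite(x, low, high, rng) for x in ind] for ind in population]
    return sorted(population + mirrored, key=objective)[:size]

def levy_step(rng, beta=1.5):
    """One heavy-tailed step drawn with Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def sphere(x):
    """Simple test objective with its minimum at the origin."""
    return sum(v * v for v in x)

rng = random.Random(0)
start = qobl_init(sphere, -5.0, 5.0, dim=3, size=10, rng=rng)
step = levy_step(rng)
```

The initialisation evaluates twice as many candidates as the population size and keeps the better half. The Lévy step mixes many small moves with occasional long jumps, which is the usual motivation for using it in place of a uniform random move.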
Sahelian transhumance is a seasonal movement of herds guided by strategies that depend on environmental and socio-economic factors. However, it is empirically difficult to establish the influence of each factor on the spatio-temporal distribution of herds. This paper presents the Sahelian transhumance simulator (STS), a microsimulation software tool. STS determines the spatio-temporal influence of each factor on herd movements. It also proposes scenarios for developing and securing the Sahelian pastoral space.
Title : Sahelian transhumance simulator (STS). Authors : Cheick Amed Diloma Gabriel Traore, Etienne Delay, Djibril Diop, Alassane Bah. Pub Date : 2024-02-22 DOI: 10.1016/j.simpa.2024.100627. Journal : Software Impacts, vol. 19, Article 100627.
Pub Date : 2024-02-17 DOI: 10.1016/j.simpa.2024.100626
Asia Samreen, Syed Asif Ali, Hina Shakir
This paper presents a framework for the automatic creation of an emotion-labeled dataset specifically designed for short texts written in a blend of Roman Urdu and English, and addresses the inherent absence of a distinct structure in the Roman Urdu language. The software development is carried out in two key phases. In the first phase, the raw text is cleaned and automatically annotated; in the second phase, emotions are classified and predicted. The developed software significantly simplifies dataset creation by employing natural language processing (NLP) techniques tailored for mixed-code text.
Title : AACEM: Automatic Annotation and Classification of Emotions for mixed-codes. Journal : Software Impacts, vol. 19, Article 100626.
Pub Date : 2024-02-17 DOI: 10.1016/j.simpa.2024.100625
Jai Keerthy Chowlur Revanna , Nushwan Yousif Baithoon Al-Nakash
In the modern e-commerce landscape, timely package delivery faces hurdles amid fluctuating traffic conditions. This article proposes optimization techniques employing adaptable intelligent systems for dynamic route adjustments. The primary approach used here is an AI-driven optimal path routing system, leveraging Ant Colony Optimization (ACO) and Genetic Algorithm (GA). Integration of Google Maps (G-Map API) with real-time traffic data enhances route accuracy, ensuring efficient vehicle routing. By addressing these challenges, this research aims to streamline delivery processes and contribute to the advancement of vehicle routing methodologies in the dynamic e-commerce domain.
Title : Impact of ACO intelligent vehicle real-time software in finding shortest path. Journal : Software Impacts, vol. 19, Article 100625.
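The Ant Colony Optimization routing described in the abstract above can be sketched on a small weighted graph. This is a generic textbook-style ACO and not the authors' software: the `roads` graph, its travel times, and every parameter value are hypothetical, and no map or traffic API is called. In a real-time setting the edge weights would be refreshed from a traffic feed between runs.

```python
import random

def aco_shortest_path(graph, source, target, ants=20, iterations=30,
                      alpha=1.0, beta=2.0, evaporation=0.5, seed=0):
    """Return (path, cost) of the best source-to-target route found by the ants."""
    rng = random.Random(seed)
    pheromone = {(u, v): 1.0 for u in graph for v in graph[u]}
    best_path, best_cost = None, float("inf")
    for _ in range(iterations):
        tours = []
        for _ in range(ants):
            path, cost = [source], 0.0
            while path[-1] != target:
                here = path[-1]
                options = [v for v in graph[here] if v not in path]
                if not options:
                    break  # dead end: this ant is discarded
                # Attractiveness = pheromone^alpha * (1 / travel time)^beta
                weights = [pheromone[(here, v)] ** alpha * (1.0 / graph[here][v]) ** beta
                           for v in options]
                nxt = rng.choices(options, weights=weights)[0]
                cost += graph[here][nxt]
                path.append(nxt)
            if path[-1] == target:
                tours.append((path, cost))
                if cost < best_cost:
                    best_path, best_cost = path, cost
        # Evaporate, then let each successful ant deposit pheromone on its route.
        for edge in pheromone:
            pheromone[edge] *= 1.0 - evaporation
        for path, cost in tours:
            for u, v in zip(path, path[1:]):
                pheromone[(u, v)] += 1.0 / cost
    return best_path, best_cost

# Hypothetical travel times (minutes) on a tiny directed road network.
roads = {
    "depot": {"a": 2.0, "b": 5.0},
    "a": {"b": 1.0, "customer": 6.0},
    "b": {"customer": 2.0},
    "customer": {},
}
route, minutes = aco_shortest_path(roads, "depot", "customer")
print(route, minutes)
```

Shorter routes receive more pheromone per ant, so later ants are increasingly drawn to them, while evaporation lets the colony forget routes that stop being competitive when weights change.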
Pub Date : 2024-02-10 DOI: 10.1016/j.simpa.2024.100619
Arshaan Nazir, Thadaka Kalyan Chakravarthy, David Amore Cecchini, Rakshit Khajuria, Prikshit Sharma, Ali Tarik Mirik, Veysel Kocaman, David Talby
The use of natural language processing (NLP) models, including the more recent large language models (LLMs), in real-world applications has achieved considerable success in recent years. To measure the performance of these systems, traditional metrics such as accuracy, precision, recall, and F1-score are used. Although it is important to measure model performance in those terms, natural language often requires a holistic evaluation that considers other important aspects such as robustness, bias, accuracy, toxicity, fairness, safety, efficiency, clinical relevance, security, representation, disinformation, political orientation, sensitivity, factuality, legal concerns, and vulnerabilities. To address this gap, we introduce LangTest, an open-source Python toolkit aimed at reshaping the evaluation of LLMs and NLP models in real-world applications. The project aims to empower data scientists, enabling them to meet high standards in the ever-evolving landscape of AI model development. Specifically, it provides a suite of more than 60 test types, supporting a more comprehensive understanding of a model’s behavior and responsible AI use. In an experiment, a clinical Named Entity Recognition (NER) model showed significant improvement in its ability to identify clinical entities in text after data augmentation for robustness was applied.
Title : LangTest: A comprehensive evaluation library for custom LLM and NLP models. Journal : Software Impacts, vol. 19, Article 100619.
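The notion of a robustness test mentioned in the abstract above can be illustrated without the library itself. The sketch does not use LangTest's API; `pass_rate`, `uppercase`, and the case-sensitive `toy_tagger` are hypothetical stand-ins that show the general pattern of perturbing inputs and checking whether predictions stay the same.

```python
def uppercase(text):
    """Perturbation: rewrite the input in upper case."""
    return text.upper()

def pass_rate(model, texts, perturb):
    """Share of inputs whose prediction is unchanged after perturbation."""
    same = sum(model(t) == model(perturb(t)) for t in texts)
    return same / len(texts)

def toy_tagger(text):
    """Hypothetical case-sensitive stand-in for an NER model: flags the token 'aspirin'."""
    return tuple(tok == "aspirin" for tok in text.split())

def robust_tagger(text):
    """Same tagger after a simple fix: normalise case before tagging."""
    return toy_tagger(text.lower())

samples = ["patient takes aspirin daily", "no medication reported"]
rate_before = pass_rate(toy_tagger, samples, uppercase)
rate_after = pass_rate(robust_tagger, samples, uppercase)
print(rate_before, rate_after)  # prints 0.5 1.0
```

A drop in pass rate under a perturbation points to the kind of weakness that augmenting the training data with perturbed examples is meant to reduce; here the case-normalising wrapper plays that role in miniature.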