MLR-predictor: a versatile and efficient computational framework for multi-label requirements classification.

Frontiers in Artificial Intelligence (IF 4.7; Q2, Computer Science, Artificial Intelligence) | Pub Date: 2024-11-27 | eCollection Date: 2024-01-01 | DOI: 10.3389/frai.2024.1481581

Summra Saleem, Muhammad Nabeel Asim, Ludger Van Elst, Markus Junker, Andreas Dengel

Volume 7, article 1481581. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11632133/pdf/


Abstract

Introduction: Requirements classification is essential to developing successful software because it helps incorporate all relevant aspects of users' needs. It also aids in identifying project failure risks and helps achieve project milestones more comprehensively. Several machine learning predictors have been developed for binary or multi-class requirements classification. However, few predictors are designed for multi-label classification, and those that exist are of limited practical use because of their low predictive performance.

Method: MLR-Predictor makes use of the innovative OkapiBM25 model to transform requirements text into statistical vectors by computing informative patterns of words. In addition, the predictor transforms multi-label requirements classification data into a multi-class classification problem and uses a logistic regression classifier to categorize requirements. The performance of the proposed predictor is evaluated and compared with 123 machine learning and 9 deep learning-based predictive pipelines across three public benchmark requirements classification datasets using eight different evaluation measures.
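The two transformations named in the Method paragraph can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard Okapi BM25 term weighting with the commonly used defaults k1 = 1.5 and b = 0.75, a non-negative IDF variant, and a label-powerset mapping as one common way to turn multi-label data into a multi-class problem; the abstract does not say which variants or parameters the paper uses. The function names, toy requirements, and labels are hypothetical.

```python
import math
from collections import Counter


def bm25_vectors(docs, k1=1.5, b=0.75):
    """Return (vocabulary, dense BM25 weight vectors) for tokenized documents.

    k1 and b are common defaults, assumed here; they are not taken from the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    vocab = sorted(df)
    index = {term: i for i, term in enumerate(vocab)}
    # Non-negative IDF variant (an assumption of this sketch).
    idf = {t: math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0) for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: longer-than-average documents are penalized.
        norm = k1 * (1.0 - b + b * len(d) / avg_len)
        vec = [0.0] * len(vocab)
        for term, f in tf.items():
            vec[index[term]] = idf[term] * f * (k1 + 1.0) / (f + norm)
        vectors.append(vec)
    return vocab, vectors


def label_powerset(label_sets):
    """Map each distinct combination of labels to a single multi-class id."""
    class_of = {}
    y = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_of:
            class_of[key] = len(class_of)
        y.append(class_of[key])
    return y, class_of


# Toy requirements with hypothetical labels (not from the benchmark datasets).
requirements = [
    "the system shall encrypt all stored passwords",
    "the system shall respond within two seconds",
    "the system shall encrypt traffic and respond within two seconds",
]
label_sets = [{"security"}, {"performance"}, {"security", "performance"}]

docs = [r.split() for r in requirements]
vocab, X = bm25_vectors(docs)
y, class_of = label_powerset(label_sets)
```

Here `X` holds one BM25 vector per requirement and `y` holds one class id per distinct label combination, so any multi-class classifier, such as logistic regression, could be trained on the pair. Terms that occur in every requirement (for example "system") receive a much lower weight than rarer terms such as "passwords".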

Results: Large-scale experimental results demonstrate that the proposed MLR-Predictor outperforms the 123 machine learning and 9 deep learning predictive pipelines, as well as the state-of-the-art requirements classification predictor. Specifically, compared with the state-of-the-art predictor, it achieves a 13% improvement in macro F1-measure on the PROMISE dataset, a 1% improvement on the EHR-binary dataset, and a 2.5% improvement on the EHR-multiclass dataset.
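For readers unfamiliar with the metric reported above, macro F1 computes an F1 score per class and then takes the unweighted mean, so a rare class counts as much as a frequent one. The sketch below shows the arithmetic on toy labels; the function name and the "F"/"NF" labels are hypothetical and the numbers are unrelated to the paper's results.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: "F" = functional, "NF" = non-functional (hypothetical labels).
y_true = ["F", "F", "F", "F", "NF", "NF"]
y_pred = ["F", "F", "F", "NF", "NF", "F"]
score = macro_f1(y_true, y_pred)  # per-class F1: F = 0.75, NF = 0.5
```

In this toy case the macro F1 is 0.625, the mean of 0.75 and 0.5, even though the "F" class has twice as many examples as "NF".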

Discussion: As a case study, the generalizability of the proposed predictor is evaluated on software customer review classification data. In this context, the proposed predictor outperformed the state-of-the-art BERT language model by 1.4% in F1 score. These findings underscore the robustness and effectiveness of the proposed MLR-Predictor in various contexts, establishing it as a promising solution for the requirements classification task.

