Assisting pre-delivery firmware quality assessments using ensemble learning

Journal of the Chinese Institute of Engineers (IF 1; Zone 4, Engineering & Technology; JCR Q3, ENGINEERING, MULTIDISCIPLINARY). Pub Date: 2023-10-04. DOI: 10.1080/02533839.2023.2262711

Zheng-Yun Zhuang, Yu-Chuan Hsu, Shyan-Ming Yuan


Citations: 0

Abstract



This study uses retrospective firmware test data as input data sets to train four machine learning models, each with an embedded standalone classifier. None of these models provides accurate predictions during validation, so model optimization trials adjust the training-validation data portfolio and the hyperparameters of each model. After optimization, only the random forest classifier with its best parametric settings just reaches the 90% prediction accuracy required by the standard. Ensemble learning (EL) is then applied using several combinations of the standalone models, and the EL model that uses logistic regression as the meta classifier raises the accuracy by 6 percentage points (i.e. to 96%), which is sufficient for establishing a predictive system. Using the 'X-minute' method, it is further found that the execution period (also the data sampling period) for the sequential read test workload can be reduced from 30 minutes (current practice) to 20 minutes while the EL model's predictions remain sufficiently accurate for system implementation. Applying the similarity confirmation method to each pair of 'score vectors' (each of which contains a model's prediction accuracies) further confirms several observations that distinguish the performance and the predictive behavioral patterns of the benchmarked models. The knowledge from this research has implications that may benefit future practice in industry.

CO EDITOR-IN-CHIEF: Sun, Hung-Min
ASSOCIATE EDITOR: Sun, Hung-Min

KEYWORDS: Quality control; firmware testing; ensemble machine learning; process re-engineering and optimization; decision-support system; AI in industry

Nomenclature
AI = artificial intelligence
APS = automated predictive system
CD = continuous delivery
CI = continuous integration
COVID-19 = coronavirus disease 2019
CSV = comma-separated values
CWV = criteria weight vector
DDDM (D3M) = data-driven decision-making
DSS = decision support systems
EL = ensemble learning
FN = false negative
FP = false positive
FW = firmware
I/O = input and output
k-NN = k nearest neighbors
LR = logistic regression
MADM = multi-attribute decision-making
MCDM = multi-criteria decision-making
ML = machine learning
OWV = opinion weight vector
R&D = research and development
RF = random forest
ROV = rank order vector
SCM = similarity confirmation method
SOP = standard operating procedure
SSD = solid state drive
SV = score vector
SVM = support vector machine
TN = true negative
TP = true positive
TTM = time to market
VCS = version control system

Disclosure statement
No potential conflict of interest was reported by the authors.

Additional information
Funding
This work was supported by the Ministry of Science and Technology, Taiwan (ROC), under grants [MOST-108-2511-H-009-009-MY3, MOST-109-2410-H-992-015 and MOST-111-2410-H-992-011], each in part.
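
The stacking step described in the abstract, in which standalone classifiers are combined through a logistic regression meta classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the firmware test records, and picks random forest, SVM and k-NN as base learners only because those names appear in the nomenclature. All hyperparameter values are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

REQUIRED_ACCURACY = 0.90  # accuracy requirement quoted in the abstract

# Synthetic stand-in for the retrospective firmware test data (assumption).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standalone (base) classifiers; the choice and settings are illustrative.
base_models = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
]

# Logistic regression acts as the meta classifier: it is trained on the
# cross-validated outputs of the base models.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

accuracy = accuracy_score(y_val, stack.predict(X_val))
meets_requirement = accuracy >= REQUIRED_ACCURACY
print(f"stacked accuracy: {accuracy:.3f}; meets 90% requirement: {meets_requirement}")
```

The accuracy printed here reflects only the synthetic data and says nothing about the 96% figure reported in the study. To compare against a standalone model, fit any entry of `base_models` on the same split and score it with `accuracy_score`.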


Source journal: Journal of the Chinese Institute of Engineers (Engineering & Technology: Engineering, Multidisciplinary)

CiteScore: 2.30

Self-citation rate: 9.10%

Review time: 6.8 months

Journal description: Encompassing a wide range of engineering disciplines and industrial applications, JCIE covers the following topics: (1) chemical engineering, (2) civil engineering, (3) computer engineering, (4) electrical engineering, (5) electronics, (6) mechanical engineering, and fields related to the above.