Ensemble Case based Reasoning Imputation in Breast Cancer Classification

Journal of Information Science and Engineering (IF 0.5; CAS Zone 4, Computer Science; JCR Q4, Computer Science, Information Systems). Pub Date: 2021-09-01. DOI: 10.6688/JISE.202109_37(5).0004
Imane Chlioui, A. Idri, Ibtissam Abnane, M. Ezzat
Citations: 3

Abstract

Missing data (MD) is a common problem that affects breast cancer classification, so handling missing data is essential before building any breast cancer classifier. This paper presents the impact of using ensemble Case-Based Reasoning (CBR) imputation on breast cancer classification. We evaluated the influence of CBR with parameter tuning and of ensemble CBR (E-CBR) under three missingness mechanisms (MCAR: missing completely at random, MAR: missing at random, and NMAR: not missing at random) and nine MD percentages (10% to 90%) on the accuracy rates of five classifiers: decision trees, random forest (RF), k-nearest neighbor, support vector machine, and multi-layer perceptron, over two Wisconsin breast cancer datasets. All experiments were implemented using the Weka 3.8 Java API; SPSS v20 was used for statistical tests. The findings confirmed that E-CBR yields better results than CBR for the five classifiers. The MD percentage negatively affects classifier performance: as the MD percentage increases, the accuracy rates of the classifiers decrease regardless of the MD mechanism and technique. RF with E-CBR outperformed all other (MD technique, classifier) combinations, with 89.72% for MCAR, 87.08% for MAR, and 86.84% for NMAR.
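
The abstract does not spell out how CBR or E-CBR imputation is implemented, so the following is only a minimal Python sketch under stated assumptions, not the authors' Weka code. It assumes CBR imputation means retrieving the k most similar complete cases (Euclidean distance over the observed features) and filling each missing value with the mean of those cases, and it assumes the ensemble simply averages the imputations from several values of k. The function names `inject_mcar`, `cbr_impute`, and `ensemble_cbr_impute` are hypothetical, and only the MCAR mechanism is simulated.

```python
import math
import random


def inject_mcar(rows, rate, rng):
    """Return a copy of rows where each cell becomes None with probability
    `rate`, independently of any value (the MCAR mechanism)."""
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def cbr_impute(rows, k):
    """Fill None cells using the k most similar complete cases.

    Assumption: similarity is Euclidean distance over the features that are
    observed in the incomplete case, and the reused value is the mean of the
    retrieved cases.
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete case to retrieve from")
    imputed = []
    for row in rows:
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            imputed.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def dist(case):
            # Compare only on the features the incomplete case actually has.
            return math.sqrt(sum((row[j] - case[j]) ** 2 for j in observed))

        neighbours = sorted(complete, key=dist)[:k]
        filled = list(row)
        for j in missing:
            filled[j] = sum(c[j] for c in neighbours) / len(neighbours)
        imputed.append(filled)
    return imputed


def ensemble_cbr_impute(rows, ks=(1, 3, 5)):
    """Assumed ensemble: average the imputations of several CBR
    configurations that differ only in k."""
    runs = [cbr_impute(rows, k) for k in ks]
    n_cols = len(rows[0])
    return [
        [sum(run[i][j] for run in runs) / len(runs) for j in range(n_cols)]
        for i in range(len(rows))
    ]
```

For example, calling `ensemble_cbr_impute(inject_mcar(data, 0.1, random.Random(0)))` on a list of numeric feature rows would mimic the 10% MCAR setting. Note that at high rates few or no complete cases may survive, in which case this sketch raises an error rather than guessing.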
Source journal
Journal of Information Science and Engineering (Engineering and Technology: Computer Science, Information Systems)
CiteScore: 2.00
Self-citation rate: 0.00%
Articles published: 4
Review time: 8 months
Journal description: The Journal of Information Science and Engineering is dedicated to the dissemination of information on computer science, computer engineering, and computer systems. This journal encourages articles on original research in the areas of computer hardware, software, man-machine interface, theory and applications, tutorial papers in the above-mentioned areas, and state-of-the-art papers on various aspects of computer systems and applications.