Annisa Nugraheni, R. Ramadhani, Amalia Beladinna Arifa, Agi Prasetiadi
Title: Perbandingan Performa Antara Algoritma Naive Bayes Dan K-Nearest Neighbour Pada Klasifikasi Kanker Payudara (Performance Comparison Between the Naive Bayes and K-Nearest Neighbour Algorithms in Breast Cancer Classification)
Journal: Journal of Dinda : Data Science, Information Technology, and Data Analytics
Publication type: Journal Article
Published: 2022-02-23
DOI: 10.20895/dinda.v2i1.391 (https://doi.org/10.20895/dinda.v2i1.391)
Citations: 2

Abstract

Breast cancer is the second most common cause of cancer death, after lung cancer. It occurs when cells in breast tissue begin to grow uncontrollably and disrupt the surrounding healthy tissue, so a classifier that distinguishes breast cancer patients from healthy people is needed. Previous research suggests that the Naïve Bayes and K-Nearest Neighbor algorithms are capable of classifying breast cancer. This study uses the Breast Cancer Coimbra dataset (2018) from the UCI Machine Learning Repository, which contains 116 records, and evaluates the methods with the confusion matrix (accuracy, precision, and recall) and the ROC-AUC curve. The purpose of the study is to compare the performance of the Naïve Bayes and K-Nearest Neighbor algorithms. Both algorithms are tested under several scenarios: before and after data normalization, with different ratios of training data to testing data, with different values of K in K-Nearest Neighbor, and with the strongest attributes selected by a Pearson correlation test. The results indicate that the highest average results for Naïve Bayes are 69.12% accuracy, 64.90% precision and 88% recall for the healthy class, 83% precision and 61.11% recall for the sick class, and an AUC of 0.82, which falls in the good classification category. The highest average results for K-Nearest Neighbor are 76.83% accuracy, 76% precision and 74.18% recall for the healthy class, 80.21% precision and 80.81% recall for the sick class, and an AUC of 0.91, which falls in the excellent classification category.
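The evaluation steps named in the abstract (normalization, K-Nearest Neighbor voting, Pearson correlation, and per-class precision and recall) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the two-feature toy records, and the choice of K = 3 are all assumptions made for illustration, and the numbers are invented rather than taken from the Breast Cancer Coimbra dataset.

```python
import math
from collections import Counter

def fit_min_max(rows):
    """Learn per-feature minimum and range from the training rows."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return lows, spans

def apply_min_max(rows, lows, spans):
    """Rescale rows with training statistics; constant features map to 0."""
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

def class_metrics(y_true, y_pred, positive):
    """Overall accuracy plus precision and recall for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Invented toy records with two hypothetical features on very different scales.
train_x = [[1.0, 200.0], [2.0, 220.0], [1.5, 210.0],
           [8.0, 900.0], [9.0, 950.0], [8.5, 880.0]]
train_y = ["healthy", "healthy", "healthy", "sick", "sick", "sick"]
test_x = [[1.2, 205.0], [8.8, 940.0], [2.0, 215.0], [4.0, 400.0]]
test_y = ["healthy", "sick", "healthy", "sick"]

# Normalize with statistics from the training split only, then classify.
lows, spans = fit_min_max(train_x)
train_n = apply_min_max(train_x, lows, spans)
test_n = apply_min_max(test_x, lows, spans)
preds = [knn_predict(train_n, train_y, q, k=3) for q in test_n]

healthy = class_metrics(test_y, preds, positive="healthy")
sick = class_metrics(test_y, preds, positive="sick")

# Correlation of each feature with the label (sick = 1), as a selection score.
label = [1 if y == "sick" else 0 for y in train_y]
scores = [abs(pearson(col, label)) for col in zip(*train_x)]
print(preds, healthy, sick, scores)
```

Reporting precision and recall separately for each class, as the abstract does, shows which kind of error a model makes: in this toy run the borderline fourth record is labeled healthy, which lowers sick-class recall while leaving sick-class precision untouched.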