Prediction of breast cancer using machine learning algorithms on different datasets

Ingenieria Solidaria | IF 0.4 | Q4, Engineering, Multidisciplinary | Pub Date: 2023-06-14 | DOI: 10.16925/2357-6014.2023.01.08
Ömer Çağrı Yavuz, M. H. Calp, Hazel Ceren Erkengel

Abstract

Breast cancer is an increasingly common disease that causes emotional and behavioral reactions and can be fatal if not detected early. Traditional methods are insufficient, especially for early diagnosis. This study therefore aimed to predict breast cancer by applying machine learning (ML) algorithms to different datasets and to demonstrate the applicability of these algorithms. Algorithm performance was compared on balanced and unbalanced datasets using the performance metrics obtained on each dataset. In addition, a model based on the Borda Voting method was developed that combines the results of four algorithms: Naive Bayes (NB), k-nearest neighbors (KNN), decision tree (DT), and random forest (RF). The predictions from each algorithm were written to separate columns of the same Excel file, and the most frequent value was accepted as the final result. The developed model was tested on real data consisting of 60 records, and the results were analyzed. The proposed RF model achieved higher performance than similar studies in the literature. Finally, the predictions obtained with the developed model demonstrated the applicability of ML algorithms to the diagnosis of breast cancer.
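
The voting step described in the abstract (one column of predictions per algorithm, with the most frequent value taken as the final result) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the labels "M" and "B", the sample records, and the tie-breaking rule are all hypothetical, since the abstract does not say how a 2-2 split among four algorithms is resolved. With only two classes, picking the most frequent label gives the same winner as a Borda count.

```python
from collections import Counter

# Column order mirrors the four algorithms named in the abstract.
MODELS = ("NB", "KNN", "DT", "RF")


def vote(row, tie_breaker="RF"):
    """Return the most frequent label in one row of per-model predictions.

    `row` maps model name -> predicted label. The tie-breaking rule is an
    assumption; the abstract does not specify one.
    """
    counts = Counter(row[m] for m in MODELS)
    top = max(counts.values())
    winners = [label for label, n in counts.items() if n == top]
    if len(winners) == 1:
        return winners[0]
    # Tie: fall back to the designated model's prediction (hypothetical rule).
    return row[tie_breaker]


# Made-up predictions for three records, for illustration only.
records = [
    {"NB": "M", "KNN": "M", "DT": "B", "RF": "M"},
    {"NB": "B", "KNN": "B", "DT": "B", "RF": "B"},
    {"NB": "M", "KNN": "B", "DT": "B", "RF": "M"},
]
final = [vote(r) for r in records]
print(final)  # ['M', 'B', 'M']
```

The third record is a 2-2 split, so the result depends entirely on the assumed tie-breaker; using an odd number of voters would avoid this case for binary labels.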