Improving the Accuracy of Oncology Diagnosis: A Machine Learning-Based Approach to Cancer Prediction

M. Cabanillas-Carbonell, Joselyn Zapata-Paulini
International Journal of Online and Biomedical Engineering (iJOE), published 2024-08-08. DOI: 10.3991/ijoe.v20i11.49139 (https://doi.org/10.3991/ijoe.v20i11.49139)

Abstract

Cancer is among the most lethal illnesses worldwide, and predicting its onset early allows preventive measures that can improve treatment outcomes, survival, and quality of life. This study compares machine learning models to determine which most accurately classifies tumors as malignant (cancer) or benign. The models evaluated are decision tree (DT), naive Bayes (NB), extra trees classifier (ETM), random forest (RF), K-means clustering (K-means), logistic regression (LR), adaptive boosting (AdaBoost), gradient boosting (GB), light gradient boosting machine (LightGBM), and extreme gradient boosting (XGBoost). The models were trained on a dataset of 569 records and 32 variables containing patient information and tumor characteristics. The study is organized into sections covering related studies, descriptions of the models, case study development, results, discussion, and conclusions. Model performance was evaluated using precision, sensitivity, accuracy, and F1 score. After training, XGBoost performed best, achieving 98% on precision, accuracy, sensitivity, and F1 score.
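
The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `binary_metrics`, the `BinaryMetrics` container, and the label vectors are hypothetical, and the labels are made-up values rather than data from the study. The sketch assumes the usual binary definitions, with malignant as the positive class and sensitivity equal to recall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    precision: float
    sensitivity: float  # also called recall or true positive rate
    accuracy: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, sensitivity, accuracy, and F1 for a binary task."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with `positive` as the class of interest.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard each ratio against a zero denominator.
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return BinaryMetrics(precision, sensitivity, accuracy, f1)


# Hypothetical labels (1 = malignant, 0 = benign); not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
```

With these illustrative labels there are 3 true positives, 1 false positive, 1 false negative, and 5 true negatives, so precision, sensitivity, and F1 are each 0.75 and accuracy is 0.8. Reporting all four matters in a diagnostic setting because accuracy alone can hide missed malignant cases, which sensitivity exposes directly.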