Multi-omics-based Machine Learning for the Subtype Classification of Breast Cancer

IF 2.9 · Zone 4 (multidisciplinary journals) · Q1 Multidisciplinary · Arabian Journal for Science and Engineering · Pub Date: 2024-09-10 · DOI: 10.1007/s13369-024-09341-7
Asmaa M. Hassan, Safaa M. Naeem, Mohamed A. A. Eldosoky, Mai S. Mabrouk
Citations: 0

Abstract

Cancer is a complex disease that produces deregulatory changes at several levels of cellular activity (for example, at the protein level). Data from these levels must be integrated in multi-omics analyses to better understand cancer and its progression, and deep learning approaches have recently supported such analyses. Breast cancer is a prevalent cancer among women and is associated with a multitude of clinical, lifestyle, social, and economic factors. The goal of this study was to predict breast cancer using several machine learning methods. We applied an architecture for mono-omics data analysis to The Cancer Genome Atlas (TCGA) Breast Cancer datasets. The following classifiers were used: random forest, partial least squares, Naive Bayes, decision trees, neural networks, and Lasso regularization. Each was evaluated using the area under the curve (AUC) metric. The random forest and Lasso regularization classifiers achieved the highest AUC values, 0.99 each, on the mono-omics data employed in this investigation. They also achieved the highest prediction accuracy, which suggests that they are appropriate for this problem. Across all mono-omics classification models used in this paper, random forest and Lasso regression gave the best results on all metrics (precision, recall, and F1 score). The integration of diverse risk factors in breast cancer prediction modeling holds promise for early diagnosis and treatment. Leveraging data collection, storage, and intelligent systems can further enhance disease management strategies, ultimately contributing to improved patient outcomes.
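The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a binary setting, assumes the "area under the curve" refers to the ROC curve, and uses a hypothetical decision threshold of 0.5. The function names `roc_auc` and `precision_recall_f1` and the toy labels and scores are invented for illustration only.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC computed by pairwise comparison (Mann-Whitney U statistic).

    This equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample;
    tied pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def precision_recall_f1(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 after thresholding scores (assumed cutoff)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

auc = roc_auc(labels, scores)
precision, recall, f1 = precision_recall_f1(labels, scores)
print(f"AUC={auc:.3f} precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

For multi-class subtype labels, these binary metrics would typically be computed one class against the rest and then averaged; the abstract does not state which averaging scheme was used.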

Source journal: Arabian Journal for Science and Engineering (multidisciplinary journals)
CiteScore: 5.20
Self-citation rate: 3.40%
Articles published: 0
Review time: 4.3 months
Journal description: King Fahd University of Petroleum & Minerals (KFUPM) partnered with Springer to publish the Arabian Journal for Science and Engineering (AJSE). AJSE, published by KFUPM since 1975, is a recognized national, regional, and international journal that disseminates research advances from the Kingdom of Saudi Arabia, the MENA region, and the world.