Identification of Gene Expression in Different Stages of Breast Cancer with Machine Learning

Cancers, Pub Date: 2024-05-14, DOI: 10.3390/cancers16101864
Ali J. Abidalkareem, Ali K. Ibrahim, Moaed A. Abd, Oneeb Rehman, Hanqi Zhuang

Abstract

Determining tumor origin is vital in clinical applications of molecular diagnostics. Metastatic cancer is usually a very aggressive disease with limited diagnostic procedures, even though many protocols have been evaluated for their prognostic effectiveness. Research has shown that dysregulation of miRNAs (a class of non-coding, regulatory RNAs) is strongly involved in oncogenic conditions. This paper aims to develop a machine learning model that processes an array of miRNAs in 1097 metastatic tissue samples from patients at various stages of breast cancer. The proposed model is fed miRNA quantitative read count data taken from The Cancer Genome Atlas Data Repository. Two feature-selection techniques are used, namely Neighborhood Component Analysis (NCA) and Minimum Redundancy Maximum Relevance (mRMR), to identify the most discriminant and relevant miRNAs in their up-regulated and down-regulated states. These miRNAs are then validated as biological identifiers for each of the four cancer stages in breast tumors. Both machine learning approaches yield performance scores that are significantly higher than those of the traditional fold-change (FC) approach, particularly in earlier stages of cancer: NCA and mRMR achieve accuracy scores of up to 0.983 and 0.931, respectively, compared to 0.920 for the FC method. This study underscores the potential of advanced feature-selection methods in enhancing the accuracy of cancer stage identification, paving the way for improved diagnostic and therapeutic strategies in oncology.
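For readers unfamiliar with the fold-change baseline that the abstract compares against, the following is a minimal sketch of ranking miRNAs by log2 fold change between two stage groups. It is not the authors' pipeline: the function names, the miRNA names (`miR-A`, `miR-B`, `miR-C`), the stage labels, the read counts, and the pseudocount of 1 are all assumptions made for illustration.

```python
import math


def log2_fold_change(control, case, pseudocount=1.0):
    """Return log2 of the ratio of mean read counts (case over control).

    The pseudocount (an assumption of this sketch) avoids division by
    zero when a miRNA has no reads in one of the groups.
    """
    mean_control = sum(control) / len(control)
    mean_case = sum(case) / len(case)
    return math.log2((mean_case + pseudocount) / (mean_control + pseudocount))


def rank_by_fold_change(counts, labels, control_label, case_label):
    """Rank features by absolute log2 fold change, largest first."""
    ranked = []
    for name, values in counts.items():
        control = [v for v, lab in zip(values, labels) if lab == control_label]
        case = [v for v, lab in zip(values, labels) if lab == case_label]
        ranked.append((name, log2_fold_change(control, case)))
    ranked.sort(key=lambda item: abs(item[1]), reverse=True)
    return ranked


# Hypothetical toy data: three samples per stage, made-up read counts.
labels = ["stage_I"] * 3 + ["stage_II"] * 3
counts = {
    "miR-A": [6, 7, 8, 30, 31, 32],        # higher in stage_II
    "miR-B": [120, 127, 134, 14, 15, 16],  # lower in stage_II
    "miR-C": [50, 52, 48, 51, 49, 50],     # same mean in both groups
}

ranked = rank_by_fold_change(counts, labels, "stage_I", "stage_II")
for name, lfc in ranked:
    direction = "up" if lfc > 0 else "down" if lfc < 0 else "unchanged"
    print(f"{name}: log2FC = {lfc:+.2f} ({direction})")
```

The sign of the log2 fold change separates up-regulated from down-regulated miRNAs, and the magnitude gives the ranking. Because this score considers each miRNA in isolation, it cannot account for redundancy between features, which is one plausible reason a method such as mRMR could select a more discriminative subset.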