Two-Stage Gene Selection Technique For Identifying Significant Prognosis Biomarkers In Breast Cancer

International Journal of Computing and Digital Systems. Pub Date: 2024-07-01. DOI: 10.12785/ijcds/160107

Monika Lamba, Geetika Munjal, Yogita Gigras


Abstract

Selecting a subset of informative genes from microarray gene expression data is a crucial data preparation step for breast cancer classification, because it identifies genes whose expression patterns can differentiate between types or stages of breast cancer. Two effective algorithms, CONSISTENCY-BFS and CFS-BFS, have been developed for gene selection; both analyse large volumes of genetic data to identify the genes that best distinguish between types and stages of breast cancer. This work presents a refined 2-Stage Gene Selection (GeS) technique designed for predicting breast cancer subtypes. The first stage relies on the CFS-BFS algorithm to eliminate unnecessary, distracting, and redundant genes, which simplifies the dataset and identifies the genes with the highest potential to indicate the category of breast cancer. The second stage applies the CONSISTENCY-BFS algorithm to refine the selection further so that only the most pertinent genes are retained, which removes remaining uncertainty and improves the overall efficiency of the algorithm. The approach offers a more accurate and targeted method for selecting genes based on their relevance to breast cancer classification. When the 2-Stage GeS is paired with Hidden Weight Naive Bayes, it yields more precise and dependable outcomes, as measured by recall, accuracy, F-score, and fallout. The Kaplan-Meier survival model was used to further validate the top four genes: E2F3, PSMC3IP, GINS1, and PLAGL2. Precision therapy could presumably target E2F3 and GINS1 specifically.
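
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: BFS is taken to mean best-first search and is replaced here by a simpler greedy forward search; absolute Pearson correlation stands in for whatever correlation measure the paper's CFS stage uses; expression values are binarized at each gene's mean before the consistency check; and the gene names and values in `toy` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def abs_pearson(x, y):
    """Absolute Pearson correlation; 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy / math.sqrt(sxx * syy))


def cfs_merit(subset, columns, labels):
    """CFS merit k*r_cf / sqrt(k + k*(k-1)*r_ff): rewards class relevance,
    penalizes redundancy among the selected genes."""
    k = len(subset)
    r_cf = sum(abs_pearson(columns[g], labels) for g in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs_pearson(columns[a], columns[b]) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def consistency(subset, binary_columns, labels):
    """1 - inconsistency rate: samples sharing a feature pattern but not the
    pattern's majority class count as inconsistent."""
    groups = defaultdict(Counter)
    for i, y in enumerate(labels):
        pattern = tuple(binary_columns[g][i] for g in subset)
        groups[pattern][y] += 1
    bad = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return 1.0 - bad / len(labels)


def greedy_forward(candidates, score):
    """Simplified stand-in for best-first search: keep adding the gene that
    gives the highest score until no addition improves it."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        gain, gene = max((score(selected + [g]), g) for g in remaining)
        if gain <= best + 1e-12:
            break
        selected.append(gene)
        remaining.remove(gene)
        best = gain
    return selected


def two_stage_selection(columns, labels):
    # Stage 1: correlation-based filter drops irrelevant and redundant genes.
    stage1 = greedy_forward(list(columns), lambda s: cfs_merit(s, columns, labels))
    # Assumption: binarize each surviving gene at its mean expression.
    binary = {}
    for g in stage1:
        mean = sum(columns[g]) / len(columns[g])
        binary[g] = [int(v >= mean) for v in columns[g]]
    # Stage 2: consistency-based refinement of the stage-1 survivors.
    stage2 = greedy_forward(stage1, lambda s: consistency(s, binary, labels))
    return stage1, stage2


# Hypothetical toy data: 8 samples, two classes, four made-up genes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy = {
    "gene_x": [1.0, 1.0, 1.0, 3.0, 4.0, 4.0, 4.0, 8.0],
    "gene_x_copy": [1.0, 2.0, 1.0, 3.0, 4.0, 3.0, 4.0, 8.0],
    "gene_b": [3.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0],
    "gene_noise": [3.0, 1.0, 3.0, 1.0, 3.0, 1.0, 3.0, 1.0],
}

stage1, stage2 = two_stage_selection(toy, labels)
print("after stage 1:", stage1)
print("after stage 2:", stage2)
```

In this sketch the first stage is meant to discard the near-duplicate and noise genes, and the second stage then keeps only what is needed for the binarized patterns to agree with the class labels. A real pipeline would feed the stage-2 genes to a classifier and validate candidates with survival analysis, neither of which is shown here.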


