Two-Stage Gene Selection Technique For Identifying Significant Prognosis Biomarkers In Breast Cancer
Authors: Monika Lamba, Geetika Munjal, Yogita Gigras
Journal: International Journal of Computing and Digital Systems
Publication date: 2024-07-01
DOI: 10.12785/ijcds/160107 (https://doi.org/10.12785/ijcds/160107)
Citations: 0
Abstract
Extracting a subset of informative genes from microarray gene expression data is a crucial data preparation step in breast cancer classification, because it identifies genes whose expression patterns can differentiate between types or stages of breast cancer. Two effective algorithms, CONSISTENCY-BFS and CFS-BFS, have been developed for gene selection; both analyse large volumes of genetic data to identify the genes most important for distinguishing between types and stages of breast cancer. This work presents a refined 2-Stage Gene Selection (GeS) technique designed for predicting breast cancer subtypes. The first stage relies on the CFS-BFS algorithm to eliminate irrelevant, noisy, and redundant genes. This initial filtering simplifies the dataset and identifies the genes with the highest potential to indicate the breast cancer category. In the second stage, the CONSISTENCY-BFS algorithm further refines the selection so that only the most pertinent genes are retained, which removes remaining ambiguity and improves the overall efficiency of the method. The approach offers a more accurate and targeted way of selecting genes based on their relevance to breast cancer classification. When the 2-Stage GeS is combined with Hidden Weight Naive Bayes, it yields more precise and dependable results, as measured by recall, accuracy, F-score, and fall-out. The Kaplan-Meier survival model was used to further validate the top four genes: E2F3, PSMC3IP, GINS1, and PLAGL2. The authors suggest that precision therapy could specifically target E2F3 and GINS1.
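To make the first-stage idea concrete, the following is a minimal sketch of correlation-based feature selection (CFS), which scores a gene subset by rewarding correlation with the class and penalising correlation among the genes themselves. It is not the authors' implementation. Several points are assumptions made for illustration: the abstract does not state which correlation measure is used, so absolute Pearson correlation stands in for it; a simple greedy forward search replaces the best-first search (BFS) named in the abstract; the consistency-based second stage is not shown; and the function names and the toy gene names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(columns, labels, subset):
    """CFS merit k*r_cf / sqrt(k + k*(k-1)*r_ff), using absolute Pearson correlations.

    r_cf is the mean gene-class correlation, r_ff the mean gene-gene correlation.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[g], labels)) for g in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, labels, tol=1e-12):
    """Greedy forward search (a simplification of best-first search).

    Repeatedly adds the gene that raises the merit the most and stops
    when no remaining gene improves it by more than tol.
    """
    selected, best = [], 0.0
    remaining = sorted(columns)
    while remaining:
        top_gene, top_merit = None, best
        for gene in remaining:
            merit = cfs_merit(columns, labels, selected + [gene])
            if merit > top_merit + tol:
                top_gene, top_merit = gene, merit
        if top_gene is None:
            break
        selected.append(top_gene)
        remaining.remove(top_gene)
        best = top_merit
    return selected, best


if __name__ == "__main__":
    # Toy data: six samples, two classes, three hypothetical genes.
    labels = [0, 0, 0, 1, 1, 1]
    columns = {
        "gene_a": [1, 2, 1, 5, 6, 5],      # tracks the class closely
        "gene_a_dup": [1, 2, 1, 5, 6, 5],  # exact copy of gene_a (redundant)
        "gene_noise": [3, 1, 2, 3, 1, 2],  # uncorrelated with the class
    }
    selected, merit = greedy_cfs(columns, labels)
    print(selected, round(merit, 3))
```

On this toy input the search keeps `gene_a` only: the exact duplicate adds no merit and the noise gene lowers it. Note that with this merit function a near-duplicate gene that is itself highly class-correlated can still raise the score slightly, which is one reason a second refinement stage, as described in the abstract, may be useful.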