Predicting the Driver Variants and Mutations in Lung Cancer Genome using Transcriptional Regulation Network
Spandana Chereddy, Ippatapu Venkata Srisurya, Harshit Bogineni, P. R, Bharathi Mohan G
2023 7th International Conference on Computing Methodologies and Communication (ICCMC)
Publication date: 2023-02-23
DOI: 10.1109/ICCMC56507.2023.10084125 (https://doi.org/10.1109/ICCMC56507.2023.10084125)
Abstract
Identifying cancer driver genes and separating them from thousands of candidate mutations remains a major challenge. Precise identification of driver genes and mutations is crucial for cancer research and for personalizing treatment based on accurate patient classification. Because of inter-tumor genetic variability, many driver mutations within a gene occur at low frequency, which makes them hard to distinguish from non-driver mutations. The proposed model uses a transcriptional regulatory network built from the REGNETWORK database. The aim of the paper is to discover genes that cause lung cancer using a network approach based on centrality measures and community detection in the graph. Degree centrality, betweenness centrality, and closeness centrality are used as parameters for identifying lung cancer genes (those carrying the cancer-causing mutation). Community detection is applied to find genes that are closely related to each other. Transcription factors, genes, and their interconnections form a particular class of biological network called a transcriptional regulatory network. These networks are analyzed to examine how information moves through a biological system and to find paths that are advantageous for various tasks. Nodes in this network are genes and transcription factors, so the network contains two types of modules: a gene module and a transcription factor module. Edges represent physical or regulatory interactions between them. Two different algorithms are used to build the network model, and they are compared using accuracy, F1 score, recall, and precision. Using this approach, the influential genes (those in which propagation occurs) are identified in each community and then pooled. The pooled influential genes from all communities are predicted as lung cancer genes and evaluated using certain criteria.
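The three centrality measures and the evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the graph is a toy, undirected network with placeholder node names (`TF1`, `G1`, ...) rather than REGNETWORK data, the "known" gene set is hypothetical, and community detection is left out. Betweenness uses Brandes' algorithm without normalization.

```python
from collections import deque

# Toy undirected network; node names are placeholders, NOT REGNETWORK data.
EDGES = [
    ("TF1", "G1"), ("TF1", "G2"), ("TF1", "G3"),
    ("TF2", "G3"), ("TF2", "G4"), ("G4", "G5"),
]


def build_adjacency(edges):
    """Return {node: set(neighbors)} for an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """(reachable nodes) / (sum of shortest-path distances), via BFS."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected graph (Brandes' algorithm)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def precision_recall_f1(predicted, known):
    """Compare a predicted gene set against a (hypothetical) reference set."""
    tp = len(predicted & known)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(known) if known else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


adj = build_adjacency(EDGES)
for name, scores in [
    ("degree", degree_centrality(adj)),
    ("betweenness", betweenness_centrality(adj)),
    ("closeness", closeness_centrality(adj)),
]:
    top = max(scores, key=scores.get)
    print(f"{name}: top node {top} ({scores[top]:.3f})")
```

On real data, NetworkX offers equivalent functions (`degree_centrality`, `betweenness_centrality`, `closeness_centrality`). How the three scores are combined into a single ranking is not stated in the abstract, so the sketch reports them separately.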