Detecting Long Non-Coding RNAs Responsible for Cancer Development
Mitra Datta Ganapaneni, Kundhana Harshitha Paruchuru, J. Ambati, Mahesh Valavala, C.C Sobin
2022 OITS International Conference on Information Technology (OCIT), 2022-12-01
DOI: 10.1109/OCIT56763.2022.00040
Abstract
Long noncoding RNAs (lncRNAs) play a vital role in tumor development. Variation in lncRNA expression affects several target genes related to tumor initiation and development. Recent studies in carcinogenesis have indicated the importance of lncRNAs in cancer progression, diagnosis, and treatment. The purpose of our research is to identify the key cancer-related lncRNAs. Identifying key lncRNAs from existing expression data of tumor patients is a complex task because of the high dimensionality of the expression profiles. Expression profiles of 12309 lncRNAs across 2221 patients are gathered from TCGA. A computational framework is proposed that considers 5 cancer types (bladder, colon, cervical, liver, and head and neck) and comprises four machine learning classification models: K-Nearest Neighbor, Naive Bayes, Random Forest, and Support Vector Machine (SVM). An essential component of the framework is to use these models together with the state-of-the-art variance threshold, L1-based, and tree-based feature selection algorithms for differential analysis. The study identified 234 key lncRNAs capable of differentiating the 5 cancer types. The capability of the identified key lncRNAs is assessed through the performance of the classification models, with SVM achieving the highest accuracy of 98.2%. Furthermore, correlation analysis of the 234 lncRNAs experimentally validated the results.
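The abstract gives no implementation details, so the following is only a minimal sketch of one path through such a framework: variance-threshold feature selection followed by K-Nearest Neighbor classification. The toy expression matrix, the labels, the function names, the threshold of 0.05, and k = 3 are all illustrative assumptions and are not taken from the paper. It uses only the Python standard library so that it stays self-contained.

```python
import math
from collections import Counter
from statistics import pvariance


def variance_threshold(X, threshold):
    """Return indices of columns whose population variance exceeds threshold."""
    n_features = len(X[0])
    return [j for j in range(n_features)
            if pvariance([row[j] for row in X]) > threshold]


def select_columns(X, keep):
    """Keep only the selected feature columns of every sample."""
    return [[row[j] for j in keep] for row in X]


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training samples."""
    neighbors = sorted(zip(train_X, train_y),
                       key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


def leave_one_out_accuracy(X, y, k=3):
    """Hold out each sample in turn and classify it from the remaining ones."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, X[i], k) == y[i]
    return hits / len(X)


# Hypothetical toy data: rows are patients, columns are lncRNA expression values.
# Column 1 is constant and column 2 barely varies, so both should be filtered out.
X = [
    [5.0, 1.0, 0.2, 9.0],
    [5.1, 1.0, 0.1, 8.5],
    [4.9, 1.0, 0.3, 9.2],
    [1.0, 1.0, 0.2, 2.0],
    [1.2, 1.0, 0.1, 2.5],
    [0.9, 1.0, 0.3, 1.8],
]
y = ["bladder", "bladder", "bladder", "liver", "liver", "liver"]

kept = variance_threshold(X, threshold=0.05)   # assumed threshold
X_sel = select_columns(X, kept)
accuracy = leave_one_out_accuracy(X_sel, y, k=3)
print("kept feature indices:", kept)
print("leave-one-out accuracy:", accuracy)
```

On real TCGA-scale data, a library such as scikit-learn would be the more practical choice. The L1-based and tree-based selectors and the other three classifiers named in the abstract would slot into the same select-then-classify structure.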