Predicting Essential Genes of Escherichia coli based on Clustering Method
Authors: Xiao Liu, Ting He, Zhirui Guo, Meixiang Ren
Published in: Proceedings of the 2019 11th International Conference on Bioinformatics and Biomedical Technology, 2019-05-29
DOI: 10.1145/3340074.3340080 (https://doi.org/10.1145/3340074.3340080)
Abstract
Essential genes are important to the survival or reproduction of organisms. Computational methods for predicting essential genes are mainly supervised classification methods. These methods require label information, which is unavailable for newly sequenced genes. This motivates the use of unsupervised methods to predict essential genes. Here, the K-means clustering algorithm was used to predict the essential genes of Escherichia coli after the Relief algorithm was used to weight the features. A membership calculation method based on the Euclidean distance between genes was designed to obtain an AUC (area under the curve) score. The average AUC score was 0.989. This approach yields a satisfactory prediction of essential genes.
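The pipeline described in the abstract (Relief feature weighting, K-means clustering, a distance-based membership score, AUC) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the toy feature matrix `X`, the labels `y` (Relief needs class labels, and the abstract does not say where they come from, so a labeled reference set is assumed), the farthest-point initialization, the rule that the smaller cluster is the essential one, and the membership formula `d_other / (d_essential + d_other)`.

```python
import math

def relief_weights(X, y):
    """Relief-style weights; every gene is used once as the probe (deterministic variant)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for i in range(n):
        others = [j for j in range(n) if j != i]
        hit = min((j for j in others if y[j] == y[i]), key=lambda j: math.dist(X[i], X[j]))
        miss = min((j for j in others if y[j] != y[i]), key=lambda j: math.dist(X[i], X[j]))
        for f in range(d):
            # Reward features that differ on the nearest miss, penalize those that differ on the nearest hit.
            w[f] += (abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])) / n
    return [max(v, 0.0) for v in w]  # clip negative weights to zero

def kmeans2(X, n_iter=100):
    """Two-cluster K-means with a farthest-point start (an assumed, deterministic init)."""
    centers = [list(X[0]), list(max(X, key=lambda p: math.dist(p, X[0])))]
    assign = []
    for _ in range(n_iter):
        assign = [min((0, 1), key=lambda c: math.dist(p, centers[c])) for p in X]
        new = []
        for c in (0, 1):
            members = [p for p, a in zip(X, assign) if a == c]
            new.append([sum(col) / len(members) for col in zip(*members)] if members else centers[c])
        if new == centers:
            break
        centers = new
    return centers, assign

def membership(p, c_ess, c_other):
    """Assumed membership score in [0, 1]: closer to the essential centroid gives a higher score."""
    d_ess, d_other = math.dist(p, c_ess), math.dist(p, c_other)
    return d_other / (d_ess + d_other) if d_ess + d_other else 0.5

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: features scaled to [0, 1]; the third feature is uninformative.
X = [[0.90, 0.80, 0.30], [0.85, 0.90, 0.70], [0.95, 0.75, 0.50], [0.80, 0.85, 0.10],
     [0.10, 0.20, 0.40], [0.20, 0.10, 0.90], [0.15, 0.25, 0.20], [0.25, 0.15, 0.60],
     [0.10, 0.30, 0.80], [0.30, 0.20, 0.00], [0.20, 0.30, 0.50], [0.05, 0.15, 0.35]]
y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # 1 = essential (reference labels, assumed available)

w = relief_weights(X, y)
Xw = [[wf * v for wf, v in zip(w, row)] for row in X]  # scale each feature by its weight
centers, assign = kmeans2(Xw)
ess = min((0, 1), key=assign.count)  # assumption: the smaller cluster holds the essential genes
scores = [membership(p, centers[ess], centers[1 - ess]) for p in Xw]
score = auc(scores, y)
print("weights:", [round(v, 3) for v in w], "AUC:", score)
```

The labels enter only the weighting and the final evaluation; the clustering step itself never sees them, which is the property that makes the approach usable when labels for the target genes are missing. The toy data is built to be cleanly separable, so its AUC says nothing about the 0.989 reported for Escherichia coli.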