A Robust Feature Selection Method for Novel Pre-microRNA Identification Using a Combination of Nucleotide-Structure Triplets
Petra Stepanowsky, Jihoon Kim, L. Ohno-Machado
2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology, 2012-09-27
DOI: 10.1109/HISB.2012.20 (https://doi.org/10.1109/HISB.2012.20)
MicroRNAs are a class of small non-coding RNAs that play an important role in the post-transcriptional regulation of gene products. Identifying novel microRNAs is difficult because the set of validated microRNAs is still small and diverse. Existing feature selection methods use different combinations of features related to microRNA biogenesis, but their performance evaluations are not comprehensive. We developed a robust feature selection method that combines three types of nucleotide-structure triplets, the minimum free energy of the secondary structure of precursor microRNAs, and other extracted characteristics. We compared our new combined feature set with three previously published sets using three different classifiers: logistic regression, support vector machine, and random forest. Our proposed feature set was not only robust across all three classifiers but also achieved the highest classification performance, as measured by the area under the ROC curve.
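The abstract does not spell out the three triplet types, so the following is only a minimal sketch of one encoding commonly used in the pre-microRNA literature, not the authors' implementation. It assumes each feature pairs the middle nucleotide of a three-position window with the paired/unpaired state of all three positions, giving 4 x 8 = 32 frequencies. The function name `triplet_features` and the dot-bracket input format are assumptions made for illustration.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# Each position is either paired "(" or unpaired "."; three positions give 8 patterns.
STRUCT_PATTERNS = ["".join(p) for p in product("(.", repeat=3)]
# 4 nucleotides x 8 structure patterns = 32 feature names such as "A((.".
FEATURE_NAMES = [n + s for n in NUCLEOTIDES for s in STRUCT_PATTERNS]


def triplet_features(sequence, structure):
    """Return normalized frequencies of the 32 nucleotide-structure triplets.

    sequence  -- RNA (or DNA) sequence of the candidate hairpin
    structure -- dot-bracket secondary structure of the same length
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    seq = sequence.upper().replace("T", "U")
    # Collapse ")" into "(" so that only paired vs. unpaired is recorded.
    states = structure.replace(")", "(")

    counts = dict.fromkeys(FEATURE_NAMES, 0)
    total = 0
    # Slide a window of width 3; the middle position supplies the nucleotide.
    for i in range(1, len(seq) - 1):
        key = seq[i] + states[i - 1:i + 2]
        if key in counts:  # skip windows with ambiguous bases or symbols
            counts[key] += 1
            total += 1

    return {k: (v / total if total else 0.0) for k, v in counts.items()}


if __name__ == "__main__":
    # Toy hairpin, not a real pre-microRNA.
    feats = triplet_features("GCAUGC", "((..))")
    for name, value in feats.items():
        if value > 0:
            print(name, value)
```

In a pipeline like the one the abstract describes, a vector of this kind would be concatenated with the minimum free energy and the other extracted characteristics before being passed to a classifier. The dot-bracket structure and the free energy would come from an RNA folding tool, which this sketch does not include.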