Gene Selection using Intelligent Dynamic Genetic Algorithm and Random Forest
Elham Pashaei, Elnaz Pashaei
2019 11th International Conference on Electrical and Electronics Engineering (ELECO), pp. 470-474, published 2019-11-01
DOI: 10.23919/ELECO47770.2019.8990557 (https://doi.org/10.23919/ELECO47770.2019.8990557)
Citations: 16
Abstract
Microarray gene expression data have provided a successful framework for investigating cancer and genetic diseases. Identifying cancer-related genes with feature selection methods is a central task in microarray analysis. However, selecting a small number of informative genes is challenging because of the curse of dimensionality in microarray datasets. This study introduces a new hybrid model based on the Intelligent Dynamic Genetic Algorithm (IDGA) and random forest to identify a small, meaningful set of genes for cancer classification. The random forest-based IDGA uses random forest not only to filter noisy and redundant genes but also in its fitness function. The proposed method was evaluated on two benchmark datasets, leukemia and colon cancer, and the top selected genes were reported. Experimental results show that the proposed method achieves excellent gene selection and classification performance compared with several recently proposed approaches.
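The abstract describes a wrapper-style search: a genetic algorithm proposes gene subsets and a fitness function scores them. The following is a minimal sketch of that general idea only, not the authors' IDGA. It is a plain generational genetic algorithm over binary gene masks, and the names `genetic_select` and `toy_fitness`, along with all parameter values, are hypothetical. The dynamic components of IDGA and the random forest filtering step are not reproduced. To keep the sketch free of third-party dependencies, the fitness is a toy stand-in; in a setting like the paper's, one would presumably pass a function that scores a mask by a random forest's estimated classification accuracy on the selected genes.

```python
import random
from typing import Callable, Sequence, Tuple

Mask = Tuple[int, ...]  # 1 = gene kept, 0 = gene dropped


def genetic_select(
    n_genes: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 30,
    generations: int = 40,
    crossover_rate: float = 0.8,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Mask:
    """Plain generational GA over binary gene masks (NOT the paper's IDGA)."""
    rng = random.Random(seed)
    population = [
        tuple(rng.randint(0, 1) for _ in range(n_genes)) for _ in range(pop_size)
    ]
    best = max(population, key=fitness)

    def tournament() -> Mask:
        # Binary tournament: pick two individuals, keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1
            # Bit-flip mutation, applied independently to each gene.
            child = tuple(
                bit ^ 1 if rng.random() < mutation_rate else bit for bit in child
            )
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


def toy_fitness(
    mask: Mask, informative: Sequence[int] = (2, 5, 7), penalty: float = 0.1
) -> float:
    """Hypothetical stand-in for a classifier-based score.

    Rewards keeping the (assumed) informative genes and penalizes subset size,
    mimicking the goal of a small, accurate gene set.
    """
    hits = sum(mask[i] for i in informative)
    return hits - penalty * sum(mask)


if __name__ == "__main__":
    best = genetic_select(12, toy_fitness)
    print("selected genes:", [i for i, bit in enumerate(best) if bit])
    print("fitness:", toy_fitness(best))
```

Because the fitness is passed in as a callable, the search loop stays independent of the classifier, so a random forest score could replace `toy_fitness` without changing `genetic_select`.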