A binary PSO feature selection algorithm for gene expression data
Authors: Suresh Dara, H. Banka
Venue: 2014 International Conference on Advances in Communication and Computing Technologies (ICACACT 2014)
Publication date: 2015-09-03
DOI: 10.1109/EIC.2015.7230734 (https://doi.org/10.1109/EIC.2015.7230734)
Citation count: 11
Abstract
A Binary Particle Swarm Optimization (BPSO) based feature selection algorithm is proposed for selecting important feature subsets from high-dimensional gene expression data. Because the data contain a large number of redundant features, a fast heuristic-based preprocessing strategy is used to reduce the features as an intermediate step. The data are first preprocessed to generate a distinction table, which is then used as input to BPSO for further feature selection. The fitness function is formulated within the PSO framework to handle two conflicting objectives: reducing feature cardinality and maintaining distinctive capability (i.e., classification accuracy). Three high-dimensional benchmark datasets are considered (colon cancer, lymphoma, and leukemia), and experimental results are reported with detailed comparative studies using a k-NN classifier.
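The pipeline described in the abstract (distinction table, then BPSO with a two-objective fitness) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the fitness formula, the transfer function, or any parameter values, so everything below is an assumption. In particular, the sketch assumes already-discretized feature values, a weighted-sum fitness with a hypothetical weight `alpha`, the standard sigmoid-based binary PSO update, and illustrative settings for `w`, `c1`, `c2`, and `v_max`. The heuristic preprocessing step and the k-NN evaluation are omitted.

```python
import math
import random


def distinction_table(samples, labels):
    """One row per pair of samples from different classes.

    Entry j is 1 when feature j takes different values on the two samples.
    Assumes feature values are already discretized.
    """
    rows = []
    for a in range(len(samples)):
        for b in range(a + 1, len(samples)):
            if labels[a] != labels[b]:
                rows.append([int(x != y) for x, y in zip(samples[a], samples[b])])
    return rows


def fitness(mask, table, alpha=0.9):
    """Assumed weighted sum: coverage of table rows vs. fraction of features dropped."""
    covered = sum(1 for row in table if any(m and r for m, r in zip(mask, row)))
    coverage = covered / len(table)
    sparsity = (len(mask) - sum(mask)) / len(mask)
    return alpha * coverage + (1 - alpha) * sparsity


def bpso(table, n_particles=20, n_iter=50, w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Sigmoid-based binary PSO over feature masks (parameter values are illustrative)."""
    rng = random.Random(seed)
    dim = len(table[0])
    pos = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, table) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(dim):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-v_max, min(v_max, v))  # clamp velocity
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))  # sigmoid transfer
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[i], table)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Toy data: four samples, four binary features, two classes (made up for illustration).
samples = [[0, 1, 0, 1], [0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 1, 0]]
labels = [0, 0, 1, 1]
table = distinction_table(samples, labels)
best_mask, best_fit = bpso(table)
print(best_mask, best_fit)
```

In this sketch a mask scores well when few features are selected and every pair of samples from different classes still differs on at least one selected feature. A selected subset would then be evaluated separately with a classifier such as k-NN, as the abstract describes.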