Predicting Protein Subcellular Localization: A Multiobjective PSO-based Feature Subset Selection from Amino Acid Sequence of Protein
Authors: M. Mandal, A. Mukhopadhyay
Published in: 2014 17th International Conference on Computer and Information Technology (ICCIT), vol. 18 1, pp. 251-255
Publication date: 2014-12-22
DOI: 10.1109/ICIT.2014.75 (https://doi.org/10.1109/ICIT.2014.75)
Abstract
In this article, the probable subcellular location of a protein is predicted using a feature selection technique based on multiobjective particle swarm optimization (MOPSO). The feature set is built from the different amino acid compositions of each protein, so the dataset is a matrix of protein samples versus amino acid composition features. The proposed algorithm is designed to find a subset of features that simultaneously maximizes feature relevance and minimizes feature redundancy. The algorithm is executed on the multiclass dataset to select a subset of features. Using the selected features, 10-fold cross-validation is applied, and the corresponding accuracy, F-score, entropy, representation entropy, and average correlation are calculated. The performance of the proposed method is compared with that of its single-objective versions, Sequential Forward Search, Sequential Backward Search, and minimum Redundancy Maximum Relevance with two schemes.
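
To make the two objectives concrete, the following is a minimal Python sketch of how a candidate feature subset (one particle's binary mask) might be scored. The abstract does not define the feature encoding or the relevance and redundancy measures, so everything here is an assumption for illustration: features are the single-residue composition over the 20 standard amino acids, relevance is the mean correlation ratio (eta squared) between each selected feature and the class label, and redundancy is the mean absolute pairwise Pearson correlation among selected features. The names `aa_composition`, `correlation_ratio`, `objectives`, and `dominates` are hypothetical and not taken from the paper.

```python
from itertools import combinations
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(sequence):
    """Fraction of each standard amino acid in a sequence (assumed encoding)."""
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlation_ratio(values, labels):
    """Eta squared: between-class sum of squares over total sum of squares."""
    n = len(values)
    grand = sum(values) / n
    total = sum((v - grand) ** 2 for v in values)
    if total == 0:
        return 0.0
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2
                  for g in groups.values())
    return between / total


def objectives(X, y, mask):
    """Return (relevance, redundancy) for a binary feature mask.

    Assumed definitions: relevance is to be maximized, redundancy minimized.
    """
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0, 0.0
    cols = {j: [row[j] for row in X] for j in selected}
    relevance = sum(correlation_ratio(cols[j], y) for j in selected) / len(selected)
    pairs = list(combinations(selected, 2))
    if not pairs:
        return relevance, 0.0
    redundancy = sum(abs(pearson(cols[a], cols[b])) for a, b in pairs) / len(pairs)
    return relevance, redundancy


def dominates(a, b):
    """True if pair a Pareto-dominates b (higher relevance, lower redundancy)."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


# Toy data, invented purely for illustration (not from the paper's dataset).
sequences = ["MKKLLA", "MKKLAA", "GGSGGS", "GSGGSG"]
labels = ["nucleus", "nucleus", "membrane", "membrane"]
X = [aa_composition(s) for s in sequences]
mask_kg = [aa in "KG" for aa in AMINO_ACIDS]  # select the K and G features
mask_k = [aa == "K" for aa in AMINO_ACIDS]    # select the K feature only
print(objectives(X, labels, mask_kg), objectives(X, labels, mask_k))
```

In a multiobjective search, a swarm would evaluate many such masks and retain the non-dominated ones; `dominates` shows only the comparison step, not the particle update rules, which the abstract does not specify.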