{"title":"Classification of proteins in intracellular and secretory pathway using global descriptors of amino acid sequence","authors":"G. Govindan, A. Nair","doi":"10.1109/WICT.2011.6141236","DOIUrl":null,"url":null,"abstract":"It is widely recognized that the information from the amino acid sequence can serve as crucial pointers in predicting subcellular location of proteins. We introduce a new feature vector for predicting proteins targeted to various compartments in the intracellular and secretory pathway from protein sequence. Features are based on the global Composition, Transition and Distribution (CTD) of amino acid attributes such as hydrophobicity, normalized van der Waals volume, polarity, polarizability, charge, secondary structure and solvent accessibility. Sequences are considered in three equal parts and the features are extracted separately for all the three parts. Based on the feature vectors, we have trained a Support Vector Machine to classify intracellular and secretory proteins. Our method gives an accuracy of 92% in human, 88% in plant and 95% in fungi with independent dataset at root level of the protein sorting pathway.","PeriodicalId":178645,"journal":{"name":"2011 World Congress on Information and Communication Technologies","volume":"72 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 World Congress on Information and Communication Technologies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WICT.2011.6141236","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
It is widely recognized that information in the amino acid sequence provides important cues for predicting the subcellular location of proteins. We introduce a new feature vector for predicting, from the protein sequence alone, which compartments of the intracellular and secretory pathway a protein is targeted to. The features are based on the global Composition, Transition and Distribution (CTD) of amino acid attributes such as hydrophobicity, normalized van der Waals volume, polarity, polarizability, charge, secondary structure and solvent accessibility. Each sequence is divided into three equal parts, and the features are extracted separately for each part. Using these feature vectors, we trained a Support Vector Machine to classify intracellular and secretory proteins. On an independent dataset, at the root level of the protein sorting pathway, our method achieves an accuracy of 92% for human, 88% for plant and 95% for fungal proteins.
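To make the feature construction concrete, the following is a minimal Python sketch of CTD extraction over three sequence segments for a single attribute, hydrophobicity. It is an illustration under stated assumptions, not the authors' implementation. The three-class residue grouping is a commonly used CTD convention and may differ from the one used in the paper. The split rule for lengths not divisible by three, the handling of unknown residues, and the quantile convention in the Distribution step are also assumptions. The function names (`encode`, `ctd`, `split_three`, `segment_features`) are hypothetical.

```python
from itertools import combinations
from math import ceil

# Assumed three-class hydrophobicity grouping (a common CTD convention);
# the abstract does not state the grouping the authors used.
HYDROPHOBICITY_CLASSES = {
    "1": "RKEDQN",   # polar
    "2": "GASTPHY",  # neutral
    "3": "CLVIMFW",  # hydrophobic
}


def encode(seq, classes=HYDROPHOBICITY_CLASSES):
    """Replace each residue by its class label; unknown residues are dropped."""
    lookup = {aa: label for label, group in classes.items() for aa in group}
    return "".join(lookup[aa] for aa in seq.upper() if aa in lookup)


def ctd(encoded, labels="123"):
    """Return 21 values: 3 composition + 3 transition + 15 distribution."""
    n = len(encoded)
    if n == 0:
        return [0.0] * 21

    # Composition: fraction of residues in each class.
    comp = [encoded.count(label) / n for label in labels]

    # Transition: fraction of adjacent pairs that switch between two classes,
    # counted in either direction.
    trans = []
    for a, b in combinations(labels, 2):
        switches = sum(1 for x, y in zip(encoded, encoded[1:]) if {x, y} == {a, b})
        trans.append(switches / (n - 1) if n > 1 else 0.0)

    # Distribution: relative position of the first, 25%, 50%, 75% and last
    # occurrence of each class (ceiling-based quantile index, an assumption).
    dist = []
    for label in labels:
        positions = [i + 1 for i, c in enumerate(encoded) if c == label]
        if not positions:
            dist.extend([0.0] * 5)
            continue
        k = len(positions)
        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            idx = max(ceil(frac * k), 1) - 1
            dist.append(positions[idx] / n)

    return comp + trans + dist


def split_three(seq):
    """Split into three near-equal parts (boundary rule is an assumption)."""
    n = len(seq)
    i, j = n // 3, (2 * n) // 3
    return [seq[:i], seq[i:j], seq[j:]]


def segment_features(seq):
    """Concatenate per-segment CTD vectors: 3 segments x 21 values = 63."""
    features = []
    for part in split_three(seq):
        features.extend(ctd(encode(part)))
    return features
```

Calling `segment_features` on a protein sequence yields 63 values for this one attribute. Repeating the procedure with a class grouping for each of the other attributes and concatenating the results would give a full feature vector of the kind that could be passed to a Support Vector Machine.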