E. Marchiori, W. Pirovano, J. Heringa, K. Feenstra
{"title":"A Feature Selection Algorithm for Detecting Subtype Specific Functional Sites from Protein Sequences for Smad Receptor Binding","authors":"E. Marchiori, W. Pirovano, J. Heringa, K. Feenstra","doi":"10.1109/ICMLA.2006.7","DOIUrl":null,"url":null,"abstract":"Multiple sequence alignments are often used to reveal functionally important residues within a protein family. In particular, they can be very useful for identification of key residues that determine functional differences between protein subclasses (subtype specific sites). This paper proposes a new algorithm for selecting subtype specific sites from a set of aligned protein sequences. The algorithm combines a feature selection technique with neighbor position information for selecting and ranking a set of putative relevant sites. The algorithm is applied to a dataset of protein sequences from the MH2 domain of the SMAD family of transcriptor factors. Validation of the results on the basis of the known interaction and function of the sites shows that the algorithm successfully identifies the known (from literature) subtype specific sites and new putative ones","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2006.7","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5
Abstract
Multiple sequence alignments are often used to reveal functionally important residues within a protein family. In particular, they can be very useful for identification of key residues that determine functional differences between protein subclasses (subtype specific sites). This paper proposes a new algorithm for selecting subtype specific sites from a set of aligned protein sequences. The algorithm combines a feature selection technique with neighbor position information for selecting and ranking a set of putative relevant sites. The algorithm is applied to a dataset of protein sequences from the MH2 domain of the SMAD family of transcriptor factors. Validation of the results on the basis of the known interaction and function of the sites shows that the algorithm successfully identifies the known (from literature) subtype specific sites and new putative ones