Multiple Functions Prediction of Yeast Saccharomyces Cerevisiae Proteins using Protein Interaction Information, Sequence Similarity and FunCat Taxonomy
Sovan Saha, P. Chatterjee, Subhadip Basu, M. Nasipuri
2020 IEEE 1st International Conference for Convergence in Engineering (ICCE), published 2020-09-05. DOI: 10.1109/ICCE50343.2020.9290574 (https://doi.org/10.1109/ICCE50343.2020.9290574)
Abstract
Protein function prediction is a challenging task for the research community because it can be characterized as a multi-label, hierarchical multi-class classification problem. The problem is further complicated by several difficulties: 1) multiple functional groups, each with a different degree of confidence, can be associated with a single protein; 2) multiple types of data are collected from heterogeneous sources and are not integrated; 3) functional groups stand in a hierarchical relationship rather than being independent; 4) functional annotations of proteins are incomplete or missing; 5) functional groups are imbalanced in proportion; 6) experimentally or computationally predicted biological data can lead to misleading inference because of false positives; 7) artificially created, heuristic-driven negative samples (for example, non-interacting protein data) are of uncertain reliability. Considering these factors, this paper performs protein functional annotation using protein interaction information and sequence similarity, exploiting the hierarchical relationships among functional groups as defined by the FunCat taxonomy. The work uses protein interaction data annotated according to the MIPS Functional Catalogue (FunCat) taxonomy.
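To make the hierarchical, multi-label setting concrete, the following is a minimal sketch and not the authors' method. It assumes FunCat categories are written as dot-separated codes (for example `01.01.03`, whose ancestors are `01.01` and `01`) and uses a simple neighbour-voting heuristic over an interaction network. The protein names, the toy annotations, and the functions `funcat_ancestors` and `predict_functions` are all hypothetical, and sequence similarity is left out for brevity.

```python
from collections import Counter

def funcat_ancestors(code):
    """Return a FunCat code together with all of its ancestor categories.

    Example: '01.01.03' -> ['01', '01.01', '01.01.03'].
    """
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]

def predict_functions(protein, interactions, annotations, top_k=2):
    """Rank candidate FunCat categories for `protein` by neighbour votes.

    Each interacting neighbour votes once for every category it is
    annotated with, and once for each ancestor of those categories,
    so the prediction respects the taxonomy's hierarchy.
    """
    votes = Counter()
    for neighbour in interactions.get(protein, ()):
        expanded = set()
        for code in annotations.get(neighbour, ()):
            expanded.update(funcat_ancestors(code))
        votes.update(expanded)  # one vote per neighbour per category
    # Sort by descending vote count, then by code for a stable order.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]

# Hypothetical toy data: P1 is unannotated and has three annotated partners.
interactions = {"P1": ["P2", "P3", "P4"]}
annotations = {
    "P2": ["01.01.03"],
    "P3": ["01.01.06", "10.01"],
    "P4": ["10.01"],
}

if __name__ == "__main__":
    for code, count in predict_functions("P1", interactions, annotations, top_k=4):
        print(code, count)
```

Because every annotation also votes for its ancestors, a protein can receive several labels at different depths of the taxonomy, which is the multi-label, hierarchical behaviour the abstract describes. A fuller approach would also weight votes by interaction confidence and sequence similarity.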