Yong-Cui Wang, X. Ren, Chunhua Zhang, N. Deng, Xiang-Sun Zhang
2011 IEEE International Conference on Systems Biology (ISB), published 2011-10-03. DOI: https://doi.org/10.1109/ISB.2011.6033124
Evaluating the denoising techniques in protein-protein interaction prediction
The past decades have witnessed extensive efforts to study the relationships among proteins. In particular, sequence-based prediction of protein-protein interactions (PPIs) is fundamentally important for speeding up the mapping of organisms' interactomes. Composition vectors are usually constructed to encode proteins as real-valued vectors, which are then fed to a machine learning framework. However, the values of a composition vector may be highly correlated with the distribution of amino acids, i.e., amino acids that are frequently observed in nature tend to have large composition vector values. A formulation that estimates this noise may therefore be needed when building the representation. Here, we introduce two kinds of denoising composition vectors, which are effective in the construction of phylogenetic trees, to eliminate the noise. When these two denoising composition vectors are validated on Escherichia coli (E. coli) and Saccharomyces cerevisiae (S. cerevisiae) datasets with random and artificial negative samples, respectively, the predictive performance is not improved and is even worse than that of non-denoised prediction. These results suggest that a denoising formulation that is effective for phylogenetic tree construction cannot improve PPI prediction; that is, what counts as noise depends on the application.
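To make the idea of a composition vector and its dependence on amino-acid frequencies concrete, the following is a minimal Python sketch. It is not the authors' formulation: the abstract does not give the two denoising formulas, so the relative-deviation form `(f - f0) / f0`, the function names, the toy sequence, and the uniform background used here are all assumptions chosen for illustration only.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_vector(seq, k=1):
    """Relative frequency of every length-k amino-acid string in seq."""
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def denoised_vector(seq, background, k=1):
    """Relative deviation (f - f0) / f0 from a background frequency f0.

    ASSUMPTION: this form is illustrative; the source does not state
    which denoising formulas were evaluated.
    """
    f = composition_vector(seq, k)
    return {
        m: ((f[m] - background[m]) / background[m] if background[m] > 0 else 0.0)
        for m in f
    }


# Hypothetical toy input: a short made-up sequence and a uniform background.
seq = "MKKLLA"
raw = composition_vector(seq)
uniform = {a: 1 / 20 for a in AMINO_ACIDS}
den = denoised_vector(seq, uniform)
```

In practice the background frequencies would be estimated from a large protein collection rather than assumed uniform. The raw vector rewards residues that are simply common, while the denoised vector reports how far each frequency departs from the background; the abstract's finding is that removing this background did not help PPI prediction.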