Detecting Phosphorylation Determined Active Protein Interaction Network during Cancer Development by Robust Network Component Analysis
T. Zeng, Ziming Wang, Luonan Chen
Data and Text Mining in Bioinformatics, published 2014-11-07
DOI: 10.1145/2665970.2665991 (https://doi.org/10.1145/2665970.2665991)
Citations: 0
Abstract
Motivation: In recent disease studies, many key pathogenic genes and proteins have been found to show no significant differential expression, so they tend to be overlooked by conventional differential expression analysis or network analysis. Meanwhile, activity inferred computationally (dry experiments), rather than expression measured in wet experiments, has been proposed as a more effective estimate of the actual regulatory power of such important biomolecules, e.g. transcription factors. However, it is still unknown which hidden factor (e.g. phosphorylation) determines this inferred regulatory power, or activity, and how it does so [1]. For the study of cancer development in particular, there is an urgent need to reconstruct the active protein interaction network and to detect the underlying phosphorylation pattern in a dynamic manner [2-7].

Methods: Using the c-Myc mouse model of liver cancer, we first collected protein expression and protein phosphorylation data at several developmental time points. We then constructed a rough background protein interaction network using conditional mutual information. Next, we improved the robustness of conventional network component analysis and used the resulting approach, RNCA (Robust Network Component Analysis), to simultaneously reconstruct the time-dependent protein interaction networks and estimate the activity of the target proteins at different time points. Finally, considering the differing experimental quality of the protein expression and phosphorylation data, we used canonical correlation analysis to detect the maximal correlation between the expression and the phosphorylation of a group of proteins (e.g. a protein network module), which could reveal the active protein sub-network and identify phosphorylation as its determining factor.

Results: In a preliminary study, we evaluated the robustness of RNCA by comparing it with other conventional methods. On the real biological data, we identified the rewired protein interaction network during cancer development, its corresponding active proteins, and the protein phosphorylation that drives them. This work can be further applied to the early diagnosis of diseases by means of edge biomarkers [1-2], network biomarkers [3-4], and dynamical network biomarkers [5-7].
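
To make the network component analysis step in the Methods more concrete, the following is a minimal sketch of the conventional (non-robust) formulation, in which an expression matrix E is factored as E ≈ A P, with the connectivity matrix A restricted to the edges of a background network and P holding the inferred activities over time. The robust variant (RNCA) is not specified in this abstract, so the sketch does not attempt to reproduce it. The function name `nca_als`, the alternating least-squares scheme, the matrix shapes, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def nca_als(E, mask, n_iter=200, seed=0):
    """Conventional NCA sketch: fit E ~ A @ P by alternating least squares.

    E    : (n_proteins, n_timepoints) expression matrix.
    mask : (n_proteins, n_regulators) boolean array; True where the
           background network allows an edge (assumed to be given).
    Returns A (connectivity strengths) and P (regulator activities).
    """
    rng = np.random.default_rng(seed)
    n_proteins = E.shape[0]
    # Random start for A that respects the allowed edges.
    A = np.where(mask, rng.standard_normal(mask.shape), 0.0)
    for _ in range(n_iter):
        # Step 1: with A fixed, solve for the activities P.
        P, *_ = np.linalg.lstsq(A, E, rcond=None)
        # Step 2: with P fixed, refit each row of A on its allowed edges only.
        for i in range(n_proteins):
            idx = np.flatnonzero(mask[i])
            if idx.size == 0:
                continue  # protein with no allowed regulator stays at zero
            coef, *_ = np.linalg.lstsq(P[idx].T, E[i], rcond=None)
            A[i, idx] = coef
    return A, P


# Synthetic example (hypothetical data): 6 proteins, 2 regulators, 8 time points.
rng = np.random.default_rng(1)
mask = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [1, 1]], dtype=bool)
A_true = np.where(mask, rng.standard_normal(mask.shape), 0.0)
P_true = rng.standard_normal((2, 8))
E = A_true @ P_true

A_hat, P_hat = nca_als(E, mask)
rel_err = np.linalg.norm(E - A_hat @ P_hat) / np.linalg.norm(E)
print("relative reconstruction error:", rel_err)
```

The factorization is identifiable only up to a scaling of each regulator, so the recovered activities in `P_hat` are best compared across time points rather than read as absolute values. In the pipeline described above, `mask` would come from the background network built with conditional mutual information, and the rows of `P_hat` would be the quantities later related to phosphorylation data by canonical correlation analysis.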