Signaling pathways are often represented by networks where each node corresponds to a protein and each edge corresponds to a relationship between nodes such as activation, inhibition and binding. However, such signaling pathways in a cell may be affected by genetic and epigenetic alteration. Some edges may be deleted and some edges may be newly added. The current knowledge about known signaling pathways is available on some public databases, but most of the signaling pathways including changes upon the cell state alterations remain largely unknown. In this paper, we develop an integer programming-based method for inferring such changes by using gene expression data. We test our method on its ability to reconstruct the pathway of colorectal cancer in the KEGG database.
We propose a statistical model realizing simultaneous estimation of gene regulatory network and gene module identification from time series gene expression data from microarray experiments. Under the assumption that genes in the same module are densely connected, the proposed method detects gene modules based on the variational Bayesian technique. The model can also incorporate existing biological prior knowledge such as protein subcellular localization. We apply the proposed model to the time series data from a synthetically generated network and verified the effectiveness of the proposed model. The proposed model is also applied the time series microarray data from HeLa cell. Detected gene module information gives the great help on drawing the estimated gene network.
A popular means of modeling metabolic networks is through identifying frequently observed pathways. However the definition of what constitutes an observation of a pathway and how to evaluate the importance of identified pathways remains unclear. In this paper we investigate different methods for defining an observed pathway and evaluate their performance with pathway classification models. We use three methods for defining an observed pathway; a path in gene over-expression, a path in probable gene over-expression and a path of most accurate classification. The performance of each definition is evaluated with three classification models; a probabilistic pathway classifier - HME3M, logistic regression and SVM. The results show that defining pathways using the probability of gene over-expression creates stable and accurate classifiers. Conversely we also show defining pathways of most accurate classification finds a severely biased pathways that are unrepresentative of underlying microarray data structure.