DNA replication is a fundamental process that is tightly regulated during the cell cycle. In budding yeast it starts from multiple origins of replication and proceeds in a timely fashion according to a reproducible temporal program until the entire DNA is replicated exactly once per cell cycle. In this program an origin seems to have an inherent firing probability at a specific time in S-phase that is conserved over the population. However, what exactly determines the origin initiation time remains obscure. In this work, we analyze the gene content that clusters around replication origins, following the assumption that the inherent origin properties that determine staggered initiation times could be mirrored in the immediate vicinity of the origin. We perform a Gene Ontology term enrichment test and find that metabolic genes are significantly over-represented in the regions close to the starting points of DNA replication. Furthermore, functional analysis reveals that catabolic genes cluster around early firing origins, whereas anabolic genes tend to be found in the proximity of late firing origins of replication. We speculate that, in budding yeast, gene function around replication origins correlates with their intrinsic probability to initiate DNA replication at a given point in S-phase.
Thomas W Spiesser, Edda Klipp. Different groups of metabolic genes cluster around early and late firing origins of replication in budding yeast. Genome informatics. International Conference on Genome Informatics, vol. 24, pp. 179-92 (2010).
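The Gene Ontology term enrichment test mentioned in the abstract above is often carried out as a one-sided hypergeometric test. The abstract does not say which test the authors used, so the following is only a minimal sketch under that assumption. The function name and all gene counts are hypothetical and chosen purely for illustration.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the genome (background)
    K: background genes annotated with the GO term
    n: genes in the region of interest (e.g. near origins)
    k: genes in the region annotated with the GO term
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability mass of observing k or more annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    p = hypergeom_enrichment_p(N=6000, K=1500, n=400, k=130)
    print(f"one-sided enrichment p-value: {p:.3g}")
```

In practice the p-values for many GO terms would also need a multiple-testing correction, which this sketch leaves out.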
Pub Date: 2010-01-01. DOI: 10.1142/9781848166585_0012
Anupama Reddy, Conway C. Huang, Huiqing Liu, C. DeLisi, M. Nevalainen, S. Szalma, G. Bhanot
We develop a general method to identify gene networks from pair-wise correlations between genes in a microarray data set and apply it to a public prostate cancer gene expression data set from 69 primary prostate tumors. We define the degree of a node as the number of genes significantly associated with the node and identify hub genes as those with the highest degree. The correlation network was pruned using transcription factor binding information in VisANT (http://visant.bu.edu/) as a biological filter. The reliability of hub genes was determined using a strict permutation test. Separate networks for normal prostate samples, and prostate cancer samples from African Americans (AA) and European Americans (EA) were generated and compared. We found that the same hubs control disease progression in AA and EA networks. Combining AA and EA samples, we generated networks for low (<7) and high (≥7) Gleason grade tumors. A comparison of their major hubs with those of the network for normal samples identified two types of changes associated with disease: (i) Some hub genes increased their degree in the tumor network compared to their degree in the normal network, suggesting that these genes are associated with gain of regulatory control in cancer (e.g. possible turning on of oncogenes). (ii) Some hubs reduced their degree in the tumor network compared to their degree in the normal network, suggesting that these genes are associated with loss of regulatory control in cancer (e.g. possible loss of tumor suppressor genes). A striking result was that for both AA and EA tumor samples, STAT5a, CEBPB and EGR1 are major hubs that gain neighbors compared to the normal prostate network. Conversely, HIF-1α is a major hub that loses connections in the prostate cancer network compared to the normal prostate network. We also find that the degree of these hubs changes progressively from normal to low grade to high grade disease, suggesting that these hubs are master regulators of prostate cancer and mark disease progression. STAT5a was identified as a central hub, with ~120 neighbors in the prostate cancer network and only 81 neighbors in the normal prostate network. Of the 120 neighbors of STAT5a, 57 are known cancer-related genes involved in functional pathways associated with tumorigenesis. Our method is general and can easily be extended to identify and study networks associated with any two phenotypes.
Anupama Reddy, Conway C. Huang, Huiqing Liu, C. DeLisi, M. Nevalainen, S. Szalma, G. Bhanot. Robust gene network analysis reveals alteration of the STAT5a network as a hallmark of prostate cancer. Genome informatics. International Conference on Genome Informatics, vol. 29 1, pp. 139-53 (2010).
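The notion of node degree and hub genes in a correlation network, as described in the abstract above, can be illustrated with a small sketch. It is an assumption-laden toy: a fixed threshold on the absolute Pearson correlation stands in for the significance and permutation tests the authors describe, and the gene names, expression values and function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def node_degrees(expr, threshold=0.8):
    """Count, per gene, the partners with |r| >= threshold.

    The fixed threshold is a stand-in for a proper significance test.
    """
    degree = {gene: 0 for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            degree[g1] += 1
            degree[g2] += 1
    return degree


if __name__ == "__main__":
    # Hypothetical expression profiles across four samples.
    expr = {
        "A": [1, 2, 3, 4],
        "B": [2, 4, 6, 8],
        "C": [4, 3, 2, 1],
        "D": [1, -1, 1, -1],
    }
    degrees = node_degrees(expr)
    top = max(degrees.values())
    hubs = [g for g, d in degrees.items() if d == top]
    print(degrees, hubs)
```

Comparing the degree of the same gene between two such networks (for example one built from tumor samples and one from normal samples) gives the kind of gain or loss of neighbors the abstract reports.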
Pub Date: 2010-01-01. DOI: 10.1142/9781848166585_0014
Teppei Shimamura, S. Imoto, Masao Nagasaki, Mai Yamauchi, R. Yamaguchi, André Fujita, Y. Tamada, N. Gotoh, S. Miyano
One of the open problems in systems biology is to infer dynamic gene networks describing the underlying biological process with mathematical, statistical and computational methods. First-order difference equation-based models such as dynamic Bayesian networks and vector autoregressive models have been used to infer time-lagged relationships between genes from time-series microarray data. However, two primary problems greatly reduce the effectiveness of current approaches. The first problem is the tacit assumption that the time lag is stationary. The second is the inseparability of measurement noise and process noise (unmeasured disturbances that propagate through the time process). To address these problems, we propose a stochastic differential equation model for inferring continuous-time dynamic gene networks in the situation where both process noise and observation noise are present. We present a collocation-based sparse estimation for simultaneous parameter estimation and model selection in the model. The collocation-based approach requires considerably less computational effort than traditional methods in ordinary stochastic differential equation models. The proposed method also makes it easy to incorporate various kinds of biological knowledge to improve estimation accuracy. The results using simulated data and real time-series expression data of human primary small airway epithelial cells demonstrate that the proposed approach outperforms competing approaches and can identify significant genes influenced by gefitinib.
Teppei Shimamura, S. Imoto, Masao Nagasaki, Mai Yamauchi, R. Yamaguchi, André Fujita, Y. Tamada, N. Gotoh, S. Miyano. Collocation-based sparse estimation for constructing dynamic gene networks. Genome informatics. International Conference on Genome Informatics, vol. 9 1, pp. 164-78 (2010).
Pub Date: 2010-01-01. DOI: 10.1142/9781848165786_0012
T. Schendel, M. Falcke
We present here an efficient but detailed approach to modelling Ca(2+)-induced Ca(2+) release in the diadic cleft of cardiac ventricular myocytes. In this framework we developed a spatially resolved Ca(2+) release unit (CaRU), consisting of the junctional sarcoplasmic reticulum and the diadic cleft, with a well-defined channel placement. By taking advantage of time scale separation, the model could finally be reduced to a single ordinary differential equation describing Ca(2+) fluxes and diffusion. Additionally, the channel gating is described stochastically. The resulting model is able to reproduce experimental findings such as the gradedness of SR release, the voltage dependence of ECC gain and the typical spark lifetime. Owing to its numerical efficiency, the model is suitable for whole-cell simulations. The approach we intend to use to extend the developed CaRU to such a whole-cell model is outlined in this work.
T. Schendel, M. Falcke. Efficient and detailed model of the local Ca2+ release unit in the ventricular cardiac myocyte. Genome informatics. International Conference on Genome Informatics, vol. 114 1, pp. 142-55 (2010).
Pub Date: 2010-01-01. DOI: 10.1142/9781848165786_0004
Timothy Hancock, Hiroshi Mamitsuka
A popular means of modeling metabolic networks is through identifying frequently observed pathways. However, what constitutes an observation of a pathway and how to evaluate the importance of identified pathways remain unclear. In this paper we investigate different methods for defining an observed pathway and evaluate their performance with pathway classification models. We use three methods for defining an observed pathway: a path in gene over-expression, a path in probable gene over-expression and a path of most accurate classification. The performance of each definition is evaluated with three classification models: a probabilistic pathway classifier (HME3M), logistic regression and SVM. The results show that defining pathways using the probability of gene over-expression creates stable and accurate classifiers. Conversely, we also show that defining pathways by most accurate classification finds severely biased pathways that are unrepresentative of the underlying microarray data structure.
Timothy Hancock, Hiroshi Mamitsuka. Active pathway identification and classification with probabilistic ensembles. Genome informatics. International Conference on Genome Informatics, vol. 179 1, pp. 30-40 (2010).
Glycerophospholipids are major structural lipids in cellular membrane systems and play key roles as suppliers of the first and second messengers in signal transduction and molecular recognition processes. The distribution of lipid components differs among organelles and cells. The distribution is controlled by two pathways in lipid metabolism: the de novo and remodeling pathways. Glycerophospholipids including arachidonic and stearic acids are mostly produced in the remodeling pathway, whereas lipid chains are reconstructed from those synthesized in the de novo pathway. Recently lysophospholipid acyltransferases have been isolated as key enzymes in the remodeling pathway, and the substrate specificity has been investigated in terms of the chemical substructures of glycerophospholipids, such as the type of head groups and the length of aliphatic chains. These experimental studies have been reported for specific organisms, and only two representative sequence motifs are known for acyltransferases: a general pattern and the pattern for membrane-bound O-acyltransferase (MBOAT). Here we attempt to correlate the sequence patterns and the substrate specificity of lysophospholipid acyltransferases in 89 eukaryotic genomes in order to understand the roles of this enzyme family and the underlying glycerophospholipid structural variations. Using phylogenetic and domain analyses, the lysophospholipid acyltransferase family was divided into 18 subtypes. Furthermore, we examined the occurrence of the identified subtypes in eukaryotic genomes, and found an expansion of these subtypes in vertebrates. These findings may provide clues to understanding structural variations and distributions of glycerophospholipids in different organisms.
Michihiro Tanaka, Yuki Moriya, Susumu Goto, Minoru Kanehisa. Analysis of a lipid biosynthesis protein family and phospholipid structural variations. Genome informatics. International Conference on Genome Informatics, vol. 22, pp. 191-201 (2010).
Yang Zhao, Takeyuki Tamura, Morihiro Hayashida, Tatsuya Akutsu
Over several decades, many methods have been developed for predicting organic synthesis paths. However, these methods do not run in polynomial time. In this paper, we propose a bottom-up dynamic programming algorithm to predict synthesis paths of target tree-structured compounds. In this approach, we transform the synthesis problem for tree-structured compounds into the problem of generating unordered trees by regarding tree-structured compounds and chemical reactions as unordered trees and rules, respectively. To represent rules corresponding to chemical reactions, we employ a subclass of NLC (Node Label Controlled) grammars. We also give some computational results for this algorithm.
Yang Zhao, Takeyuki Tamura, Morihiro Hayashida, Tatsuya Akutsu. A dynamic programming algorithm to predict synthesis processes of tree-structured compounds with graph grammar. Genome informatics. International Conference on Genome Informatics, vol. 24, pp. 218-29 (2010).
Pub Date: 2010-01-01. DOI: 10.1142/9781848165786_0001
J. von Eichborn, S. Günther, R. Preissner
Solved structures of protein-protein complexes give fundamental insights into protein function and molecular recognition. Although the determination of protein-protein complexes is generally more difficult than solving individual proteins, the number of experimentally determined complexes increased conspicuously during the last decade. Here, the interfaces of 750 transient protein-protein interactions as well as 2,000 interactions between domains of the same protein chain (obligate interactions) were analyzed to obtain a better understanding of molecular recognition and to identify features applicable to protein binding site prediction. Calculation of knowledge-based potentials showed a preference for contacts between amino acids having complementary physicochemical properties. The analysis of amino acid conservation over the entire interface area showed a weak but significant tendency toward higher evolutionary conservation of protein binding sites compared to surface areas that are permanently exposed to solvent. Remarkably, contact frequencies between outstandingly conserved residues are much higher than expected, confirming the so-called "hot spot" theory. The comparisons between obligate and transient domain contacts reveal differences and indicate that structural diversification and molecular recognition in protein-protein interactions are subject to different evolutionary constraints than obligate domain-domain interactions.
{"title":"Structural features and evolution of protein-protein interactions.","authors":"J. von Eichborn, S. Günther, R. Preissner","doi":"10.1142/9781848165786_0001","DOIUrl":"https://doi.org/10.1142/9781848165786_0001","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"55 1","pages":"1-10"},"publicationDate":"2010-01-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"83543220"}
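The knowledge-based potentials mentioned in the abstract above are often computed as a log-odds score that compares observed contact counts with the counts expected by chance. The following is a minimal Python sketch under stated assumptions: it uses an inverse-Boltzmann form, a composition-based reference state, and a pseudocount, none of which is confirmed as the authors' procedure. The function name `contact_potentials` and the input format are hypothetical.

```python
import math
from collections import Counter


def contact_potentials(contacts, pseudocount=1.0):
    """Score residue-type pairs as -ln(observed / expected) contact counts.

    contacts: list of (residue_a, residue_b) pairs observed across interfaces.
    The expected count assumes residues pair at random according to their
    frequency among contacting residues (an illustrative reference state).
    Negative scores mark pairs seen more often than expected.
    """
    # Count each unordered pair once, e.g. (LYS, GLU) and (GLU, LYS) together.
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    residue_counts = Counter(res for pair in contacts for res in pair)
    total_pairs = len(contacts)
    total_residues = 2 * total_pairs

    potentials = {}
    residues = sorted(residue_counts)
    for i, a in enumerate(residues):
        for b in residues[i:]:
            freq_a = residue_counts[a] / total_residues
            freq_b = residue_counts[b] / total_residues
            # Unordered pairs of two different types can arise in two ways.
            p_expected = freq_a * freq_b if a == b else 2 * freq_a * freq_b
            expected = p_expected * total_pairs
            observed = pair_counts.get((a, b), 0)
            # The pseudocount keeps the logarithm finite for unobserved pairs.
            ratio = (observed + pseudocount) / (expected + pseudocount)
            potentials[(a, b)] = -math.log(ratio)
    return potentials


if __name__ == "__main__":
    # Toy data only: oppositely charged residues are made to pair frequently.
    toy = [("LYS", "GLU")] * 6 + [("LYS", "LYS"), ("GLU", "GLU")]
    for pair, score in sorted(contact_potentials(toy).items()):
        print(pair, round(score, 3))
```

With real data the reference state matters a great deal; surface composition or interface propensities are common alternatives to the simple frequency product used here.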
DNA replication is restricted to a specific time window of the cell cycle, called S phase. Successful progression through S phase requires replication to be properly regulated to ensure that the entire genome is duplicated exactly once, without errors, in a timely fashion. As a result, DNA replication has evolved into a tightly regulated process involving the coordinated action of numerous factors that function in all phases of the cell cycle. Biochemical mechanisms driving the eukaryotic cell division cycle have been the subject of a number of mathematical models. However, the cell cycle networks reported in the literature so far have not addressed the individual steps of DNA replication. In particular, the assembly of the replication machinery is crucial for the timing of S phase. This process, called "initiation", begins in late M / early G1 of the cell cycle with the assembly of the pre-replicative complex (pre-RC) at the origins of replication on the DNA. Its activation depends on the availability of different kinase complexes, cyclin-dependent kinases (CDKs) and Dbf4-dependent kinase (DDK), which phosphorylate specific components of the pre-RC to convert it into the pre-initiation complex (pre-IC). We have developed an ODE-based model of the network responsible for this process in budding yeast using mass-action kinetics. We considered all steps from the assembly of the first components at the DNA replication origin up to the active replisome that recruits the polymerases, and checked the simulated dynamics against the available literature data. Our results highlight the link between the activation of CDK and DDK and the step-by-step formation of both the pre-RC and the pre-IC, suggesting that S-CDK (Cdk1-Clb5,6) is the main regulator of the process.
{"title":"Kinetic modelling of DNA replication initiation in budding yeast.","authors":"Matteo Barberis, Thomas W Spiesser, Edda Klipp","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"24","pages":"1-20"},"publicationDate":"2010-01-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"30251333"}
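The mass-action ODE approach described in the abstract above can be illustrated with a deliberately small toy system. The sketch below is not the authors' model: it lumps the pathway into two hypothetical reactions (origin + licensing factor → pre-RC, then kinase-dependent conversion of pre-RC → pre-IC), treats kinase activity as a constant, and uses arbitrary rate constants. All names and parameter values are assumptions chosen for illustration.

```python
def rhs(state, k_assembly, k_activation, kinase):
    """Mass-action rates for a two-step toy scheme (illustrative only).

    origin + factor -> pre_rc   at rate k_assembly * origin * factor
    pre_rc          -> pre_ic   at rate k_activation * kinase * pre_rc
    """
    origin, factor, pre_rc, pre_ic = state
    v_assembly = k_assembly * origin * factor
    v_activation = k_activation * kinase * pre_rc
    return (-v_assembly, -v_assembly, v_assembly - v_activation, v_activation)


def simulate(state, t_end, dt=0.01, k_assembly=1.0, k_activation=0.5, kinase=1.0):
    """Integrate the toy ODEs with a fixed-step classical Runge-Kutta scheme."""
    args = (k_assembly, k_activation, kinase)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(state, *args)
        k2 = rhs(tuple(s + 0.5 * dt * d for s, d in zip(state, k1)), *args)
        k3 = rhs(tuple(s + 0.5 * dt * d for s, d in zip(state, k2)), *args)
        k4 = rhs(tuple(s + dt * d for s, d in zip(state, k3)), *args)
        state = tuple(
            s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


if __name__ == "__main__":
    # Arbitrary initial amounts: (origin, factor, pre_rc, pre_ic).
    start = (1.0, 2.0, 0.0, 0.0)
    with_kinase = simulate(start, t_end=50.0, kinase=1.0)
    without_kinase = simulate(start, t_end=50.0, kinase=0.0)
    print("pre-IC with kinase activity:   ", with_kinase[3])
    print("pre-IC without kinase activity:", without_kinase[3])
```

In this toy scheme, setting `kinase` to zero stalls the system at the pre-RC stage, which mirrors the qualitative dependence of pre-IC formation on kinase activity described in the abstract. A realistic model would need many more species, time-dependent kinase levels, and rate constants fitted to data.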