
Genome informatics. International Conference on Genome Informatics: Latest Articles

Different groups of metabolic genes cluster around early and late firing origins of replication in budding yeast.
Thomas W Spiesser, Edda Klipp

DNA replication is a fundamental process that is tightly regulated during the cell cycle. In budding yeast it starts from multiple origins of replication and proceeds according to a reproducible temporal program until the entire DNA is replicated exactly once per cell cycle. In this program, each origin seems to have an inherent firing probability at a specific time in S-phase that is conserved over the population. However, what exactly determines the origin initiation time remains obscure. In this work, we analyze the gene content that clusters around replication origins, assuming that the inherent origin properties that determine staggered initiation times could be mirrored in the genes located close to each origin. We perform a Gene Ontology term enrichment test and find that metabolic genes are significantly over-represented in the regions close to the starting points of DNA replication. Furthermore, functional analysis reveals that catabolic genes cluster around early firing origins, whereas anabolic genes are instead found near late firing origins of replication. We speculate that, in budding yeast, gene function around replication origins correlates with their intrinsic probability to initiate DNA replication at a given point in S-phase.
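The abstract does not say which statistic underlies the Gene Ontology term enrichment test. A common choice for over-representation is the one-sided hypergeometric test, so the sketch below assumes that form. The function name and all counts in the usage example are hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_enrichment_pvalue(population: int, annotated: int,
                                sample: int, hits: int) -> float:
    """One-sided over-representation p-value, P(X >= hits).

    population: total number of genes in the genome (background)
    annotated:  genes in the background carrying the GO term
    sample:     genes located near replication origins
    hits:       genes in the sample carrying the GO term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie in [0, population]")
    lowest = max(hits, 0)
    highest = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probability mass over the upper tail.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(lowest, highest + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p = hypergeom_enrichment_pvalue(population=6000, annotated=300,
                                    sample=200, hits=25)
    print(f"p-value: {p:.3g}")
```

In practice one such p-value is computed per GO term and the results are corrected for multiple testing; that step is omitted here.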

Pub Date : 2010-01-01 Volume: 24 Pages: 179-92
Citations: 0
Robust gene network analysis reveals alteration of the STAT5a network as a hallmark of prostate cancer.
Pub Date : 2010-01-01 DOI: 10.1142/9781848166585_0012
Anupama Reddy, Conway C. Huang, Huiqing Liu, C. DeLisi, M. Nevalainen, S. Szalma, G. Bhanot
We develop a general method to identify gene networks from pair-wise correlations between genes in a microarray data set and apply it to a public prostate cancer gene expression data set from 69 primary prostate tumors. We define the degree of a node as the number of genes significantly associated with the node and identify hub genes as those with the highest degree. The correlation network was pruned using transcription factor binding information in VisANT (http://visant.bu.edu/) as a biological filter. The reliability of hub genes was determined using a strict permutation test. Separate networks for normal prostate samples and for prostate cancer samples from African Americans (AA) and European Americans (EA) were generated and compared. We found that the same hubs control disease progression in AA and EA networks. Combining AA and EA samples, we generated networks for low (<7) and high (≥7) Gleason grade tumors. A comparison of their major hubs with those of the network for normal samples identified two types of changes associated with disease: (i) Some hub genes increased their degree in the tumor network compared to their degree in the normal network, suggesting that these genes are associated with gain of regulatory control in cancer (e.g. possible turning on of oncogenes). (ii) Some hubs reduced their degree in the tumor network compared to their degree in the normal network, suggesting that these genes are associated with loss of regulatory control in cancer (e.g. possible loss of tumor suppressor genes). A striking result was that for both AA and EA tumor samples, STAT5a, CEBPB and EGR1 are major hubs that gain neighbors compared to the normal prostate network. Conversely, HIF-1α is a major hub that loses connections in the prostate cancer network compared to the normal prostate network. We also find that the degree of these hubs changes progressively from normal to low grade to high grade disease, suggesting that these hubs are master regulators of prostate cancer and mark disease progression. STAT5a was identified as a central hub, with ~120 neighbors in the prostate cancer network and only 81 neighbors in the normal prostate network. Of the 120 neighbors of STAT5a, 57 are known cancer-related genes involved in functional pathways associated with tumorigenesis. Our method is general and can easily be extended to identify and study networks associated with any two phenotypes.
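The degree-based hub comparison described in this abstract can be sketched roughly as follows. The sketch assumes that "significantly associated" is approximated by a fixed cutoff on the absolute Pearson correlation; the abstract does not give the actual significance criterion, and the VisANT pruning and permutation test are omitted. Function names, the cutoff and the gene labels are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def node_degrees(expr, cutoff=0.8):
    """Degree of each gene: number of other genes with |r| >= cutoff.

    expr maps gene name -> list of expression values across samples.
    """
    genes = list(expr)
    degree = {g: 0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= cutoff:
                degree[g] += 1
                degree[h] += 1
    return degree


def degree_change(normal, tumor, cutoff=0.8):
    """Tumor degree minus normal degree for genes present in both data sets."""
    d_normal = node_degrees(normal, cutoff)
    d_tumor = node_degrees(tumor, cutoff)
    return {g: d_tumor[g] - d_normal[g] for g in d_normal if g in d_tumor}
```

A positive entry in `degree_change` corresponds to a hub that gains neighbors in the tumor network, a negative entry to one that loses them.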
Pages: 139-53
Citations: 6
Collocation-based sparse estimation for constructing dynamic gene networks.
Pub Date : 2010-01-01 DOI: 10.1142/9781848166585_0014
Teppei Shimamura, S. Imoto, Masao Nagasaki, Mai Yamauchi, R. Yamaguchi, André Fujita, Y. Tamada, N. Gotoh, S. Miyano
One of the open problems in systems biology is to infer dynamic gene networks describing the underlying biological process with mathematical, statistical and computational methods. First-order difference equation-based models such as dynamic Bayesian networks and vector autoregressive models have been used to infer time-lagged relationships between genes from time-series microarray data. However, two primary problems greatly reduce the effectiveness of current approaches. The first is the tacit assumption that the time lag is stationary. The second is the inseparability of measurement noise and process noise (unmeasured disturbances that propagate through the time process). To address these problems, we propose a stochastic differential equation model for inferring continuous-time dynamic gene networks when both process noise and observation noise are present. We present a collocation-based sparse estimation method for simultaneous parameter estimation and model selection in the model. The collocation-based approach requires considerably less computational effort than traditional methods for ordinary stochastic differential equation models. The proposed method can also readily incorporate various kinds of biological knowledge to refine the estimation accuracy. Results on simulated data and real time-series expression data of human primary small airway epithelial cells demonstrate that the proposed approach outperforms competing approaches and can identify significant genes influenced by gefitinib.
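To illustrate the distinction the abstract draws between process noise and observation noise, the sketch below simulates a continuous-time model with the Euler-Maruyama scheme. It assumes a linear drift, dx = A x dt + σ dW, and Gaussian measurement noise; the abstract does not state the form of the authors' model, and this is not their collocation-based estimator. All names and parameters are hypothetical.

```python
import random
from math import sqrt


def simulate_linear_sde(A, x0, dt, steps, process_sd, obs_sd, seed=0):
    """Euler-Maruyama simulation of dx = A x dt + process_sd dW.

    Returns (latent, observed): the latent state path, which carries the
    process noise, and noisy measurements of it, which add observation noise.
    """
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)
    latent = [list(x)]
    observed = [[v + rng.gauss(0.0, obs_sd) for v in x]]
    for _ in range(steps):
        drift = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Process noise enters the state itself and is carried forward in time.
        x = [x[i] + drift[i] * dt + process_sd * sqrt(dt) * rng.gauss(0.0, 1.0)
             for i in range(n)]
        latent.append(list(x))
        # Observation noise only perturbs what is measured at this time point.
        observed.append([v + rng.gauss(0.0, obs_sd) for v in x])
    return latent, observed
```

Setting `process_sd` to zero leaves a deterministic latent path with noisy measurements, while setting `obs_sd` to zero gives exact measurements of a randomly perturbed path. An estimator that fits only one noise term cannot tell these two cases apart.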
Pages: 164-78
Citations: 1
Efficient and detailed model of the local Ca2+ release unit in the ventricular cardiac myocyte.
Pub Date : 2010-01-01 DOI: 10.1142/9781848165786_0012
T. Schendel, M. Falcke
We present an efficient but detailed approach to modelling Ca2+-induced Ca2+ release in the diadic cleft of cardiac ventricular myocytes. In this framework we developed a spatially resolved Ca2+ release unit (CaRU), consisting of the junctional sarcoplasmic reticulum and the diadic cleft, with a well-defined channel placement. By taking advantage of time scale separation, the model could be reduced to a single ordinary differential equation describing Ca2+ fluxes and diffusion. In addition, channel gating is described stochastically. The resulting model reproduces experimental findings such as the gradedness of SR release, the voltage dependence of ECC gain and the typical spark lifetime. Owing to its numerical efficiency, the model is suitable for whole-cell simulations. The approach we intend to use to extend the CaRU to such a whole-cell model is outlined in this work.
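The combination of a single ordinary differential equation with stochastic channel gating can be illustrated with a toy model. The sketch below is not the authors' CaRU: it assumes independent two-state (closed/open) channels with constant transition rates and a single concentration variable with linear removal. Every rate and parameter name is hypothetical. In a real Ca2+-induced Ca2+ release model the opening rate would depend on the local Ca2+ concentration.

```python
import random


def simulate_release_unit(n_channels, k_open, k_close, flux_per_channel,
                          removal_rate, c_rest, dt, steps, seed=0):
    """Toy release unit: stochastic two-state channels driving one ODE.

    dc/dt = n_open * flux_per_channel - removal_rate * (c - c_rest)
    Each channel flips state with probability rate * dt per time step.
    Returns the concentration trace, including the initial value.
    """
    rng = random.Random(seed)
    is_open = [False] * n_channels
    c = c_rest
    trace = [c]
    for _ in range(steps):
        # Stochastic gating: closed channels may open, open ones may close.
        for i in range(n_channels):
            rate = k_close if is_open[i] else k_open
            if rng.random() < rate * dt:
                is_open[i] = not is_open[i]
        n_open = sum(is_open)
        # Explicit Euler step of the single concentration ODE.
        c += dt * (n_open * flux_per_channel - removal_rate * (c - c_rest))
        trace.append(c)
    return trace
```

The per-step flip probability `rate * dt` is only a reasonable approximation when it is well below one, so `dt` has to be chosen small relative to the fastest gating rate.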
Pages: 142-55
Citations: 4
Active pathway identification and classification with probabilistic ensembles.
Pub Date : 2010-01-01 DOI: 10.1142/9781848165786_0004
Timothy Hancock, Hiroshi Mamitsuka
A popular means of modeling metabolic networks is through identifying frequently observed pathways. However, the definition of what constitutes an observation of a pathway and how to evaluate the importance of identified pathways remain unclear. In this paper we investigate different methods for defining an observed pathway and evaluate their performance with pathway classification models. We use three methods for defining an observed pathway: a path in gene over-expression, a path in probable gene over-expression and a path of most accurate classification. The performance of each definition is evaluated with three classification models: a probabilistic pathway classifier (HME3M), logistic regression and SVM. The results show that defining pathways using the probability of gene over-expression creates stable and accurate classifiers. Conversely, we also show that defining pathways by most accurate classification finds severely biased pathways that are unrepresentative of the underlying microarray data structure.
Pages: 30-40
Citations: 1
Analysis of a lipid biosynthesis protein family and phospholipid structural variations.
Michihiro Tanaka, Yuki Moriya, Susumu Goto, Minoru Kanehisa

Glycerophospholipids are major structural lipids in cellular membrane systems and play key roles as suppliers of the first and second messengers in the signal transduction and molecular recognition processes. The distribution of lipid components differs among organelles and cells. The distribution is controlled by two pathways in lipid metabolism: de novo and remodeling pathways. Glycerophospholipids including arachidonic and stearic acids are mostly produced in the remodeling pathway, whereas lipid chains are reconstructed from those synthesized in the de novo pathway. Recently lysophospholipid acyltransferases have been isolated as key enzymes in the remodeling pathway, and the substrate specificity has been investigated in terms of the chemical substructures of glycerophospholipids, such as the type of head groups and the length of aliphatic chains. These experimental studies have been reported for specific organisms, and only two representative sequence motifs are known for acyltransferases: a general pattern and the pattern for membrane-bound O-acyltransferase (MBOAT). Here we attempt to correlate the sequence patterns and the substrate specificity of lysophospholipid acyltransferases in 89 eukaryotic genomes in order to understand the roles of this enzyme family and underlying glycerophospholipid structural variations. Using phylogenetic and domain analyses, the lysophospholipid acyltransferase family was divided into 18 subtypes. Furthermore, we examined the occurrence of identified subtypes in eukaryotic genomes, and found the expansion of these subtypes in vertebrates. These findings may provide clues to understanding structural variations and distributions of glycerophospholipids in different organisms.

Pub Date : 2010-01-01 Volume: 22 Pages: 191-201
Citations: 0
Collocation-based sparse estimation for constructing dynamic gene networks.
Teppei Shimamura, Seiya Imoto, Masao Nagasaki, Mai Yamauchi, Rui Yamaguchi, André Fujita, Yoshinori Tamada, Noriko Gotoh, Satoru Miyano

One of the open problems in systems biology is to infer dynamic gene networks describing the underlying biological process with mathematical, statistical and computational methods. First-order difference equation-based models such as dynamic Bayesian networks and vector autoregressive models have been used to infer time-lagged relationships between genes from time-series microarray data. However, two primary problems greatly reduce the effectiveness of current approaches. The first is the tacit assumption that the time lag is stationary. The second is the inseparability of measurement noise and process noise (unmeasured disturbances that propagate through the time process). To address these problems, we propose a stochastic differential equation model for inferring continuous-time dynamic gene networks when both process noise and observation noise are present. We present a collocation-based sparse estimation method for simultaneous parameter estimation and model selection in the model. The collocation-based approach requires considerably less computational effort than traditional methods for ordinary stochastic differential equation models. The proposed method can also readily incorporate various kinds of biological knowledge to refine the estimation accuracy. Results on simulated data and real time-series expression data of human primary small airway epithelial cells demonstrate that the proposed approach outperforms competing approaches and can identify significant genes influenced by gefitinib.

Pub Date : 2010-01-01 Volume: 24 Pages: 164-78
Citations: 0
A dynamic programming algorithm to predict synthesis processes of tree-structured compounds with graph grammar.
Yang Zhao, Takeyuki Tamura, Morihiro Hayashida, Tatsuya Akutsu

For several decades, many methods have been developed for predicting organic synthesis paths. However, these methods require non-polynomial computation time. In this paper, we propose a bottom-up dynamic programming algorithm to predict synthesis paths of target tree-structured compounds. In this approach, we transform the synthesis problem for tree-structured compounds into the generation problem for unordered trees by regarding tree-structured compounds and chemical reactions as unordered trees and rules, respectively. To represent the rules corresponding to chemical reactions, we employ a subclass of NLC (Node Label Controlled) grammars. We also give some computational results for this algorithm.

Genome informatics. International Conference on Genome Informatics, vol. 24, pp. 218-29. Pub Date: 2010-01-01
Citations: 0
Structural features and evolution of protein-protein interactions.
Pub Date : 2010-01-01 DOI: 10.1142/9781848165786_0001
J. von Eichborn, S. Günther, R. Preissner
Solved structures of protein-protein complexes give fundamental insights into protein function and molecular recognition. Although determining the structure of a protein-protein complex is generally more difficult than solving an individual protein, the number of experimentally determined complexes increased conspicuously during the last decade. Here, the interfaces of 750 transient protein-protein interactions as well as 2,000 interactions between domains of the same protein chain (obligate interactions) were analyzed to obtain a better understanding of molecular recognition and to identify features applicable to protein binding site prediction. Calculation of knowledge-based potentials showed a preference for contacts between amino acids with complementary physicochemical properties. The analysis of amino acid conservation over the entire interface area showed a weak but significant tendency toward higher evolutionary conservation of protein binding sites compared to surface areas that are permanently exposed to solvent. Remarkably, contact frequencies between outstandingly conserved residues are much higher than expected, confirming the so-called "hot spot" theory. The comparisons between obligate and transient domain contacts reveal differences and indicate that structural diversification and molecular recognition of protein-protein interactions are subject to other evolutionary aspects than obligate domain-domain interactions.
Genome informatics. International Conference on Genome Informatics, vol. 55 1, pp. 1-10. Pub Date: 2010-01-01
Citations: 22
Kinetic modelling of DNA replication initiation in budding yeast.
Matteo Barberis, Thomas W Spiesser, Edda Klipp

DNA replication is restricted to a specific time window of the cell cycle, called S phase. Successful progression through S phase requires replication to be properly regulated to ensure that the entire genome is duplicated exactly once, without errors, in a timely fashion. As a result, DNA replication has evolved into a tightly regulated process involving the coordinated action of numerous factors that function in all phases of the cell cycle. Biochemical mechanisms driving the eukaryotic cell division cycle have been the subject of a number of mathematical models. However, the cell cycle networks reported in the literature so far have not addressed the steps of DNA replication events. In particular, the assembly of the replication machinery is crucial for the timing of S phase. This event, called "initiation", occurs in late M / early G1 of the cell cycle and starts with the assembly of the pre-replicative complex (pre-RC) at the origins of replication on the DNA. Its activation depends on the availability of different kinase complexes, cyclin-dependent kinases (CDKs) and Dbf4-dependent kinase (DDK), which phosphorylate specific components of the pre-RC to convert it into the pre-initiation complex (pre-IC). We have developed an ODE-based model of the network responsible for this process in budding yeast using mass-action kinetics. We considered all steps from the assembly of the first components at the DNA replication origin up to the active replisome that recruits the polymerases, and verified the computational dynamics against the available literature data. Our results highlight the link between activation of CDK and DDK and the step-by-step formation of both pre-RC and pre-IC, suggesting that S-CDK (Cdk1-Clb5,6) is the main regulator of the process.

Genome informatics. International Conference on Genome Informatics, vol. 24, pp. 1-20. Pub Date: 2010-01-01
Citations: 0