Genome informatics. International Conference on Genome Informatics: latest articles

A novel meta-analysis approach of cancer transcriptomes reveals prevailing transcriptional networks in cancer cells.
Pub Date : 2010-01-01 DOI: 10.1142/9781848165786_0010
A. Niida, S. Imoto, Masao Nagasaki, R. Yamaguchi, S. Miyano
Although microarray technology has revealed the transcriptomic diversity underlying various cancer phenotypes, the transcriptional programs controlling them have not been well elucidated. To decode the transcriptional programs governing cancer transcriptomes, we have recently developed a computational method termed EEM, which searches for expression modules in prescribed gene sets defined by prior biological knowledge such as TF binding motifs. In this paper, we extend our EEM approach to predict cancer transcriptional networks. Starting from functional TF binding motifs and expression modules identified by EEM, we predict cancer transcriptional networks containing regulatory TFs, associated GO terms, and interactions between TF binding motifs. To systematically analyze transcriptional programs in a broad range of cancer types, we applied our EEM-based network prediction method to 122 microarray datasets collected from public databases. The datasets contain about 15,000 experiments on tumor samples of various tissue origins, including breast, colon, and lung. This EEM-based meta-analysis successfully revealed a prevailing cancer transcriptional network that functions in a large fraction of cancer transcriptomes; it includes cell-cycle and immune-related sub-networks. This study demonstrates the broad applicability of EEM and opens a way to a comprehensive understanding of transcriptional networks in cancer cells.
Source: Genome informatics. International Conference on Genome Informatics, 2010, pp. 121-31.
Citations: 8
Different groups of metabolic genes cluster around early and late firing origins of replication in budding yeast.
Pub Date : 2010-01-01 DOI: 10.1142/9781848166585_0015
T. Spiesser, E. Klipp
DNA replication is a fundamental process that is tightly regulated during the cell cycle. In budding yeast it starts from multiple origins of replication and proceeds in a timely fashion according to a reproducible temporal program until the entire DNA is replicated exactly once per cell cycle. In this program, an origin seems to have an inherent firing probability at a specific time in S-phase that is conserved over the population. However, what exactly determines the origin initiation time remains obscure. In this work, we analyze the gene content that clusters around replication origins, under the assumption that inherent origin properties that determine staggered initiation times could be mirrored in the close proximity of the origin. We perform a Gene Ontology term enrichment test and find that metabolic genes are significantly over-represented in the regions close to the starting points of DNA replication. Furthermore, functional analysis reveals that catabolic genes cluster around early firing origins, whereas anabolic genes are instead found in the proximity of late firing origins of replication. We speculate that, in budding yeast, gene function around replication origins correlates with their intrinsic probability to initiate DNA replication at a given point in S-phase.
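The abstract does not say which statistic the Gene Ontology term enrichment test uses. A common choice for this kind of question is the one-sided hypergeometric test, and the sketch below assumes that choice. The function name and all gene counts are hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population: number of genes in the genome (background)
    annotated:  background genes carrying the GO term of interest
    sample:     genes found near replication origins
    hits:       genes in the sample that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass of every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p_value = hypergeom_enrichment_p(population=6000, annotated=900, sample=200, hits=50)
print(f"enrichment p-value: {p_value:.3g}")
```

When many GO terms are tested at once, the resulting p-values would normally need a multiple-testing correction; that step is left out of this sketch.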
Source: Genome informatics. International Conference on Genome Informatics, 2010, pp. 179-92.
Citations: 2
Analysis and prediction of nutritional requirements using structural properties of metabolic networks and support vector machines.
Pub Date : 2010-01-01 DOI: 10.1142/9781848165786_0015
Takeyuki Tamura, Nils Christian, Kazuhiro Takemoto, O. Ebenhöh, T. Akutsu
Properties of graph representations of genome-scale metabolic networks have been extensively studied. However, the relationship between these structural properties and the functional properties of the networks is still very unclear. In this paper, we focus on the nutritional requirements of organisms as a functional property and study their relationship with structural properties of a graph representation of metabolic networks. To examine this relationship, we study to what extent nutritional requirements can be predicted by support vector machines from structural properties, which include degree exponent, edge density, clustering coefficient, degree centrality, closeness centrality, betweenness centrality and eigenvector centrality. Furthermore, we study which properties are influential on the nutritional requirements.
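Three of the structural properties listed above (edge density, clustering coefficient, degree centrality) can be computed directly from an edge list. The following is a minimal sketch that assumes an undirected graph without self-loops; the function names and the toy graph are hypothetical, and the resulting feature vector only illustrates the kind of input a support vector machine could be trained on.

```python
from itertools import combinations


def build_adjacency(edges):
    """Map each node to the set of its neighbours (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_density(adj):
    """Fraction of all possible node pairs that are connected."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree."""
    n = len(adj)
    if n < 2:
        return {v: 0.0 for v in adj}
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def clustering_coefficient(adj, v):
    """Fraction of neighbour pairs of v that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def feature_vector(edges):
    """Edge density, mean clustering coefficient, maximum degree centrality."""
    adj = build_adjacency(edges)
    cc = [clustering_coefficient(adj, v) for v in adj]
    dc = degree_centrality(adj)
    return [edge_density(adj), sum(cc) / len(cc), max(dc.values())]


# Toy graph: a triangle A-B-C with a pendant node D attached to C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
features = feature_vector(toy_edges)
print(features)
```

One such vector per organism, paired with a label for a nutritional requirement, would form a training set; the remaining properties named in the abstract (degree exponent, closeness, betweenness and eigenvector centrality) are omitted here for brevity.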
Source: Genome informatics. International Conference on Genome Informatics, 2010, pp. 176-90.
Citations: 1
Analysis of a lipid biosynthesis protein family and phospholipid structural variations.
Pub Date : 2010-01-01 DOI: 10.1142/9781848165786_0016
Michihiro Tanaka, Yuki Moriya, S. Goto, M. Kanehisa
Glycerophospholipids are major structural lipids in cellular membrane systems and play key roles as suppliers of the first and second messengers in signal transduction and molecular recognition processes. The distribution of lipid components differs among organelles and cells. The distribution is controlled by two pathways in lipid metabolism: the de novo and remodeling pathways. Glycerophospholipids including arachidonic and stearic acids are mostly produced in the remodeling pathway, where lipid chains are reconstructed from those synthesized in the de novo pathway. Recently, lysophospholipid acyltransferases have been isolated as key enzymes in the remodeling pathway, and their substrate specificity has been investigated in terms of the chemical substructures of glycerophospholipids, such as the type of head group and the length of aliphatic chains. These experimental studies have been reported for specific organisms, and only two representative sequence motifs are known for acyltransferases: a general pattern and the pattern for membrane-bound O-acyltransferase (MBOAT). Here we attempt to correlate the sequence patterns and the substrate specificity of lysophospholipid acyltransferases in 89 eukaryotic genomes in order to understand the roles of this enzyme family and the underlying glycerophospholipid structural variations. Using phylogenetic and domain analyses, the lysophospholipid acyltransferase family was divided into 18 subtypes. Furthermore, we examined the occurrence of the identified subtypes in eukaryotic genomes and found an expansion of these subtypes in vertebrates. These findings may provide clues to understanding structural variations and distributions of glycerophospholipids in different organisms.
Source: Genome informatics. International Conference on Genome Informatics, 2010, pp. 191-201.
Citations: 3
Characterizing common substructures of ligands for GPCR protein subfamilies.
Pub Date : 2010-01-01 DOI: 10.1142/9781848166585_0003
Bekir Erguner, M. Hattori, S. Goto, M. Kanehisa
The G-protein coupled receptor (GPCR) superfamily is the largest class of proteins with therapeutic value. More than 40% of present prescription drugs are GPCR ligands. The high therapeutic value of GPCR proteins and recent advancements in virtual screening methods have given rise to many virtual screening studies for GPCR ligands. However, in spite of the vast amount of research on their functions and characteristics, the 3D structures of most GPCRs are still unknown. This makes target-based virtual screening of GPCR ligands extremely difficult, and successful virtual screening techniques rely heavily on ligand information. These virtual screening methods focus on specific features of ligands at the level of individual GPCR proteins, and common features of ligands at higher levels of GPCR classification are yet to be studied. Here we extracted common substructures of the ligands of GPCR protein subfamilies. We used SIMCOMP, a graph-based chemical structure comparison program, and hierarchical clustering to reveal common substructures. We applied our method to 850 GPCR ligands and found 53 common substructures covering 439 ligands. These substructures contribute to a deeper understanding of the structural features of GPCR ligands, which can be used in new drug discovery methods.
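The general idea of clustering ligands by pairwise similarity and then reading off what each cluster shares can be sketched as follows. This is not SIMCOMP: as a stand-in for graph-based structure comparison, the sketch assumes each ligand is described by a set of fragment labels and compares sets with the Tanimoto coefficient. The ligand names, fragment labels, linkage rule and threshold are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two fragment sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def single_linkage(features, threshold=0.5):
    """Agglomerative clustering; stop when no cluster pair reaches `threshold`."""
    clusters = [[name] for name in features]

    def similarity(c1, c2):
        # Single linkage: similarity of the closest pair of members.
        return max(tanimoto(features[a], features[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        best, (i, j) = max(
            (similarity(clusters[i], clusters[j]), (i, j))
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def common_fragments(features, cluster):
    """Fragments shared by every ligand in a cluster."""
    return set.intersection(*(features[name] for name in cluster))


# Hypothetical ligands described by hypothetical fragment labels.
ligands = {
    "L1": {"indole", "amine", "ethyl"},
    "L2": {"indole", "amine", "ethyl", "methoxy"},
    "L3": {"catechol", "amine", "hydroxyl"},
    "L4": {"catechol", "amine", "hydroxyl", "methyl"},
}
for cluster in single_linkage(ligands):
    print(sorted(cluster), sorted(common_fragments(ligands, cluster)))
```

A real pipeline would replace the fragment sets with a maximum-common-subgraph comparison on the molecular graphs, which is what allows an actual substructure to be reported for each cluster.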
Source: Genome informatics. International Conference on Genome Informatics, 2010, pp. 31-41.
Citations: 2
Integer programming-based method for completing signaling pathways and its application to analysis of colorectal cancer.
Takeyuki Tamura, Yoshihiro Yamanishi, Mao Tanabe, Susumu Goto, Minoru Kanehisa, Katsuhisa Horimoto, Tatsuya Akutsu

Signaling pathways are often represented by networks in which each node corresponds to a protein and each edge corresponds to a relationship between nodes, such as activation, inhibition or binding. However, such signaling pathways in a cell may be affected by genetic and epigenetic alterations: some edges may be deleted and some edges may be newly added. Current knowledge about known signaling pathways is available in some public databases, but most signaling pathways, including the changes that accompany alterations of the cell state, remain largely unknown. In this paper, we develop an integer programming-based method for inferring such changes by using gene expression data. We test our method on its ability to reconstruct the pathway of colorectal cancer in the KEGG database.

Source: Genome informatics. International Conference on Genome Informatics, vol. 24, 2010, pp. 193-203.
Citations: 0
Gene regulatory network clustering for graph layout based on microarray gene expression data.
Kaname Kojima, Seiya Imoto, Masao Nagasaki, Satoru Miyano

We propose a statistical model that simultaneously estimates a gene regulatory network and identifies gene modules from time series gene expression data from microarray experiments. Under the assumption that genes in the same module are densely connected, the proposed method detects gene modules based on the variational Bayesian technique. The model can also incorporate existing biological prior knowledge such as protein subcellular localization. We apply the proposed model to time series data from a synthetically generated network and verify its effectiveness. The proposed model is also applied to time series microarray data from HeLa cells. The detected gene module information is of great help in drawing the estimated gene network.

Source: Genome informatics. International Conference on Genome Informatics, vol. 24, 2010, pp. 84-95.
Citations: 0
A systems biology approach: modelling of Aquaporin-2 trafficking.
Pub Date : 2010-01-01 DOI: 10.1142/9781848166585_0004
M. Fröhlich, P. Deen, E. Klipp
In healthy individuals, dehydration of the body leads to release of the hormone vasopressin from the pituitary. Via the bloodstream, vasopressin reaches the collecting duct cells in the kidney, where the water channel Aquaporin-2 (AQP2) is expressed. After stimulation of the vasopressin V2 receptor by vasopressin, intracellular AQP2-containing vesicles fuse with the apical plasma membrane of the collecting duct cells. This leads to increased water reabsorption from the pro-urine into the blood and therefore to enhanced retention of water within the body. Using existing biological data, we propose a mathematical model of AQP2 trafficking and regulation in collecting duct cells. Our model includes the vasopressin receptor, adenylate cyclase, protein kinase A, and intracellular as well as membrane-located AQP2. To model the chemical reactions, we used ordinary differential equations (ODEs) based on mass action kinetics. We employ known protein concentrations and time series data to estimate the kinetic parameters of our model and demonstrate its validity. By generating, testing and ranking different versions of the model, we show that some model versions describe the data well as soon as important regulatory parts are included, such as the reduction of the signal by internalization of the vasopressin receptor or the negative feedback loop representing phosphodiesterase activity. We perform time-dependent sensitivity analysis to identify the reactions that have the greatest influence on the cAMP and membrane-located AQP2 levels over time. We predict the time courses of membrane-located AQP2 at different vasopressin concentrations, compare them with newly generated data and discuss the competencies of the model.
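To illustrate what an ODE model based on mass action kinetics looks like in practice, the sketch below simulates a deliberately reduced two-pool system: AQP2 moves from an intracellular pool to the membrane at a rate proportional to a constant signal and returns at a constant endocytosis rate. This is not the authors' model, which also includes the receptor, adenylate cyclase and protein kinase A. The function name, the rate constants and the explicit Euler integrator are assumptions made for illustration.

```python
def simulate_aqp2(signal, k_exo=0.5, k_endo=0.1, total=1.0, dt=0.01, t_end=100.0):
    """Integrate a two-pool mass-action model with explicit Euler steps.

    d(mem)/dt = k_exo * signal * int - k_endo * mem
    d(int)/dt = -d(mem)/dt
    All parameter values are hypothetical.
    """
    aqp2_int, aqp2_mem = total, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Net flux from the intracellular pool to the membrane pool.
        flux = k_exo * signal * aqp2_int - k_endo * aqp2_mem
        aqp2_int -= flux * dt
        aqp2_mem += flux * dt
    return aqp2_int, aqp2_mem


# Membrane fraction at the end of the run for a few hypothetical signal strengths.
for s in (0.0, 0.2, 1.0):
    _, membrane = simulate_aqp2(signal=s)
    print(f"signal={s:.1f}  membrane AQP2={membrane:.3f}")
```

In this reduced system the steady-state membrane fraction is `k_exo * signal / (k_exo * signal + k_endo)`, which gives a quick sanity check on the integrator. For stiff or larger models, an adaptive solver would be a better choice than fixed-step Euler.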
Source: Genome informatics. International Conference on Genome Informatics, 2010, pp. 42-55.
Citations: 4
Analyzing gene coexpression data by an evolutionary model.
Pub Date : 2010-01-01 DOI: 10.1142/9781848166585_0013
M. Schütte, M. Mutwil, S. Persson, O. Ebenhöh
Coexpressed genes are tentatively translated into proteins that are involved in similar biological functions. Here, we constructed gene coexpression networks from collected microarray data of the organisms Arabidopsis thaliana, Saccharomyces cerevisiae, and Escherichia coli. Their degree distributions show the common property of an overrepresentation of highly connected nodes followed by a sudden truncation. In order to analyze this behavior, we present an evolutionary model simulating the genetic evolution. This model assumes that new genes emerge by duplication from a small initial set of primordial genes. Our model does not include the removal of unused genes but selective pressure is indirectly taken into account by preferentially duplicating the old genes. Thus, gene duplication represents the emergence of a new gene and its successful establishment. After a duplication event, all genes are slightly but iteratively mutated, thus altering their expression patterns. Our model is capable of reproducing global properties of the investigated coexpression networks. We show that our model reflects the mean inter-node distances and especially the characteristic humps in the degree distribution that, in the biological examples, result from functionally related genes.
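The duplication-and-mutation process described in this abstract can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: expression patterns are represented as plain numeric vectors, mutation is additive Gaussian noise, older genes are duplicated with probability proportional to their age, and two genes count as coexpressed when their Pearson correlation reaches a fixed threshold. The function names (`evolve`, `coexpression_degrees`) and all parameter values are hypothetical.

```python
import random
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def evolve(n_genes=60, n_primordial=3, n_conditions=20, sigma=0.05, seed=0):
    """Grow a gene set by duplication, then mutate every gene after each event.

    Assumed details (not from the paper): Gaussian expression vectors,
    age-proportional choice of the gene to duplicate, Gaussian mutation noise.
    """
    rng = random.Random(seed)
    genes = [[rng.gauss(0, 1) for _ in range(n_conditions)]
             for _ in range(n_primordial)]
    birth = [0] * n_primordial  # step at which each gene appeared
    step = 0
    while len(genes) < n_genes:
        step += 1
        # Older genes get larger weights, so they are duplicated preferentially.
        weights = [step - b for b in birth]
        parent = rng.choices(range(len(genes)), weights=weights)[0]
        genes.append(list(genes[parent]))  # the duplicate starts as an exact copy
        birth.append(step)
        # After the duplication event, all genes are mutated slightly.
        for g in genes:
            for i in range(n_conditions):
                g[i] += rng.gauss(0, sigma)
    return genes


def coexpression_degrees(genes, threshold=0.8):
    """Degree of each gene in a thresholded correlation (coexpression) network."""
    n = len(genes)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if pearson(genes[i], genes[j]) >= threshold:
                degree[i] += 1
                degree[j] += 1
    return degree


if __name__ == "__main__":
    degree = coexpression_degrees(evolve())
    # Print the degree distribution as "degree count" pairs.
    for k, count in sorted(Counter(degree).items()):
        print(k, count)
```

Varying `sigma` and `threshold` changes how quickly duplicates drift apart and therefore how dense the resulting network is; whether this toy version reproduces the degree-distribution features reported in the abstract is not something the sketch claims.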
Pages: 154-63
Citations: 0
Active pathway identification and classification with probabilistic ensembles.
Timothy Hancock, Hiroshi Mamitsuka

A popular means of modeling metabolic networks is to identify frequently observed pathways. However, what constitutes an observation of a pathway, and how to evaluate the importance of identified pathways, remains unclear. In this paper we investigate different methods for defining an observed pathway and evaluate their performance with pathway classification models. We use three definitions of an observed pathway: a path in gene over-expression, a path in probable gene over-expression, and a path of most accurate classification. The performance of each definition is evaluated with three classification models: a probabilistic pathway classifier (HME3M), logistic regression, and SVM. The results show that defining pathways using the probability of gene over-expression creates stable and accurate classifiers. Conversely, we also show that defining pathways by most accurate classification finds severely biased pathways that are unrepresentative of the underlying microarray data structure.

Pub Date : 2010-01-01 Volume: 22 Pages: 30-40
Citations: 0