
EURASIP Journal on Bioinformatics & Systems Biology: Latest Articles

A Bayesian analysis for identifying DNA copy number variations using a compound Poisson process.
Pub Date : 2010-01-01 Epub Date: 2010-09-27 DOI: 10.1155/2010/268513
Jie Chen, Ayten Yiğiter, Yu-Ping Wang, Hong-Wen Deng

To study chromosomal aberrations that may lead to cancer formation or genetic diseases, the array-based Comparative Genomic Hybridization (aCGH) technique is often used for detecting DNA copy number variants (CNVs). Various methods have been developed for extracting CNV information from aCGH data. However, most of these methods use the log-intensity ratios in aCGH data without taking advantage of other information such as the DNA probe (e.g., biomarker) positions/distances contained in the data. Motivated by the specific features of aCGH data, we developed a novel method that estimates a change point, or locus, of a CNV in aCGH data together with its associated biomarker position on the chromosome using a compound Poisson process. We used a Bayesian approach to derive the posterior probability for the estimation of the CNV locus. To detect the loci of multiple CNVs in the data, we proposed a sliding window process combined with our derived Bayesian posterior probability. To evaluate the performance of the method in estimating the CNV locus, we first performed simulation studies. Finally, we applied our approach to real data from aCGH experiments, demonstrating its applicability.
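
The idea of a posterior probability over a single change-point locus can be illustrated with a much simpler model than the one in the abstract above. The sketch below is an assumption-laden stand-in, not the authors' compound Poisson method: it assumes Gaussian log-ratios with a known noise level `sigma`, a uniform prior on the locus, and plug-in segment means, and it ignores probe positions entirely. The function name `changepoint_posterior` and the toy data are hypothetical.

```python
import numpy as np

def changepoint_posterior(y, sigma=1.0):
    """Posterior over the size k of the left segment for one mean shift.

    Simplifying assumptions (not the paper's model): Gaussian noise with
    known sigma, uniform prior on k, segment means replaced by sample means.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    log_post = np.full(n, -np.inf)        # k = 0 is excluded (empty left segment)
    for k in range(1, n):                 # both segments must be non-empty
        left, right = y[:k], y[k:]
        rss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        log_post[k] = -rss / (2.0 * sigma ** 2)
    log_post -= log_post[1:].max()        # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Toy log-ratio profile: 30 probes near 0, then 20 probes near 1.
y = np.concatenate([np.zeros(30), np.ones(20)]) + 0.05 * np.sin(np.arange(50))
post = changepoint_posterior(y, sigma=0.2)
k_hat = int(post.argmax())                # most probable left-segment size
```

In a sliding-window setting, one would apply such a function to each window in turn and flag windows whose maximum posterior probability is high; how the windows are sized and combined is not specified here.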

Citations: 3
TRII: A Probabilistic Scoring of Drosophila melanogaster Translation Initiation Sites.
Pub Date : 2010-01-01 DOI: 10.1155/2010/814127
Michael P Weir, Michael D Rice

Relative individual information is a measurement that scores the quality of DNA- and RNA-binding sites for biological machines. The development of analytical approaches to increase the power of this scoring method will improve its utility in evaluating the functions of motifs. In this study, the scoring method was applied to potential translation initiation sites in Drosophila to compute Translation Relative Individual Information (TRII) scores. The weight matrix at the core of the scoring method was optimized based on high-confidence translation initiation sites identified by using a progressive partitioning approach. Comparing the distributions of TRII scores for sites of interest with those for high-confidence translation initiation sites and random sequences provides a new methodology for assessing the quality of translation initiation sites. The optimized weight matrices can also be used to describe the consensus at translation initiation sites, providing a quantitative measure of preferred and avoided nucleotides at each position.
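
A weight-matrix score of the kind described above can be sketched as a sum of per-position log-odds terms. The following is a generic illustration, assuming a uniform background, a pseudocount of 1, and a hypothetical toy alignment; it does not reproduce the optimized TRII matrix or the progressive partitioning step, and the function names are illustrative only.

```python
import math

BASES = "ACGT"

def build_weight_matrix(sites, background=None, pseudocount=1.0):
    """Per-position weights log2(f(b, l) / p(b)) from aligned, equal-length sites."""
    background = background or {b: 0.25 for b in BASES}
    matrix = []
    for pos in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1.0
        total = sum(counts.values())
        matrix.append({b: math.log2(counts[b] / total / background[b]) for b in BASES})
    return matrix

def score_site(matrix, seq):
    """Sum of position weights in bits; higher means closer to the training sites."""
    return sum(weights[base] for weights, base in zip(matrix, seq))

# Hypothetical toy alignment around a start codon (illustrative sequences only).
training = ["CAAAATGG", "CAACATGG", "AAAAATGC", "CACAATGG"]
matrix = build_weight_matrix(training)
good = score_site(matrix, "CAAAATGG")     # resembles the training sites
poor = score_site(matrix, "GTTTCCCT")     # uses only unseen bases
```

Comparing the distribution of such scores for candidate sites against the distributions for trusted sites and for random sequences is the assessment step the abstract describes.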

Citations: 6
A hypothesis test for equality of Bayesian network models.
Pub Date : 2010-01-01 Epub Date: 2010-10-12 DOI: 10.1155/2010/947564
Anthony Almudevar

Bayesian network models are commonly used to model gene expression data. Some applications require a comparison of the network structure of a set of genes between varying phenotypes. In principle, separately fit models can be directly compared, but it is difficult to assign statistical significance to any observed differences. There would therefore be an advantage to the development of a rigorous hypothesis test for homogeneity of network structure. In this paper, a generalized likelihood ratio test based on Bayesian network models is developed, with significance level estimated using permutation replications. In order to be computationally feasible, a number of algorithms are introduced. First, a method for approximating multivariate distributions due to Chow and Liu (1968) is adapted, permitting the polynomial-time calculation of a maximum likelihood Bayesian network with maximum indegree of one. Second, sequential testing principles are applied to the permutation test, allowing significant reduction of computation time while preserving reported error rates used in multiple testing. The method is applied to gene-set analysis, using two sets of experimental data, and some advantage to a pathway modelling approach to this problem is reported.
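
The permutation step mentioned above can be sketched independently of the network model. In the example below the test statistic is a deliberately simple stand-in (the absolute difference in one gene-gene correlation between two phenotype groups), not the generalized likelihood ratio from the paper; the Chow-Liu fitting and the sequential stopping rule are omitted. All names and the toy data are hypothetical.

```python
import numpy as np

def permutation_pvalue(x_a, x_b, statistic, n_perm=999, seed=0):
    """Permutation p-value for H0: both groups share the same distribution."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x_a, x_b])
    n_a = len(x_a)
    observed = statistic(x_a, x_b)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))            # shuffle phenotype labels
        if statistic(pooled[idx[:n_a]], pooled[idx[n_a:]]) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)                # add-one keeps p > 0

def corr_difference(x_a, x_b):
    """Stand-in statistic: |corr(gene0, gene1) in A - corr(gene0, gene1) in B|."""
    r_a = np.corrcoef(x_a, rowvar=False)[0, 1]
    r_b = np.corrcoef(x_b, rowvar=False)[0, 1]
    return abs(r_a - r_b)

# Toy data: rows are samples, columns are two genes.
x = np.linspace(-1.0, 1.0, 40)
group_a = np.column_stack([x, x + 0.1 * np.sin(7 * x)])    # positively coupled
group_b = np.column_stack([x, -x + 0.1 * np.cos(5 * x)])   # negatively coupled
p_value = permutation_pvalue(group_a, group_b, corr_difference)
```

Swapping `corr_difference` for a likelihood-ratio statistic computed from fitted network models would bring the sketch closer to the procedure described, at a higher cost per permutation.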

Citations: 9
How to improve postgenomic knowledge discovery using imputation.
Pub Date : 2009-01-01 DOI: 10.1155/2009/717136
Muhammad Shoaib B Sehgal, Iqbal Gondal, Laurence S Dooley, Ross Coppel

While microarrays make it feasible to rapidly investigate many complex biological problems, their multistep fabrication is prone to error at every stage. The standard tactic has been either to ignore erroneous gene readings or to treat them as missing values, though this choice can exert a major influence upon postgenomic knowledge discovery methods like gene selection and gene regulatory network (GRN) reconstruction. This has been the catalyst for a raft of new flexible imputation algorithms, including local least square impute and the recent heuristic collateral missing value imputation, which exploit the biological transactional behaviour of functionally correlated genes to afford accurate missing value estimation. This paper examines the influence of missing value imputation techniques upon postgenomic knowledge inference methods. Results for various algorithms consistently corroborate that, instead of ignoring missing values, recycling microarray data by flexible and robust imputation can provide substantial performance benefits for subsequent downstream procedures.

Citations: 9
Functional classification of genome-scale metabolic networks.
Pub Date : 2009-01-01 Epub Date: 2009-03-17 DOI: 10.1155/2009/570456
Oliver Ebenhöh, Thomas Handorf

We propose two strategies to characterize organisms with respect to their metabolic capabilities. The first, investigative, strategy describes metabolic networks in terms of their capability to utilize different carbon sources, resulting in the concept of carbon utilization spectra. In the second, predictive, approach, minimal nutrient combinations are predicted from the structure of the metabolic networks, resulting in a characteristic nutrient profile. Both strategies allow for a quantification of functional properties of metabolic networks, making it possible to identify groups of organisms with similar functions. We investigate whether the functional description reflects the typical environments of the corresponding organisms by dividing all species into disjoint groups based on whether they are aerotolerant and/or photosynthetic. Despite differences in the underlying concepts, both measures display some common features. Closely related organisms often display similar functional behavior, and in both cases the functional measures appear to correlate with the considered classes of environments. Carbon utilization spectra and nutrient profiles are complementary approaches toward a functional classification of organism-wide metabolic networks. The two approaches contain different information and thus yield different clusterings, both of which differ from the classical taxonomy of organisms. Our results indicate that a sophisticated combination of our approaches will allow for a quantitative description reflecting the lifestyles of organisms.

Citations: 15
Spectral preprocessing for clustering time-series gene expressions.
Pub Date : 2009-01-01 Epub Date: 2009-04-08 DOI: 10.1155/2009/713248
Wentao Zhao, Erchin Serpedin, Edward R Dougherty

Based on gene expression profiles, genes can be partitioned into clusters, which might be associated with biological processes or functions, for example, cell cycle, circadian rhythm, and so forth. This paper proposes a novel clustering preprocessing strategy which combines clustering with spectral estimation techniques so that the time information present in time series gene expressions is fully exploited. By comparing the clustering results with a set of biologically annotated yeast cell-cycle genes, the proposed clustering strategy is corroborated to yield significantly different clusters from those created by the traditional expression-based schemes. The proposed technique is especially helpful in grouping genes participating in time-regulated processes.
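
One way to see why spectral features help with time-regulated genes is that a periodogram discards phase: two genes cycling at the same period but peaking at different times get the same feature vector. The sketch below uses a plain normalised periodogram as an assumed stand-in for whatever spectral estimator the paper employs; the function name, the uniform sampling interval `dt`, and the toy profiles are all hypothetical.

```python
import numpy as np

def spectral_features(expr, dt=1.0):
    """Normalised periodogram per gene (rows = genes, columns = time points)."""
    expr = np.asarray(expr, dtype=float)
    centered = expr - expr.mean(axis=1, keepdims=True)
    power = np.abs(np.fft.rfft(centered, axis=1)) ** 2
    power = power[:, 1:]                               # drop the zero-frequency bin
    freqs = np.fft.rfftfreq(expr.shape[1], d=dt)[1:]
    total = power.sum(axis=1, keepdims=True)
    total[total == 0.0] = 1.0                          # guard against flat profiles
    return freqs, power / total

# Toy profiles over 24 uniformly spaced time points.
t = np.arange(24)
genes = np.vstack([
    np.sin(2 * np.pi * t / 12),          # period 12
    np.sin(2 * np.pi * t / 12 + 1.0),    # period 12, shifted phase
    np.sin(2 * np.pi * t / 6),           # period 6
])
freqs, feats = spectral_features(genes)
dominant = freqs[feats.argmax(axis=1)]   # dominant frequency per gene
```

The rows of `feats` could then be passed to any standard clustering routine in place of, or alongside, the raw expression values.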

Citations: 9
Assessing the exceptionality of coloured motifs in networks.
Pub Date : 2009-01-01 Epub Date: 2009-01-26 DOI: 10.1155/2009/616234
Sophie Schbath, Vincent Lacroix, Marie-France Sagot

Various methods have recently been employed to characterise the structure of biological networks. In particular, the concept of network motif and the related one of coloured motif have proven useful to model the notion of a functional/evolutionary building block. However, algorithms that enumerate all the motifs of a network may produce a very large output, and methods to decide which motifs should be selected for downstream analysis are needed. A widely used method is to assess whether the motif is exceptional, that is, over- or under-represented with respect to a null hypothesis. Much effort has been put in over the last thirty years to derive P-values for the frequencies of topological motifs, that is, fixed subgraphs. These rely either on (compound) Poisson and Gaussian approximations for the motif count distribution in Erdős-Rényi random graphs or on simulations in other models. We focus on a different definition of graph motifs that corresponds to coloured motifs. A coloured motif is a connected subgraph with fixed vertex colours but unspecified topology. Our work is the first analytical attempt to assess the exceptionality of coloured motifs in networks without any simulation. We first establish analytical formulae for the mean and the variance of the count of a coloured motif in an Erdős-Rényi random graph model. Using simulations under this model, we further show that a Pólya-Aeppli distribution better approximates the distribution of the motif count compared to Gaussian or Poisson distributions. The Pólya-Aeppli distribution, and more generally the compound Poisson distributions, are indeed well designed to model counts of clumping events. Altogether, these results make it possible to derive a P-value for a coloured motif without spending time on simulations.
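
Given a mean and a variance for a motif count, a Pólya-Aeppli (geometric Poisson) approximation can be fitted by matching moments and then used to compute a tail probability. The sketch below assumes the standard parameterisation in which the number of clumps is Poisson with rate `lam` and clump sizes are geometric on 1, 2, ... with parameter `theta`, so that the mean is lam/(1 - theta) and the variance is lam(1 + theta)/(1 - theta)^2. The paper's analytical formulae for the motif mean and variance are not reproduced; the numbers used here are made up.

```python
import math

def fit_polya_aeppli(mean, variance):
    """Moment-matched (lam, theta); requires variance >= mean > 0."""
    if mean <= 0 or variance < mean:
        raise ValueError("need variance >= mean > 0")
    ratio = variance / mean                    # equals (1 + theta) / (1 - theta)
    theta = (ratio - 1.0) / (ratio + 1.0)
    return mean * (1.0 - theta), theta

def polya_aeppli_pmf(n, lam, theta):
    """P(N = n): sum over the number of clumps k of Poisson(k) * P(total size n | k)."""
    if n == 0:
        return math.exp(-lam)
    total, poisson_w = 0.0, math.exp(-lam)
    for k in range(1, n + 1):
        poisson_w *= lam / k                   # Poisson probability of k clumps
        total += poisson_w * math.comb(n - 1, k - 1) * theta ** (n - k) * (1.0 - theta) ** k
    return total

def upper_tail(n_obs, lam, theta):
    """P(N >= n_obs), usable as a P-value for over-representation."""
    return 1.0 - sum(polya_aeppli_pmf(n, lam, theta) for n in range(n_obs))

# Made-up moments for illustration: mean 4, variance 12.
lam, theta = fit_polya_aeppli(4.0, 12.0)
p_over = upper_tail(15, lam, theta)
```

With `theta = 0` the distribution reduces to a plain Poisson, which gives a quick sanity check on the implementation.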

Citations: 0
Selection of statistical thresholds in graphical models.
Pub Date : 2009-01-01 Epub Date: 2010-03-04 DOI: 10.1155/2009/878013
Anthony Almudevar

Reconstruction of gene regulatory networks based on experimental data usually relies on statistical evidence, necessitating the choice of a statistical threshold which defines a significant biological effect. Approaches to this problem found in the literature range from rigorous multiple testing procedures to ad hoc P-value cut-off points. However, when the data implies graphical structure, it should be possible to exploit this feature in the threshold selection process. In this article we propose a procedure based on this principle. Using coding theory we devise a measure of graphical structure, for example, highly connected nodes or chain structure. The measure for a particular graph can be compared to that of a random graph and structure inferred on that basis. By varying the statistical threshold the maximum deviation from random structure can be estimated, and the threshold is then chosen on that basis. A global test for graph structure follows naturally.

Citations: 4
Stochastic simulation of delay-induced circadian rhythms in Drosophila.
Pub Date : 2009-01-01 Epub Date: 2009-07-19 DOI: 10.1155/2009/386853
Zhouyi Xu, Xiaodong Cai

Circadian rhythms are ubiquitous in all eukaryotes and some prokaryotes. Several computational models with or without time delays have been developed for circadian rhythms. Exact stochastic simulations have been carried out for several models without time delays, but no exact stochastic simulation has been done for models with delays. In this paper, we proposed a detailed and a reduced stochastic model with delays for circadian rhythms in Drosophila based on two deterministic models of Smolen et al. and employed exact stochastic simulation to simulate circadian oscillations. Our simulations showed that both models can produce sustained oscillations and that the oscillation is robust to noise in the sense that there is very little variability in the oscillation period, although there are significant random fluctuations in oscillation peaks. Moreover, although average time delays are essential to the simulation of oscillation, random changes in time delays within a certain range around a fixed average time delay cause little variability in the oscillation period. Our simulation results also showed that both models are robust to parameter variations and that oscillation can be entrained by light/dark cycles. Our simulations further demonstrated that, within a reasonable range around the experimental result, the rates at which the dclock and per promoters switch back and forth between activated and repressed states have little impact on the oscillation period.
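
The mechanics of a stochastic simulation with delays can be shown on a toy system. The sketch below is a rejection-style delay simulation of a single protein that represses its own production, with the product appearing a fixed time `tau` after initiation. It is not either of the Drosophila models from the paper: the reaction scheme, the Hill-type repression, and every parameter (`k`, `K`, `h`, `d`, `tau`) are hypothetical choices made only to keep the example short.

```python
import heapq
import random

def delay_ssa(t_end, k=50.0, K=20.0, h=4, d=1.0, tau=5.0, seed=1):
    """Toy delayed negative-feedback loop (hypothetical model and parameters).

    Reactions: initiation at rate k / (1 + (p/K)**h), whose product appears
    tau time units later, and first-order degradation at rate d * p.
    """
    rng = random.Random(seed)
    t, p = 0.0, 0
    pending = []                              # min-heap of scheduled completion times
    times, counts = [0.0], [0]
    while t < t_end:
        a_init = k / (1.0 + (p / K) ** h)
        a_deg = d * p
        a_total = a_init + a_deg              # always > 0 because a_init > 0
        dt = rng.expovariate(a_total)         # waiting time under current propensities
        if pending and pending[0] <= t + dt:
            # A delayed product completes first: jump there, discard dt, resample.
            t = heapq.heappop(pending)
            p += 1
        else:
            t += dt
            if rng.random() * a_total < a_init:
                heapq.heappush(pending, t + tau)   # schedule the delayed product
            else:
                p -= 1                              # degradation (only when p > 0)
        times.append(t)
        counts.append(p)
    return times, counts

times, counts = delay_ssa(t_end=50.0)
```

Plotting `counts` against `times` shows how the molecule number evolves; whether and how regularly it oscillates depends on the assumed parameters.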

Citations: 1
Network structure and biological function: reconstruction, modeling, and statistical approaches.
Pub Date : 2009-01-01 Epub Date: 2009-06-11 DOI: 10.1155/2009/714985
Joachim Selbig, Matthias Steinfath, Dirk Repsilber
Citations: 3