网络中有色图案的异常性评估。

Sophie Schbath, Vincent Lacroix, Marie-France Sagot
{"title":"网络中有色图案的异常性评估。","authors":"Sophie Schbath,&nbsp;Vincent Lacroix,&nbsp;Marie-France Sagot","doi":"10.1155/2009/616234","DOIUrl":null,"url":null,"abstract":"<p><p>Various methods have been recently employed to characterise the structure of biological networks. In particular, the concept of network motif and the related one of coloured motif have proven useful to model the notion of a functional/evolutionary building block. However, algorithms that enumerate all the motifs of a network may produce a very large output, and methods to decide which motifs should be selected for downstream analysis are needed. A widely used method is to assess if the motif is exceptional, that is, over- or under-represented with respect to a null hypothesis. Much effort has been put in the last thirty years to derive P-values for the frequencies of topological motifs, that is, fixed subgraphs. They rely either on (compound) Poisson and Gaussian approximations for the motif count distribution in Erdös-Rényi random graphs or on simulations in other models. We focus on a different definition of graph motifs that corresponds to coloured motifs. A coloured motif is a connected subgraph with fixed vertex colours but unspecified topology. Our work is the first analytical attempt to assess the exceptionality of coloured motifs in networks without any simulation. We first establish analytical formulae for the mean and the variance of the count of a coloured motif in an Erdös-Rényi random graph model. Using simulations under this model, we further show that a Pólya-Aeppli distribution better approximates the distribution of the motif count compared to Gaussian or Poisson distributions. The Pólya-Aeppli distribution, and more generally the compound Poisson distributions, are indeed well designed to model counts of clumping events. Altogether, these results enable to derive a P-value for a coloured motif, without spending time on simulations.</p>","PeriodicalId":72957,"journal":{"name":"EURASIP journal on bioinformatics & systems biology","volume":" ","pages":"616234"},"PeriodicalIF":0.0000,"publicationDate":"2009-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2009/616234","citationCount":"0","resultStr":"{\"title\":\"Assessing the exceptionality of coloured motifs in networks.\",\"authors\":\"Sophie Schbath,&nbsp;Vincent Lacroix,&nbsp;Marie-France Sagot\",\"doi\":\"10.1155/2009/616234\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Various methods have been recently employed to characterise the structure of biological networks. In particular, the concept of network motif and the related one of coloured motif have proven useful to model the notion of a functional/evolutionary building block. However, algorithms that enumerate all the motifs of a network may produce a very large output, and methods to decide which motifs should be selected for downstream analysis are needed. A widely used method is to assess if the motif is exceptional, that is, over- or under-represented with respect to a null hypothesis. Much effort has been put in the last thirty years to derive P-values for the frequencies of topological motifs, that is, fixed subgraphs. They rely either on (compound) Poisson and Gaussian approximations for the motif count distribution in Erdös-Rényi random graphs or on simulations in other models. We focus on a different definition of graph motifs that corresponds to coloured motifs. A coloured motif is a connected subgraph with fixed vertex colours but unspecified topology. Our work is the first analytical attempt to assess the exceptionality of coloured motifs in networks without any simulation. We first establish analytical formulae for the mean and the variance of the count of a coloured motif in an Erdös-Rényi random graph model. Using simulations under this model, we further show that a Pólya-Aeppli distribution better approximates the distribution of the motif count compared to Gaussian or Poisson distributions. The Pólya-Aeppli distribution, and more generally the compound Poisson distributions, are indeed well designed to model counts of clumping events. Altogether, these results enable to derive a P-value for a coloured motif, without spending time on simulations.</p>\",\"PeriodicalId\":72957,\"journal\":{\"name\":\"EURASIP journal on bioinformatics & systems biology\",\"volume\":\" \",\"pages\":\"616234\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1155/2009/616234\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"EURASIP journal on bioinformatics & systems biology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1155/2009/616234\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2009/1/26 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"EURASIP journal on bioinformatics & systems biology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1155/2009/616234","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2009/1/26 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

最近已经采用了各种方法来描述生物网络的结构。特别是,网络基序的概念和相关的彩色基序的概念已被证明有助于建模功能/进化构建块的概念。然而,枚举网络中所有基序的算法可能产生非常大的输出,并且需要确定应该选择哪些基序进行下游分析的方法。一种广泛使用的方法是评估基序是否异常,即相对于零假设而言,是否过度或不足代表。在过去的三十年里,人们花了很多精力来推导拓扑基元(即固定子图)频率的p值。他们要么依靠(复合)泊松和高斯近似在Erdös-Rényi随机图中的基序计数分布,要么依靠其他模型的模拟。我们将重点讨论与彩色图案对应的图形图案的不同定义。彩色母题是顶点颜色固定但拓扑结构未指定的连通子图。我们的工作是首次在没有任何模拟的情况下分析评估网络中有色图案的异常性。我们首先建立了一个Erdös-Rényi随机图模型中彩色图案计数的均值和方差的解析公式。通过在该模型下的模拟,我们进一步表明,与高斯或泊松分布相比,Pólya-Aeppli分布更接近基序计数的分布。Pólya-Aeppli分布,以及更普遍的复合泊松分布,确实被很好地设计用于模拟团块事件的计数。总之,这些结果使我们能够推导出彩色图案的p值,而无需花费时间进行模拟。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

摘要图片

摘要图片

摘要图片

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Assessing the exceptionality of coloured motifs in networks.

Various methods have been recently employed to characterise the structure of biological networks. In particular, the concept of network motif and the related one of coloured motif have proven useful to model the notion of a functional/evolutionary building block. However, algorithms that enumerate all the motifs of a network may produce a very large output, and methods to decide which motifs should be selected for downstream analysis are needed. A widely used method is to assess if the motif is exceptional, that is, over- or under-represented with respect to a null hypothesis. Much effort has been put in the last thirty years to derive P-values for the frequencies of topological motifs, that is, fixed subgraphs. They rely either on (compound) Poisson and Gaussian approximations for the motif count distribution in Erdös-Rényi random graphs or on simulations in other models. We focus on a different definition of graph motifs that corresponds to coloured motifs. A coloured motif is a connected subgraph with fixed vertex colours but unspecified topology. Our work is the first analytical attempt to assess the exceptionality of coloured motifs in networks without any simulation. We first establish analytical formulae for the mean and the variance of the count of a coloured motif in an Erdös-Rényi random graph model. Using simulations under this model, we further show that a Pólya-Aeppli distribution better approximates the distribution of the motif count compared to Gaussian or Poisson distributions. The Pólya-Aeppli distribution, and more generally the compound Poisson distributions, are indeed well designed to model counts of clumping events. Altogether, these results enable to derive a P-value for a coloured motif, without spending time on simulations.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
From protein-protein interactions to protein co-expression networks: a new perspective to evaluate large-scale proteomic data. On biometric systems: electrocardiogram Gaussianity and data synthesis. BCC-NER: bidirectional, contextual clues named entity tagger for gene/protein mention recognition. Review of stochastic hybrid systems with applications in biological systems modeling and analysis. Bayesian inference for biomarker discovery in proteomics: an analytic solution.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1