Sequence alignment: an approximation law for the Z-value with applications to databank scanning

J.N. Bacro, J.P. Comet
Journal: Computers & chemistry, volume 25, issue 4, pages 401-410
Publication date: 2001-07-01
Publication type: Journal Article
DOI: 10.1016/S0097-8485(01)00074-2
URL: https://www.sciencedirect.com/science/article/pii/S0097848501000742
Citations: 25

Abstract

The Z-value is an attempt to estimate the statistical significance of a Smith and Waterman dynamic programming alignment score (H-score) by means of a Monte-Carlo procedure. In this paper, we give an approximation for the law of the Z-value, deduced from the Poisson clumping heuristic developed by Waterman and Vingron (Stat. Sci. 9 (1994) 367), for the comparison of independent and identically distributed sequences. As for non-gapped alignment scores, our approximation is of Gumbel type, but with parameters that are sequence independent. This result clarifies the related experimental results reported by Comet et al. (Comput. Chem. 23 (1999) 317). Using 'quasi-real' sequences (i.e. randomly shuffled sequences of the same length and amino acid composition as the real ones), we investigate the relevance of our approximation. Since the Monte-Carlo approach we use biases the estimate of the Gumbel decay parameter, a correction procedure is proposed. Applications to real sequences are considered, and we show how our results can be used to detect potential biological relationships between real sequences.
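
As a rough illustration of the Monte-Carlo procedure the abstract refers to, the sketch below computes a Z-value as the observed local alignment score minus the mean score against shuffled sequences, divided by the standard deviation of those shuffled scores. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `smith_waterman_score` and `z_value`, the match/mismatch/gap scores, the linear gap penalty, the number of shuffles, and the choice to shuffle only the second sequence are all hypothetical choices made for the example.

```python
import random
import statistics


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (H-score) with a linear gap penalty.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def z_value(a, b, n_shuffles=100, seed=0):
    """Monte-Carlo Z-value: (observed - mean of shuffled scores) / their std.

    Assumption: only sequence b is shuffled, which preserves its length
    and composition (the 'quasi-real' idea described in the abstract).
    """
    rng = random.Random(seed)
    observed = smith_waterman_score(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        shuffled_scores.append(smith_waterman_score(a, "".join(letters)))
    mean = statistics.fmean(shuffled_scores)
    std = statistics.stdev(shuffled_scores)
    if std == 0:
        raise ValueError("shuffled scores have zero variance; Z-value undefined")
    return (observed - mean) / std


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"
    print(z_value(seq, seq))
```

Because the mean and standard deviation are themselves estimated from a finite number of shuffles, the resulting Z-value is a noisy quantity; the abstract notes that this kind of Monte-Carlo estimation biases the fitted Gumbel decay parameter, which is why the authors propose a correction.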
