An incremental algorithm for Z-value computations

J.-C. Aude, A. Louis
Computers & Chemistry, volume 26, issue 5, pages 402-410. Published 2002-07-01. DOI: 10.1016/S0097-8485(02)00003-7
https://www.sciencedirect.com/science/article/pii/S0097848502000037
Citations: 12

Abstract

The Z-value (Comput. Chem. 23 (1999) 333) is an extension of the Z-score classically used to compare sets of biological sequences. It has been used successfully in complete genome studies and in the analysis of large protein sets. Its computation relies on a Monte Carlo approach to estimate the statistical significance of a Smith-Waterman alignment score. Comet et al. (Comput. Chem. 23 (1999) 333) showed that, in contrast to the alignment score, the Z-value largely reduces the bias due to the lengths and compositions of the sequences. They also described an estimator of the deviation of Z-values, which we extend in this paper to optimize Z-value computation. The incremental algorithm described here combines two characteristics that are usually incompatible: (i) it improves the accuracy of Z-value calculation; (ii) it reduces the time complexity. The algorithm is named incremental because it iteratively adds random sequences to the Monte Carlo process when needed. Results are presented from the all-by-all comparison of the proteins of Saccharomyces cerevisiae and Escherichia coli.
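
The general idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring scheme, the deviation estimator, or the stopping rule, so everything specific below is an assumption made for illustration. In particular, `sw_score` uses a toy match/mismatch scheme with a linear gap penalty instead of a substitution matrix, only one of the two sequences is shuffled, and the hypothetical `incremental_z_value` stops once the estimate is more than two approximate standard errors away from a hypothetical decision `threshold`, using a normal-theory approximation of the error of a standardized score.

```python
import math
import random


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    Toy scoring parameters; an assumption, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def incremental_z_value(a, b, threshold=8.0, batch=20, max_shuffles=200, seed=0):
    """Estimate Z = (score - mean) / std over shuffled copies of b.

    Shuffles are added in batches only while the estimate is still
    ambiguous with respect to `threshold` (hypothetical stopping rule).
    Returns the estimate and the number of shuffles actually used.
    """
    rng = random.Random(seed)
    observed = sw_score(a, b)
    scores = []
    z = 0.0
    while len(scores) < max_shuffles:
        for _ in range(batch):
            chars = list(b)
            rng.shuffle(chars)  # preserves length and composition of b
            scores.append(sw_score(a, "".join(chars)))
        n = len(scores)
        mean = sum(scores) / n
        var = sum((s - mean) ** 2 for s in scores) / (n - 1)
        if var == 0:
            continue  # degenerate sample so far; draw another batch
        z = (observed - mean) / math.sqrt(var)
        # Approximate standard error of z, assuming roughly normal scores.
        se = math.sqrt((1 + z * z / 2) / n)
        if abs(z - threshold) > 2 * se:
            break  # estimate is clearly above or below the threshold
    return z, len(scores)


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"
    z, used = incremental_z_value(seq, seq)
    print(f"Z estimate: {z:.2f} using {used} shuffled sequences")
```

With a rule of this kind, pairs whose Z estimate is far from the threshold are settled after the first batch, while borderline pairs receive more random sequences. That is one way an incremental scheme can reduce total work while spending extra effort where the estimate matters most.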
