利用序列对特定距离替代矩阵分析局部序列比对的两两统计显著性

Ankit Agrawal, Xiaoqiu Huang
{"title":"利用序列对特定距离替代矩阵分析局部序列比对的两两统计显著性","authors":"Ankit Agrawal, Xiaoqiu Huang","doi":"10.1109/ICIT.2008.63","DOIUrl":null,"url":null,"abstract":"Pairwise sequence alignment forms the basis of numerous other applications in bioinformatics. The quality of an alignment is gauged by statistical significance rather than by alignment score alone. Therefore, accurate estimation of statistical significance of a pairwise alignment is an important problem in sequence comparison. Recently, it was shown that pairwise statistical significance does better in practice than database statistical significance, and also provides quicker individual pairwise estimates of statistical significance without having to perform time-consuming database search. Under an evolutionary model, a substitution matrix can be derived using a rate matrix and a fixed distance. Although the commonly used substitution matrices like BLOSUM62, etc. were not originally derived from a rate matrix under an evolutionary model, the corresponding rate matrices can be back calculated. Many researchers have derived different rate matrices using different methods and data. In this paper, we show that pairwise statistical significance using rate matrices with sequence-pair-specific distance performs significantly better compared to using a fixed distance. Pairwise statistical significance using sequence-pair-specific distanced substitution matrices also outperforms database statistical significance reported by BLAST.","PeriodicalId":184201,"journal":{"name":"2008 International Conference on Information Technology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Pairwise Statistical Significance of Local Sequence Alignment Using Substitution Matrices with Sequence-Pair-Specific Distance\",\"authors\":\"Ankit Agrawal, Xiaoqiu Huang\",\"doi\":\"10.1109/ICIT.2008.63\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Pairwise sequence alignment forms the basis of numerous other applications in bioinformatics. The quality of an alignment is gauged by statistical significance rather than by alignment score alone. Therefore, accurate estimation of statistical significance of a pairwise alignment is an important problem in sequence comparison. Recently, it was shown that pairwise statistical significance does better in practice than database statistical significance, and also provides quicker individual pairwise estimates of statistical significance without having to perform time-consuming database search. Under an evolutionary model, a substitution matrix can be derived using a rate matrix and a fixed distance. Although the commonly used substitution matrices like BLOSUM62, etc. were not originally derived from a rate matrix under an evolutionary model, the corresponding rate matrices can be back calculated. Many researchers have derived different rate matrices using different methods and data. In this paper, we show that pairwise statistical significance using rate matrices with sequence-pair-specific distance performs significantly better compared to using a fixed distance. Pairwise statistical significance using sequence-pair-specific distanced substitution matrices also outperforms database statistical significance reported by BLAST.\",\"PeriodicalId\":184201,\"journal\":{\"name\":\"2008 International Conference on Information Technology\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-12-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 International Conference on Information Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICIT.2008.63\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 International Conference on Information Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICIT.2008.63","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13

摘要

成对序列比对构成了生物信息学中许多其他应用的基础。校准的质量是通过统计显著性来衡量的,而不仅仅是校准分数。因此,准确估计两两比对的统计显著性是序列比较中的一个重要问题。最近,研究表明,在实践中,两两统计显著性比数据库统计显著性表现得更好,并且还提供了更快的个体两两统计显著性估计,而无需执行耗时的数据库搜索。在进化模型下,可以用速率矩阵和固定距离推导出替换矩阵。虽然BLOSUM62等常用的替代矩阵最初不是由进化模型下的速率矩阵推导出来的,但是相应的速率矩阵是可以反求的。许多研究者使用不同的方法和数据推导出了不同的速率矩阵。在本文中,我们证明了使用具有序列对特定距离的速率矩阵的两两统计显著性比使用固定距离的两两统计显著性表现得更好。使用序列对特定距离替代矩阵的两两统计显著性也优于BLAST报告的数据库统计显著性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Pairwise Statistical Significance of Local Sequence Alignment Using Substitution Matrices with Sequence-Pair-Specific Distance
Pairwise sequence alignment forms the basis of numerous other applications in bioinformatics. The quality of an alignment is gauged by statistical significance rather than by alignment score alone. Therefore, accurate estimation of statistical significance of a pairwise alignment is an important problem in sequence comparison. Recently, it was shown that pairwise statistical significance does better in practice than database statistical significance, and also provides quicker individual pairwise estimates of statistical significance without having to perform time-consuming database search. Under an evolutionary model, a substitution matrix can be derived using a rate matrix and a fixed distance. Although the commonly used substitution matrices like BLOSUM62, etc. were not originally derived from a rate matrix under an evolutionary model, the corresponding rate matrices can be back calculated. Many researchers have derived different rate matrices using different methods and data. In this paper, we show that pairwise statistical significance using rate matrices with sequence-pair-specific distance performs significantly better compared to using a fixed distance. Pairwise statistical significance using sequence-pair-specific distanced substitution matrices also outperforms database statistical significance reported by BLAST.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Overheads and Mean Route Failure Time of a Hybrid Protocol for Node-Disjoint Multipath Routing in Mobile Ad Hoc Networks Integrated Genomic Island Prediction Tool (IGIPT) Assignment of Cells to Switches in a Cellular Mobile Environment Using Swarm Intelligence Prediction of Protein Functional Sites Using Novel String Kernels Pairwise DNA Alignment with Sequence Specific Transition-Transversion Ratio Using Multiple Parameter Sets
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1