基于连锁不平衡准则的全基因组标记snp选择算法。

Lan Liu, Yonghui Wu, S. Lonardi, Tao Jiang
{"title":"基于连锁不平衡准则的全基因组标记snp选择算法。","authors":"Lan Liu, Yonghui Wu, S. Lonardi, Tao Jiang","doi":"10.1142/9781860948732_0011","DOIUrl":null,"url":null,"abstract":"In this paper, we study the tagSNP selection problem on multiple populations using the pairwise r(2) linkage disequilibrium criterion. We propose a novel combinatorial optimization model for the tagSNP selection problem, called the minimum common tagSNP selection (MCTS) problem, and present efficient solutions for MCTS. Our approach consists of three main steps including (i) partitioning the SNP markers into small disjoint components, (ii) applying some data reduction rules to simplify the problem, and (iii) applying either a fast greedy algorithm or a Lagrangian relaxation algorithm to solve the remaining (general) MCTS. These algorithms also provide lower bounds on tagging (i.e. the minimum number of tagSNPs needed). The lower bounds allow us to evaluate how far our solution is from the optimum. To the best of our knowledge, it is the first time tagging lower bounds are discussed in the literature. We assess the performance of our algorithms on real HapMap data for genome-wide tagging. The experiments demonstrate that our algorithms run 3 to 4 orders of magnitude faster than the existing single-population tagging programs like FESTA, LD-Select and the multiple-population tagging method MultiPop-TagSelect. Our method also greatly reduces the required tagSNPs compared to LD-Select on a single population and MultiPop-TagSelect on multiple populations. Moreover, the numbers of tagSNPs selected by our algorithms are almost optimal since they are very close to the corresponding lower bounds obtained by our method.","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"6 1","pages":"67-78"},"PeriodicalIF":0.0000,"publicationDate":"2007-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Efficient algorithms for genome-wide tagSNP selection across populations via the linkage disequilibrium criterion.\",\"authors\":\"Lan Liu, Yonghui Wu, S. Lonardi, Tao Jiang\",\"doi\":\"10.1142/9781860948732_0011\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we study the tagSNP selection problem on multiple populations using the pairwise r(2) linkage disequilibrium criterion. We propose a novel combinatorial optimization model for the tagSNP selection problem, called the minimum common tagSNP selection (MCTS) problem, and present efficient solutions for MCTS. Our approach consists of three main steps including (i) partitioning the SNP markers into small disjoint components, (ii) applying some data reduction rules to simplify the problem, and (iii) applying either a fast greedy algorithm or a Lagrangian relaxation algorithm to solve the remaining (general) MCTS. These algorithms also provide lower bounds on tagging (i.e. the minimum number of tagSNPs needed). The lower bounds allow us to evaluate how far our solution is from the optimum. To the best of our knowledge, it is the first time tagging lower bounds are discussed in the literature. We assess the performance of our algorithms on real HapMap data for genome-wide tagging. The experiments demonstrate that our algorithms run 3 to 4 orders of magnitude faster than the existing single-population tagging programs like FESTA, LD-Select and the multiple-population tagging method MultiPop-TagSelect. Our method also greatly reduces the required tagSNPs compared to LD-Select on a single population and MultiPop-TagSelect on multiple populations. Moreover, the numbers of tagSNPs selected by our algorithms are almost optimal since they are very close to the corresponding lower bounds obtained by our method.\",\"PeriodicalId\":72665,\"journal\":{\"name\":\"Computational systems bioinformatics. Computational Systems Bioinformatics Conference\",\"volume\":\"6 1\",\"pages\":\"67-78\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computational systems bioinformatics. Computational Systems Bioinformatics Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1142/9781860948732_0011\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/9781860948732_0011","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

摘要

本文利用配对r(2)连锁不平衡准则研究了多种群的标签snp选择问题。针对标签snp选择问题,我们提出了一种新的组合优化模型,称为最小共同标签snp选择(MCTS)问题,并给出了MCTS的有效解决方案。我们的方法包括三个主要步骤,包括(i)将SNP标记划分为小的不连接的组件,(ii)应用一些数据约简规则来简化问题,以及(iii)应用快速贪婪算法或拉格朗日松弛算法来解决剩余的(一般)MCTS。这些算法还提供了标签的下限(即所需的标签snp的最小数量)。下界允许我们计算解离最优解的距离。据我们所知,这是文献中第一次讨论标注下界。我们评估了我们的算法在全基因组标记的真实HapMap数据上的性能。实验表明,我们的算法运行速度比现有的单种群标记程序(如FESTA, LD-Select和多种群标记方法MultiPop-TagSelect)快3到4个数量级。与单种群的LD-Select和多种群的MultiPop-TagSelect相比,我们的方法也大大减少了所需的tagsnp。此外,我们的算法选择的标签snp数量几乎是最优的,因为它们非常接近我们的方法得到的相应下界。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Efficient algorithms for genome-wide tagSNP selection across populations via the linkage disequilibrium criterion.
In this paper, we study the tagSNP selection problem on multiple populations using the pairwise r(2) linkage disequilibrium criterion. We propose a novel combinatorial optimization model for the tagSNP selection problem, called the minimum common tagSNP selection (MCTS) problem, and present efficient solutions for MCTS. Our approach consists of three main steps including (i) partitioning the SNP markers into small disjoint components, (ii) applying some data reduction rules to simplify the problem, and (iii) applying either a fast greedy algorithm or a Lagrangian relaxation algorithm to solve the remaining (general) MCTS. These algorithms also provide lower bounds on tagging (i.e. the minimum number of tagSNPs needed). The lower bounds allow us to evaluate how far our solution is from the optimum. To the best of our knowledge, it is the first time tagging lower bounds are discussed in the literature. We assess the performance of our algorithms on real HapMap data for genome-wide tagging. The experiments demonstrate that our algorithms run 3 to 4 orders of magnitude faster than the existing single-population tagging programs like FESTA, LD-Select and the multiple-population tagging method MultiPop-TagSelect. Our method also greatly reduces the required tagSNPs compared to LD-Select on a single population and MultiPop-TagSelect on multiple populations. Moreover, the numbers of tagSNPs selected by our algorithms are almost optimal since they are very close to the corresponding lower bounds obtained by our method.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Novel Gene Discovery in the Human Malaria Parasite using Nucleosome Positioning Data. Estimating support for protein-protein interaction data with applications to function prediction. On the accurate construction of consensus genetic maps. Efficient haplotype inference from pedigrees with missing data using linear systems with disjoint-set data structures. Knowledge representation and data mining for biological imaging.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1