使用个性化二倍体参考基因组进行RNA测序数据的读图定位,减少了检测等位基因特异性表达的偏差。

Shuai Yuan, Zhaohui Qin
{"title":"使用个性化二倍体参考基因组进行RNA测序数据的读图定位,减少了检测等位基因特异性表达的偏差。","authors":"Shuai Yuan,&nbsp;Zhaohui Qin","doi":"10.1109/BIBMW.2012.6470225","DOIUrl":null,"url":null,"abstract":"<p><p>Next generation sequencing (NGS) technologies have been applied extensively in many areas of genetics and genomics research. A fundamental problem when comes to analyzing NGS data is mapping short sequencing reads back to the reference genome. Most of existing software packages rely on a single uniform reference genome and do not automatically take into the consideration of genetic variants. On the other hand, large proportions of incorrectly mapped reads affect the correct interpretation of the NGS experimental results. As an example, Degner et al. showed that detecting allele-specific expression from RNA sequencing data was biased toward the reference allele. In this study, we developed a method that utilize DirectX 11 enabled graphics processing unit (GPU)'s parallel computing power to produces a personalized diploid reference genome based on all known genetic variants of that particular individual. We show that using such a personalized diploid reference genome can improve mapping accuracy and significantly reduce the bias toward reference allele in allele-specific expression analysis. Our method can be applied to any individual that has genotype information obtained either from array-based genotyping or resequencing. Besides the reference genome, no additional changes to alignment algorithm are needed for performing read mapping therefore one can utilize any of the existing read mapping tools and achieve the improved read mapping result. C++ and GPU compute shader source code of the software program is available at: http://code.google.com/p/diploid-mapping/downloads/list.</p>","PeriodicalId":73283,"journal":{"name":"IEEE International Conference on Bioinformatics and Biomedicine workshops. IEEE International Conference on Bioinformatics and Biomedicine","volume":"2012 ","pages":"718-724"},"PeriodicalIF":0.0000,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/BIBMW.2012.6470225","citationCount":"15","resultStr":"{\"title\":\"Read-mapping using personalized diploid reference genome for RNA sequencing data reduced bias for detecting allele-specific expression.\",\"authors\":\"Shuai Yuan,&nbsp;Zhaohui Qin\",\"doi\":\"10.1109/BIBMW.2012.6470225\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Next generation sequencing (NGS) technologies have been applied extensively in many areas of genetics and genomics research. A fundamental problem when comes to analyzing NGS data is mapping short sequencing reads back to the reference genome. Most of existing software packages rely on a single uniform reference genome and do not automatically take into the consideration of genetic variants. On the other hand, large proportions of incorrectly mapped reads affect the correct interpretation of the NGS experimental results. As an example, Degner et al. showed that detecting allele-specific expression from RNA sequencing data was biased toward the reference allele. In this study, we developed a method that utilize DirectX 11 enabled graphics processing unit (GPU)'s parallel computing power to produces a personalized diploid reference genome based on all known genetic variants of that particular individual. We show that using such a personalized diploid reference genome can improve mapping accuracy and significantly reduce the bias toward reference allele in allele-specific expression analysis. Our method can be applied to any individual that has genotype information obtained either from array-based genotyping or resequencing. Besides the reference genome, no additional changes to alignment algorithm are needed for performing read mapping therefore one can utilize any of the existing read mapping tools and achieve the improved read mapping result. C++ and GPU compute shader source code of the software program is available at: http://code.google.com/p/diploid-mapping/downloads/list.</p>\",\"PeriodicalId\":73283,\"journal\":{\"name\":\"IEEE International Conference on Bioinformatics and Biomedicine workshops. IEEE International Conference on Bioinformatics and Biomedicine\",\"volume\":\"2012 \",\"pages\":\"718-724\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1109/BIBMW.2012.6470225\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE International Conference on Bioinformatics and Biomedicine workshops. IEEE International Conference on Bioinformatics and Biomedicine\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/BIBMW.2012.6470225\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE International Conference on Bioinformatics and Biomedicine workshops. IEEE International Conference on Bioinformatics and Biomedicine","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BIBMW.2012.6470225","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15

摘要

下一代测序技术已广泛应用于遗传学和基因组学研究的许多领域。在分析NGS数据时,一个基本问题是将短测序读数映射回参考基因组。大多数现有的软件包依赖于一个单一的统一的参考基因组,并没有自动考虑到遗传变异。另一方面,较大比例的错误reads影响了NGS实验结果的正确解释。例如,Degner等人表明,从RNA测序数据中检测等位基因特异性表达偏向于参考等位基因。在这项研究中,我们开发了一种方法,利用DirectX 11支持的图形处理单元(GPU)的并行计算能力,基于该特定个体的所有已知遗传变异产生个性化的二倍体参考基因组。我们发现,使用这种个性化的二倍体参考基因组可以提高定位精度,并显著减少等位基因特异性表达分析中对参考等位基因的偏倚。我们的方法可以应用于任何从基于阵列的基因分型或重测序获得基因型信息的个体。除了参考基因组外,不需要对比对算法进行额外的更改,因此可以利用任何现有的读映射工具来实现改进的读映射结果。c++和GPU计算着色器的软件程序源代码可在:http://code.google.com/p/diploid-mapping/downloads/list。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

摘要图片

摘要图片

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Read-mapping using personalized diploid reference genome for RNA sequencing data reduced bias for detecting allele-specific expression.

Next generation sequencing (NGS) technologies have been applied extensively in many areas of genetics and genomics research. A fundamental problem when comes to analyzing NGS data is mapping short sequencing reads back to the reference genome. Most of existing software packages rely on a single uniform reference genome and do not automatically take into the consideration of genetic variants. On the other hand, large proportions of incorrectly mapped reads affect the correct interpretation of the NGS experimental results. As an example, Degner et al. showed that detecting allele-specific expression from RNA sequencing data was biased toward the reference allele. In this study, we developed a method that utilize DirectX 11 enabled graphics processing unit (GPU)'s parallel computing power to produces a personalized diploid reference genome based on all known genetic variants of that particular individual. We show that using such a personalized diploid reference genome can improve mapping accuracy and significantly reduce the bias toward reference allele in allele-specific expression analysis. Our method can be applied to any individual that has genotype information obtained either from array-based genotyping or resequencing. Besides the reference genome, no additional changes to alignment algorithm are needed for performing read mapping therefore one can utilize any of the existing read mapping tools and achieve the improved read mapping result. C++ and GPU compute shader source code of the software program is available at: http://code.google.com/p/diploid-mapping/downloads/list.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Diurnal Pain Classification in Critically Ill Patients using Machine Learning on Accelerometry and Analgesic Data. Transmission cluster characteristics of global, regional, and lineage-specific SARS-CoV-2 phylogenies. Document-level DDI relation extraction with document-entity embedding The Network Pharmacological Mechanism of Yizhiningshen Oral Liquid in the Treatment of Tic Disorders Study on the Medication Law of Traditional Chinese medicine treating Lumbago based on TCM electronic medical record
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1