An Error Correction Algorithm for NGS Data

M. Kchouk, J. Gibrat, M. Elloumi
{"title":"An Error Correction Algorithm for NGS Data","authors":"M. Kchouk, J. Gibrat, M. Elloumi","doi":"10.1109/DEXA.2017.33","DOIUrl":null,"url":null,"abstract":"The Oxford Nanopore and Pacbio SMRT sequencing technologies has revolutionized the Next-Generation Sequencing (NGS) environment by producing long reads that exceed 60 kbp and helped to the completion of many biological projects. But, long reads are characterized by a high error rate which increases the difficulty of biological problems like the genome assembly problem. Error correction of long reads has become a challenge for bioinformaticians, which motivates the development of new approaches for error correction adapted to NGS technologies. In this paper, we present a new denovo self-error correction algorithm using only long reads. Our algorithm operates in two steps: First, we use a fast hashing method which allows to find alignments between the longest reads and other reads in a set of long reads. Next, we use the longest reads as seeds to obtain the final alignment of long reads by using a dynamic programming algorithm in a band of width w. Our error correction algorithm does not require high quality reads, in contrast to existing hybrid error correction ones.","PeriodicalId":127009,"journal":{"name":"2017 28th International Workshop on Database and Expert Systems Applications (DEXA)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 28th International Workshop on Database and Expert Systems Applications (DEXA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DEXA.2017.33","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

The Oxford Nanopore and Pacbio SMRT sequencing technologies has revolutionized the Next-Generation Sequencing (NGS) environment by producing long reads that exceed 60 kbp and helped to the completion of many biological projects. But, long reads are characterized by a high error rate which increases the difficulty of biological problems like the genome assembly problem. Error correction of long reads has become a challenge for bioinformaticians, which motivates the development of new approaches for error correction adapted to NGS technologies. In this paper, we present a new denovo self-error correction algorithm using only long reads. Our algorithm operates in two steps: First, we use a fast hashing method which allows to find alignments between the longest reads and other reads in a set of long reads. Next, we use the longest reads as seeds to obtain the final alignment of long reads by using a dynamic programming algorithm in a band of width w. Our error correction algorithm does not require high quality reads, in contrast to existing hybrid error correction ones.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
一种NGS数据纠错算法
牛津纳米孔和Pacbio SMRT测序技术通过产生超过60 kbp的长reads,彻底改变了下一代测序(NGS)环境,并帮助完成了许多生物项目。但是,长读取具有高错误率的特点,这增加了基因组组装等生物学问题的难度。长读段的纠错已成为生物信息学家面临的一个挑战,这促使了适应NGS技术的纠错新方法的发展。在本文中,我们提出了一种新的仅使用长读的denovo自纠错算法。我们的算法分两步操作:首先,我们使用快速哈希方法,该方法允许查找长读集合中最长读和其他读之间的对齐。接下来,我们将最长的读取作为种子,通过动态规划算法在宽度为w的频带内获得长读取的最终对齐。与现有的混合纠错算法相比,我们的纠错算法不需要高质量的读取。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
MuMs: Energy-Aware VM Selection Scheme for Cloud Data Center Biclustering of Biological Sequences Global and Local Feature Learning for Ego-Network Analysis Evaluation of Contextualization and Diversification Approaches in Aggregated Search Towards a Cloud of Clouds Elasticity Management System
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1