NextPolish2: A Repeat-aware Polishing Tool for Genomes Assembled Using HiFi Long Reads

IF 11.5 | Zone 2 (Biology) | JCR Q1, GENETICS & HEREDITY | Genomics, Proteomics & Bioinformatics | Pub Date: 2024-01-04 | DOI: 10.1093/gpbjnl/qzad009
Jiang Hu, Zhuo Wang, Fan Liang, Shan-Lin Liu, Kai Ye, De-Peng Wang
Citations: 0
Abstract: The high-fidelity (HiFi) long-read sequencing technology developed by PacBio has greatly improved the base-level accuracy of genome assemblies. However, these assemblies still contain base-level errors, particularly within the error-prone regions of HiFi long reads. Existing genome polishing tools usually introduce overcorrections and haplotype switch errors when correcting errors in genomes assembled from HiFi long reads. Here we describe an upgraded genome polishing tool, NextPolish2, which can fix the base errors remaining in those “highly accurate” genomes assembled from HiFi long reads without introducing excessive overcorrections and haplotype switch errors. We believe that NextPolish2 is of great significance for further improving the accuracy of telomere-to-telomere (T2T) genomes. NextPolish2 is freely available at https://github.com/Nextomics/NextPolish2.
Source journal: Genomics, Proteomics & Bioinformatics (Biochemistry, Genetics and Molecular Biology - Biochemistry)
CiteScore: 14.30
Self-citation rate: 4.20%
Articles published: 844
Review time: 61 days
About the journal: Genomics, Proteomics and Bioinformatics (GPB) is the official journal of the Beijing Institute of Genomics, Chinese Academy of Sciences / China National Center for Bioinformation and Genetics Society of China. It aims to disseminate new developments in the field of omics and bioinformatics, publish high-quality discoveries quickly, and promote open access and online publication. GPB welcomes submissions in all areas of life science, biology, and biomedicine, with a focus on large data acquisition, analysis, and curation. Manuscripts covering omics and related bioinformatics topics are particularly encouraged. GPB is indexed/abstracted by PubMed/MEDLINE, PubMed Central, Scopus, BIOSIS Previews, Chemical Abstracts, and CSCD, among others.