Ten common issues with reference sequence databases and how to mitigate them

Samuel D. Chorlton
Frontiers in Bioinformatics. Published 2024-03-15. DOI: 10.3389/fbinf.2024.1278228 (https://doi.org/10.3389/fbinf.2024.1278228)

Abstract

Metagenomic sequencing has revolutionized our understanding of microbiology. While metagenomic tools and approaches have been extensively evaluated and benchmarked, far less attention has been given to the reference sequence database used in metagenomic classification. Issues with reference sequence databases are pervasive. Database contamination is the most recognized issue in the literature; however, it remains relatively unmitigated in most analyses. Other common issues with reference sequence databases include taxonomic errors, inappropriate inclusion and exclusion criteria, and sequence content errors. This review covers ten common issues with reference sequence databases and the potential downstream consequences of these issues. Mitigation measures are discussed for each issue, including bioinformatic tools and database curation strategies. Together, these strategies present a path towards more accurate, reproducible and translatable metagenomic sequencing.