Managing false positives during detection of pathogen sequences in shotgun metagenomics datasets.

BMC Bioinformatics · Pub Date: 2024-12-03 · DOI: 10.1186/s12859-024-05952-x · IF 2.9 · Zone 3 (Biology) · JCR Q2 (Biochemical Research Methods)
Lauren M Bradford, Catherine Carrillo, Alex Wong
BMC Bioinformatics 25(1):372. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11613480/pdf/

Abstract

Background: Culture-independent diagnostic tests are gaining popularity as tools for detecting pathogens in food. Shotgun sequencing holds substantial promise for food testing as it provides abundant information on microbial communities, but the challenge is in analyzing large and complex sequencing datasets with a high degree of both sensitivity and specificity. Falsely classifying sequencing reads as originating from pathogens can lead to unnecessary food recalls or production shutdowns, while low sensitivity resulting in false negatives could lead to preventable illness.
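
As a brief illustration of the two quantities named above, the sketch below computes sample-level sensitivity and specificity from confusion-matrix counts. The function names and the example counts are hypothetical and are not taken from the paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly contaminated samples that are called positive."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no positive samples to evaluate")
    return true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of pathogen-free samples that are called negative."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no negative samples to evaluate")
    return true_neg / total


# Hypothetical counts: 10 spiked samples (9 detected), 50 clean samples (5 false alarms).
example_sensitivity = sensitivity(true_pos=9, false_neg=1)
example_specificity = specificity(true_neg=45, false_pos=5)
```

A false positive lowers specificity (the recall or shutdown risk), while a false negative lowers sensitivity (the illness risk), so the two have to be reported together.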

Results: We used simulated and published shotgun sequencing datasets containing Salmonella-derived reads to explore the appearance and mitigation of false-positive results using the popular taxonomic annotation tools Kraken2 and Metaphlan4. Using default parameters, Kraken2 is sensitive but prone to false positives, while Metaphlan4 is more specific but unable to detect Salmonella at low abundance. We then developed a bioinformatic pipeline for identifying and removing reads falsely identified as Salmonella by Kraken2 while retaining high sensitivity. Carefully considering software parameters and database choices is essential to avoiding false-positive sample calls. With well-chosen parameters plus additional steps to confirm the taxonomic origin of reads, it is possible to detect pathogens with very high specificity and sensitivity.
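
The abstract does not spell out the pipeline's steps, so the following is only a minimal sketch of one generic way to tighten per-read calls, not the authors' method. It assumes Kraken2's standard per-read output with numeric taxonomy IDs (that is, run without `--use-names`) and keeps a read only if a minimum fraction of its unambiguous k-mers map to a caller-supplied set of target taxids. This is similar in spirit to Kraken2's `--confidence` option, although here the clade is a flat set rather than a subtree of the taxonomy. The function names, the 0.5 threshold, and the example lines are all hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def kmer_support(lca_field: str, clade_taxids: Set[int]) -> Tuple[int, int]:
    """Return (k-mers mapped to the clade, k-mers without ambiguous bases)."""
    hits = 0
    total = 0
    for token in lca_field.split():
        if token == "|:|":  # separator between mates in paired-end output
            continue
        taxid, count = token.split(":")
        if taxid == "A":  # k-mers containing ambiguous nucleotides
            continue
        n = int(count)
        total += n
        if int(taxid) in clade_taxids:
            hits += n
    return hits, total


def filter_reads(lines: Iterable[str], clade_taxids: Set[int],
                 min_fraction: float = 0.5) -> List[str]:
    """Return IDs of reads assigned to the clade with enough k-mer support."""
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[0] != "C":  # skip unclassified reads
            continue
        if int(fields[2]) not in clade_taxids:
            continue
        hits, total = kmer_support(fields[4], clade_taxids)
        if total > 0 and hits / total >= min_fraction:
            kept.append(fields[1])
    return kept


# Illustrative taxids: 590 (Salmonella), 28901 (Salmonella enterica), 562 (Escherichia coli).
SALMONELLA = {590, 28901}
EXAMPLE = [
    "C\tread1\t28901\t150\t28901:60 590:20 0:36",  # strong support, kept
    "C\tread2\t590\t150\t590:5 562:40 0:71",       # weak support, dropped
    "U\tread3\t0\t150\t0:116",                     # unclassified, skipped
    "C\tread4\t562\t150\t562:100 0:16",            # assigned elsewhere, skipped
]
confirmed = filter_reads(EXAMPLE, SALMONELLA)
```

Raising `min_fraction` trades sensitivity for specificity, which is the same tension the results describe between Kraken2 and Metaphlan4 at default settings.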
