217 closed Salmonella reference genomes using PacBio sequencing.

BMC genomic data (IF 2.5; JCR Q3, Genetics & Heredity). Pub Date: 2025-02-28. DOI: 10.1186/s12863-025-01304-7
Yan Luo, Jae Hee Jang, Maria Balkey, Maria Hoffmann
BMC genomic data 26(1):15. Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11871702/pdf/

Abstract

Objectives: Whole-genome sequencing (WGS) is widely used in food safety for the detection, investigation, and control of foodborne bacterial pathogens. However, the WGS data in most public databases, such as those of the National Center for Biotechnology Information (NCBI), consist primarily of Illumina short reads. Short-read data lack important information about repetitive regions, structural variations, and mobile genetic elements, and often cannot establish the genomic location of important genes such as antimicrobial resistance (AMR) genes and virulence genes. To address this limitation, we contributed 217 closed circular Salmonella enterica genomes, generated using PacBio sequencing, to the NCBI Pathogen Detection (PD) database and GenBank. This dataset improves the accuracy of the genome representations in the database.

Data description: High-quality complete reference genomes generated from PacBio long reads provide essential details that are not available in draft genomes assembled from short reads. A complete reference genome supports more accurate data analysis and lets researchers connect genome variations to known genes, regulatory elements, and other genomic features. The 217 complete genomes added here come from 78 different Salmonella serovars, and each represents either a distinct SNP cluster within the NCBI PD database or a unique strain, which significantly enriches the diversity of the reference genome database.
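
The contrast between a closed genome and a short-read draft can be illustrated with a minimal sketch. This is not the authors' pipeline: the helper names `read_fasta_lengths` and `n50` are hypothetical, and the two FASTA strings are toy sequences invented for illustration. It only shows that a closed assembly has one record per replicon (chromosome plus any plasmids), whereas a draft assembly of the same total size is split into many contigs and therefore has a lower N50.

```python
from io import StringIO


def read_fasta_lengths(handle):
    """Return {record_id: sequence_length} for a FASTA text stream."""
    lengths = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token of the header.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def n50(lengths):
    """Length L such that records of length >= L cover at least half the assembly."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


# Toy data only (hypothetical sequences, not from the 217-genome dataset).
closed = ">chromosome\nACGTACGTACGTACGTACGT\n>plasmid1\nACGTAC\n"
draft = ">c1\nACGTACGT\n>c2\nACGTAC\n>c3\nACGT\n>c4\nACGT\n>c5\nAC\n>c6\nAC\n"

for label, text in (("closed", closed), ("draft", draft)):
    sizes = read_fasta_lengths(StringIO(text))
    print(label, "records:", len(sizes), "N50:", n50(sizes.values()))
```

In a real closed assembly, each record corresponds to a complete replicon, so a gene's record tells you whether it sits on the chromosome or on a plasmid. A fragmented draft usually cannot provide that.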
