BMC genomic data: latest articles, page 7

Genomic typing, antimicrobial resistance gene, virulence factor and plasmid replicon database for the important pathogenic bacteria Staphylococcus aureus.

IF 2.5 Q3 GENETICS & HEREDITY

BMC genomic data

Pub Date : 2025-09-26 DOI: 10.1186/s12863-025-01363-w

Andrey Shelenkov, Anna Slavokhotova, Mariyam Yunusova, Vladimir Kulikov, Yulia Mikhaylova, Vasiliy Akimkin

Background: Bacterial infections pose a global health threat across clinical and community settings. Over the past decade, the alarming expansion of antimicrobial resistance (AMR) has progressively narrowed therapeutic options, particularly for healthcare-associated infections. This critical situation has been formally recognized by the World Health Organization as a major public health concern. Epidemiological studies have demonstrated that the dissemination of AMR is frequently mediated by specific high-risk bacterial lineages, often designated as "global clones" or "clonal complexes." Consequently, surveillance of these epidemic clones and elucidation of their pathogenic mechanisms and AMR acquisition pathways have become essential research priorities. The advent of whole genome sequencing has revolutionized these investigations, enabling comprehensive epidemiological tracking and detailed analysis of mobile genetic elements responsible for resistance gene transfer. However, despite the exponential increase in available bacterial genome sequences, significant challenges persist. Current genomic datasets often suffer from uneven representation of clinically relevant strains and inconsistent availability of accompanying metadata. These limitations create substantial obstacles for large-scale comparative studies and hinder effective surveillance efforts.

Description: This database represents a comprehensive genomic analysis of 98,950 Staphylococcus aureus isolates, a high-priority bacterial pathogen of global clinical significance. We provide detailed isolate characterization through several established typing schemes including multilocus sequence typing (MLST), clonal complex (CC) assignments, spa typing results, and core genome MLST (cgMLST) profiles. The dataset also documents the presence of CRISPR-Cas systems in these isolates. Beyond fundamental typing data, our resource incorporates the distribution of antimicrobial resistance determinants, virulence factors, and plasmid replicons. These systematically curated genomic features offer researchers valuable insights into isolate epidemiology, resistance mechanisms, and horizontal gene transfer patterns in this highly concerning pathogen.
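To illustrate what an MLST assignment involves, here is a minimal sketch of mapping a seven-locus allelic profile to a sequence type. The locus names are those of the standard S. aureus MLST scheme; the lookup table, the labels "ST-A" and "ST-B", and the function name assign_st are hypothetical placeholders and do not reflect the layout of the database described above.

```python
from typing import Dict, Optional, Tuple

# The seven housekeeping loci used by the S. aureus MLST scheme.
MLST_LOCI = ("arcC", "aroE", "glpF", "gmk", "pta", "tpi", "yqiL")

# Hypothetical lookup table: allele numbers and ST labels are illustrative
# only and are NOT taken from PubMLST or from the database in the abstract.
PROFILE_TO_ST: Dict[Tuple[int, ...], str] = {
    (1, 1, 1, 1, 1, 1, 1): "ST-A",
    (2, 1, 3, 1, 4, 1, 2): "ST-B",
}


def assign_st(alleles: Dict[str, int]) -> Optional[str]:
    """Return the sequence type for an allelic profile, or None if unknown."""
    # Order the allele numbers by the fixed locus order before the lookup.
    profile = tuple(alleles[locus] for locus in MLST_LOCI)
    return PROFILE_TO_ST.get(profile)


example = {"arcC": 2, "aroE": 1, "glpF": 3, "gmk": 1, "pta": 4, "tpi": 1, "yqiL": 2}
print(assign_st(example))
```

A profile that matches no row returns None, which corresponds to a novel or incomplete profile that would need manual review.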

Conclusion: This database is freely available under CC BY-NC-SA at https://doi.org/10.5281/zenodo.14833440. The data provided enables researchers to identify optimal reference isolates for various genomic studies, supporting critical investigations into S. aureus epidemiology and antimicrobial resistance evolution. This resource will ultimately inform the development of more effective prevention and control measures against this high-priority pathogen.


Journal article. BMC genomic data 26(1):65, published 2025-09-26. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12465433/pdf/

Citations: 0

Multi-omics mediation pipeline reveals differential pathways of maternal SNPs affecting newborn adiposity outcomes.

IF 2.5 Q3 GENETICS & HEREDITY

BMC genomic data

Pub Date : 2025-09-26 DOI: 10.1186/s12863-025-01355-w

Nathan P Gill, Alan Kuang, Denise M Scholtens

Background: A great deal of previous research describes the impact of the maternal metabolic and genetic milieu on newborn adiposity outcomes. However, much of this research does not focus on all aspects of the problem simultaneously. Studies focusing on metabolic factors may not distinguish between maternal and fetal genetic pathways, while studies that do focus on these different genetic pathways may not incorporate metabolic information into effect estimates or variant classifications. In this paper, we introduce a novel multi-omics pipeline for maternal genetic variant selection and mediation effect testing that can handle all these pathways, and use it to investigate broad patterns in the effects of maternal genetic variants on newborn adiposity outcomes.

Results: A Bayesian network model is used to incorporate both metabolomic and genomic data into an initial filter for maternal variants likely to affect newborn adiposity outcomes through a direct maternal genetic effect, an indirect fetal genetic effect, a maternal metabolic effect, or some combination of these pathways. A mediation model is then fit to these candidate variants and associated outcomes to identify which of these pathways, if any, mediate the total effect. We then group maternal genetic variants according to the relative magnitudes of these three effect pathways. In an application to existing mother-newborn data from the HAPO study, we find that of 78 candidate variants, the majority influence newborn birthweight solely through either a direct maternal or indirect fetal genetic effect (37% and 40%, respectively), a smaller number through both of these (14%), relatively few exclusively through the maternal metabolic pathway (6%), and almost none through a combination of the maternal metabolic pathway with either of the two genetic pathways (3%). We also find that these overall patterns of mediation effects are similar across outcomes.
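The reported percentages can be turned back into approximate variant counts. The sketch below does that arithmetic; the counts are back-calculated by rounding and are an approximation, not figures stated in the abstract, and the category labels are shorthand chosen here.

```python
# Number of candidate variants and pathway shares as reported in the abstract.
N_CANDIDATES = 78
PATHWAY_SHARE_PCT = {
    "direct maternal genetic only": 37,
    "indirect fetal genetic only": 40,
    "both genetic pathways": 14,
    "maternal metabolic only": 6,
    "metabolic plus a genetic pathway": 3,
}

# Approximate counts obtained by rounding; the abstract gives percentages only.
approx_counts = {
    pathway: round(N_CANDIDATES * pct / 100)
    for pathway, pct in PATHWAY_SHARE_PCT.items()
}

for pathway, count in approx_counts.items():
    print(f"{pathway}: about {count} variants")

print("percentages sum to", sum(PATHWAY_SHARE_PCT.values()))
print("approximate counts sum to", sum(approx_counts.values()))
```

The five shares add up to 100%, and the rounded counts add up to the 78 candidates, so the categories appear to partition the candidate set.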

Conclusions: Our results reveal broad patterns in the effects of maternal genetic variants on newborn adiposity, and identify both new genetic loci and loci known from previous literature to influence newborn adiposity. These results demonstrate the potential for scientific discovery enabled by our multi-omics mediation pipeline, and the approach is broadly applicable for untangling path-specific contributions in the modern integrated multi-omics landscape.


Journal article. BMC genomic data 26(1):66, published 2025-09-26. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12466079/pdf/

Citations: 0

Genome-wide association study meta-analysis uncovers novel genetic variants associated with olfactory dysfunction.

IF 2.5 Q3 GENETICS & HEREDITY

BMC genomic data

Pub Date : 2025-09-17 DOI: 10.1186/s12863-025-01360-z

Mohammed Aslam Imtiaz, Konstantinos Melas, Adrienne Tin, Valentina Talevi, Honglei Chen, Myriam Fornage, Srishti Shrestha, Martin Gögele, David Emmert, Cristian Pattaro, Peter Pramstaller, Franz Förster, Katrin Horn, Thomas H Mosley, Christian Fuchsberger, Markus Scholz, Monique M B Breteler, N Ahmad Aziz

Background: Olfactory dysfunction is among the earliest signs of many age-related neurodegenerative diseases and has been associated with increased mortality in older adults; however, its genetic basis remains largely unknown. Therefore, here we aimed to elucidate its genetic architecture through a genome-wide association study meta-analysis (GWMA).

Methods: This GWMA included participants of European ancestry (N = 22,730) enrolled in four different large population-based studies, followed by a multi-ancestry GWMA that also included participants of African ancestry (N = 1,030). Olfactory dysfunction was assessed using a 12-item smell identification test.

Results: GWMA revealed a novel genome-wide significant locus (tagged by single nucleotide polymorphism rs11228623 at the 11q12 locus) associated with olfactory dysfunction. Gene-based analysis revealed a high enrichment for olfactory receptor genes in this region. Phenome-wide association studies demonstrated associations between genetic variants related to olfactory dysfunction and blood cell counts, kidney function, skeletal muscle mass, cholesterol levels and cardiovascular disease. Using individual-level data, we also confirmed and quantified the strength of these associations on a phenotypic level. Moreover, employing two-sample Mendelian Randomization analyses, we found evidence for causal associations between olfactory dysfunction and these phenotypes.
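The abstract does not say which meta-analysis model was used. As a generic illustration of how per-study GWAS estimates for one variant can be pooled, the sketch below implements a fixed-effect, inverse-variance-weighted meta-analysis. The function name ivw_meta and the per-study effect sizes are hypothetical; only the sample sizes come from the abstract.

```python
import math
from typing import Sequence, Tuple

# Sample sizes reported in the abstract (European and African ancestry).
TOTAL_N = 22_730 + 1_030


def ivw_meta(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float, float]:
    """Fixed-effect inverse-variance-weighted pooling of per-study estimates.

    Returns the pooled effect, its standard error, and the z-score.
    """
    if len(betas) != len(ses) or len(betas) == 0:
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical per-study effects for a single variant (not from the paper).
beta, se, z = ivw_meta([0.10, 0.20], [0.1, 0.1])
print(f"pooled beta={beta:.3f}, se={se:.4f}, z={z:.2f}, total N={TOTAL_N}")
```

With equal standard errors the pooled estimate is simply the mean of the study estimates, and the pooled standard error shrinks by a factor of the square root of the number of studies.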

Conclusions: Our findings provide novel insights into the genetic architecture of the sense of smell and highlight its importance for many aspects of human health. Moreover, these findings could facilitate the identification and monitoring of individuals at increased risk of olfactory dysfunction and associated diseases.


Journal article. BMC genomic data 26(1):64, published 2025-09-17. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12445039/pdf/

Citations: 0

Draft genome of the Cuban Painted Landsnail Polymita picta, International Mollusc of the year 2022.

IF 2.5 Q3 GENETICS & HEREDITY

BMC genomic data

Pub Date : 2025-09-03 DOI: 10.1186/s12863-025-01356-9

Bernardo Reyes-Tur, Zeyuan Chen, Mario Juan Gordillo-Pérez, Alexander Ben Hamadou, Charlotte Gerheim, Carola Greve, Julia D Sigwart

Objective: The Cuban Painted Landsnail is an iconic endemic tree snail species with distinctive colourful shells used in traditional handicrafts. This species won the International Mollusc of the Year 2022 competition in an open public vote. As the competition prize, we have assembled the draft genome of this species.

Data description: Genomic DNA from Polymita picta (Born, 1778) was sequenced using PacBio HiFi sequencing with a yield of 5.3 million reads (41.4 Gb) and an N50 of 8.1 Kb. The genome size of P. picta was estimated to be 2.9 Gb, and the final assembly was 1.85 Gb, with a total of 22,619 contigs and a contig N50 of 124.2 Kb. BUSCO analysis of the genome assembly indicated a genome completeness of 88.4%, with 7% complete duplicated BUSCOs in metazoa_odb10. The draft genome will be a valuable resource for work on the endangered Cuban Painted Landsnail including monitoring genetic diversity and establishing captive breeding for conservation.
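For readers less familiar with assembly statistics, the sketch below shows how an N50 is computed and derives a few quantities from the figures in the data description. The derived values (mean read length, sequencing depth, assembled fraction) are simple arithmetic on the reported numbers rather than results stated by the authors, and the toy contig lengths are made up.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Length L such that sequences of length >= L hold at least half the total."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty positive lengths")


# Toy contig lengths, for illustration only.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))

# Figures reported in the data description.
READS = 5.3e6
YIELD_BP = 41.4e9
ESTIMATED_GENOME_BP = 2.9e9
ASSEMBLY_BP = 1.85e9

mean_read_length = YIELD_BP / READS          # roughly 7.8 kb
depth = YIELD_BP / ESTIMATED_GENOME_BP       # roughly 14x
assembled_fraction = ASSEMBLY_BP / ESTIMATED_GENOME_BP

print(f"mean read length ~{mean_read_length:.0f} bp")
print(f"depth ~{depth:.1f}x, assembled fraction ~{assembled_fraction:.0%}")
```

The assembly spans about 64% of the estimated genome size, which fits its description as a draft genome.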


Journal article. BMC genomic data 26(1):63, published 2025-09-03. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12409939/pdf/

Citations: 0

High-quality genome assembly and annotation of live animal vaccine bacteria strains in South Korea.

IF 2.5 Q3 GENETICS & HEREDITY

BMC genomic data

Pub Date : 2025-09-02 DOI: 10.1186/s12863-025-01357-8

Yeonkyeong Lee, Jin-Ju Nah, Hyun-Ok Ku, Il Jang

Citations: 0

Complete genome sequence of the probiotic candidate strain Lacticaseibacillus rhamnosus B3421 isolated from Panax ginseng C. A. Meyer in South Korea.

IF 2.5 Q3 GENETICS & HEREDITY

BMC genomic data

Pub Date : 2025-08-28 DOI: 10.1186/s12863-025-01344-z

Gwi-Deuk Jin, Ho-Youn Kim, Eun Bae Kim, Bokyung Lee

Objectives: Lacticaseibacillus rhamnosus is a widely recognized probiotic bacterium with therapeutic applications in human and animal health. The L. rhamnosus B3421 strain, isolated from Panax ginseng, has been reported to be associated with antioxidant and anti-inflammatory properties, supporting its functional potential. We sequenced and analyzed the genome of L. rhamnosus B3421 to evaluate its probiotic potential for human healthcare and animal applications, focusing on genomic features related to safety and functionality.

Data description: In this study, we isolated L. rhamnosus B3421 from Panax ginseng C. A. Meyer (ginseng) and performed whole-genome sequencing. The genome of L. rhamnosus B3421 consists of 3,000,051 base pairs (bp) with a guanine + cytosine (G + C) content of 46.70%. It encodes 59 transfer RNAs, 15 ribosomal RNAs, and 2,807 coding sequences (CDSs). Of these CDSs, 99.13% (2,758 proteins) were assigned to functional categories in the Clusters of Orthologous Groups (COG) classification system, while 49 proteins remained uncharacterized. Our genome analysis identified no antibiotic resistance (ABR) or antimicrobial resistance (AMR) genes, indicating that L. rhamnosus B3421 is a safe probiotic bacterium with minimal risk of contributing to the horizontal transfer of antibiotic resistance within the gut microbiome. Additionally, the genome contains genes associated with the ggmotif (PF10439), Enterocin X chain beta, and Carnocin CP52, as identified through BAGEL4 analysis, along with 24 other genes related to reductase or peroxidase activities. These genes may confer competitive advantages against pathogenic bacteria and oxidative stress. Our findings highlight the probiotic potential of L. rhamnosus B3421 and its prospective applications in promoting human and animal health.
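The G + C content quoted above is the share of guanine and cytosine among the called bases. A minimal sketch of that calculation follows; the function name gc_content and the short test sequence are hypothetical, and the expected number of G + C bases is back-calculated from the reported genome length and percentage.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T")
    # Ambiguous symbols such as N are excluded from the denominator.
    return 100.0 * (seq.count("G") + seq.count("C")) / called


# Toy sequence, for illustration only.
print(gc_content("ggccNNaatt"))

# Figures reported for L. rhamnosus B3421.
GENOME_LENGTH_BP = 3_000_051
GC_PERCENT = 46.70

# Back-calculated estimate; the abstract reports only the percentage.
expected_gc_bases = round(GENOME_LENGTH_BP * GC_PERCENT / 100)
print(f"about {expected_gc_bases:,} G + C bases")
```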


Journal article. BMC genomic data 26(1):61, published 2025-08-28. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12395871/pdf/

Citations: 0

Dataset of 16S rRNA and ITS gene amplicon sequencing of celery and parsley rhizosphere soils.

IF 2.5 Q3 GENETICS & HEREDITY

BMC genomic data

Pub Date : 2025-08-25 DOI: 10.1186/s12863-025-01351-0

Olubukola Oluranti Babalola, Florence Oluwayemisi Ogundeji, Akinlolu Olalekan Akanmu

Objectives: This amplicon metagenomic study examines the relative abundance, taxonomic profiles and community structure of bacterial and fungal communities associated with the roots of parsley (Petroselinum crispum) and celery (Apium graveolens) under monocropping and intercropping systems. The study aims to provide a baseline understanding of how intercropping influences rhizosphere microbial dynamics.

Data description: The dataset provides insight into the effects of a parsley-celery intercropping system on soil microbial richness, diversity and community structure. Amplicon metagenomic sequencing was performed on the DNA samples, targeting the 16S rRNA gene (V3-V4 region) and the ITS region for bacterial and fungal communities, respectively. The quantified libraries were pooled and sequenced using Illumina platforms, and the raw sequences were analyzed using the Quantitative Insights Into Microbial Ecology (QIIME 2, version 2019.1) pipeline. The resulting Amplicon Sequence Variant (ASV) profiles revealed Actinobacteria and Proteobacteria as the most predominant bacterial phyla, followed by Bacteroidota, Gemmatimonadota and Acidobacteriota. The most predominant fungal phyla were Ascomycota and Mortierellomycota. The dataset includes raw sequence reads in FASTQ format (.fastq.gz), which have been deposited in the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (NCBI) under the BioProject accession numbers SRP540554 (16S rRNA) and SRP540675 (ITS).
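Relative abundance and diversity, the quantities this dataset is meant to support, can be computed from a table of ASV counts per taxon. The sketch below uses plain Python rather than QIIME 2; the counts are made up for illustration (only the phylum names come from the text), and the function names are hypothetical.

```python
import math
from typing import Dict


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw read counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts: Dict[str, int]) -> float:
    """Shannon diversity H' = -sum(p * ln p) over taxa with non-zero counts."""
    proportions = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in proportions if p > 0)


# Hypothetical phylum-level read counts for one sample (not from the dataset).
sample = {"Actinobacteria": 50, "Proteobacteria": 30, "Bacteroidota": 20}

for phylum, share in relative_abundance(sample).items():
    print(f"{phylum}: {share:.1%}")
print(f"Shannon index: {shannon_index(sample):.3f}")
```

Comparing such indices between monocropped and intercropped samples is one simple way to summarize the richness and diversity effects the data description refers to.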



Citations: 0

Oar_Finn: the genome assembly and annotation of an exceptionally fertile Finnsheep (Ovis aries) Ewe.

IF 2.5 Q3 GENETICS & HEREDITY

BMC genomic data

Pub Date : 2025-08-19 DOI: 10.1186/s12863-025-01354-x

Juha Kantanen, Melak Weldenegodguad, Kisun Pokharel

Objectives: Finnsheep, a highly prolific sheep breed, has been exported worldwide to improve fertility traits in many other breeds. Published genomic studies of Finnsheep have relied on the Texel and Rambouillet reference genomes, which may not capture the breed's unique genetic features. Our main objective was to generate a high-quality, annotated Finnsheep genome assembly that could serve as a breed-specific reference for studying fertility and adaptation in Finnsheep and other short-tailed northern European sheep breeds.

Data description: We generated a 2.53 Gb assembly from a highly fertile Finnsheep ewe using PacBio HiFi long reads and Hi-C sequencing. The assembly, scaffolded with Hi-C data, has a contig N50 of 35.5 Mb and a scaffold N50 of 100.6 Mb. Gene annotation identified 42,533 genes spanning 46.5 Mb of coding region. BUSCO completeness was 94.9% for the assembly and 84.3% for the annotation. These data, including raw reads, assembly, and annotations, support genomic studies of Finnsheep and other prolific sheep breeds that are particularly adapted to northern European environments.
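For readers unfamiliar with the N50 statistic quoted above, here is a minimal Python sketch of the standard definition: the length L such that sequences of length L or longer together cover at least half of the total assembly length. The function name and the toy lengths are illustrative only and are not taken from the Oar_Finn assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")

    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths: total 300, so half is 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

The same function applies to contigs and to scaffolds. Scaffolding joins contigs into longer sequences, which is why a scaffold N50 is normally at least as large as the contig N50 of the same assembly.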



Citations: 0

Complete genome sequence of the halophilic archaeon Haloferax volcanii PC0224, isolated from a solar saltern in Thailand.

IF 2.5 Q3 GENETICS & HEREDITY

BMC genomic data

Pub Date : 2025-08-18 DOI: 10.1186/s12863-025-01353-y

Manassanan Phatcharaharikarn, Pattarawan Ruangsuj, Thunwarat Songngamsuk, Parweenuch Santaweesuk, Marut Tangwattanachuleeporn, Kanokporn Srisucharitpanit, Prapimpun Wongchitrat, Montri Yasawong

Objectives: Haloferax volcanii is an extreme halophile of the family Haloferacaceae that thrives in hypersaline environments. This study presents the complete genome sequence of H. volcanii strain PC0224, isolated from a Thai solar saltern. The genomic data will enhance our understanding of circadian rhythm mechanisms and their evolutionary significance in extremophilic archaea.

Data description: The H. volcanii PC0224 genome comprises four circular sequences totaling 3,773,977 bp, with a GC content of 66.16%. The genome was sequenced using Illumina and PacBio technologies and assembled with Hybracter; it contains 3,731 CDS, 6 rRNAs, 54 tRNAs, 2 ncRNAs, and 4 CRISPR arrays. This dataset enables the investigation of circadian rhythm regulatory systems and comparative genomic studies of temporal adaptation mechanisms in extremophilic archaea.
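A genome-wide GC content for a multi-replicon genome, such as the figure reported above, is usually computed over all replicons pooled rather than as a mean of per-replicon values. The minimal Python sketch below shows that calculation on toy sequences. The function name and the input strings are hypothetical, and ambiguous bases (for example N) are excluded from the denominator as an assumed convention.

```python
def pooled_gc_percent(sequences):
    """Return GC content (%) pooled over all sequences, counting only A, C, G, T."""
    gc = 0
    total = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        total += sum(seq.count(base) for base in "ACGT")

    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100 * gc / total


# Hypothetical toy replicons, not the PC0224 sequences.
toy_replicons = ["GGCC", "ATGC", "AATT"]
toy_gc = pooled_gc_percent(toy_replicons)
print(f"{toy_gc:.2f}%")
```

Pooling weights each replicon by its length, so a small plasmid with unusual base composition shifts the genome-wide value only slightly.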


Citations: 0

Development of a composite core collection from 5,856 Sesame accessions being conserved in the Indian National Genebank.

IF 2.5 Q3 GENETICS & HEREDITY

BMC genomic data

Pub Date : 2025-08-18 DOI: 10.1186/s12863-025-01347-w

Pradeep Ruperao, Kapil Tiwari, Vandana Rai, Rashmi Yadav, Mahalingam Angamuthu, Anuj Kumar Singh, Bhemji P Galvadiya, Anshuman Shah, Nitin Gadol, Ajay Kumar, Rajkumar Subramani, Harinder Vishwakarma, Pradheep Kanakasabapathi, Senthilraja Govindasamy, Rasna Maurya, Tamanna Batra, Aravind Jayaraman, Senthil Ramachandran, Abhishek Rathore, Kuldeep Singh, Rakesh Singh, Sanjay Kalia, Ulavappa B Angadi, Sean Mayes, Gyanendra Pratap Singh, Parimalan Rangan

Objectives: A composite core collection (CCC) in sesame (Sesamum indicum L.) will help utilize genetic resources efficiently. Using genomics tools, this study reports a representative minimal set (the CCC) that captures maximal genetic diversity from 5,856 sesame accessions conserved at the National Genebank (NGB) of the ICAR-NBPGR. The CCC will serve as a valuable resource for researchers and breeders, facilitating sesame improvement for traits such as yield, disease resistance, stress resilience, and nutritional content. Ultimately, this work contributes to the broader goal of improving sesame to meet the ever-increasing demand for vegetable oil and to address food security challenges.

Data description: This study presents ddRAD-seq data for 5,856 sesame accessions, including a subset of 2,496 accessions that we reported recently. Next-generation sequencing (NGS) generated over 2.16 terabases of short-read sequence data, with each sample averaging 1.2 million reads. The study identifies 1,768 sesame accessions as the CCC, which captures maximal genotypic and phenotypic diversity. The CCC will aid researchers in trait discovery, association studies, pre-breeding, and parental selection for complex traits, viz. yield, disease resistance, stress resilience, and other economically important traits.
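The figures quoted above can be put in proportion with a little arithmetic. The Python sketch below uses only the numbers stated in the abstract to estimate what share of the collection the CCC represents (roughly 30%) and how many bases each counted read would carry on average (about 300). The variable names are illustrative. Because the abstract says "over 2.16 terabases", the per-read figure is a lower bound, and reading it as one read pair per counted read is an assumption the abstract does not confirm.

```python
# Figures quoted in the data description; nothing here is new data.
TOTAL_ACCESSIONS = 5_856
CORE_ACCESSIONS = 1_768
TOTAL_BASES = 2.16e12          # "over 2.16 terabases", treated as a lower bound
MEAN_READS_PER_SAMPLE = 1.2e6  # reported per-sample average

# Share of the full collection retained in the composite core collection.
core_fraction = CORE_ACCESSIONS / TOTAL_ACCESSIONS

# Implied sequencing yield, assuming every accession is one sequenced sample.
total_reads = TOTAL_ACCESSIONS * MEAN_READS_PER_SAMPLE
bases_per_read = TOTAL_BASES / total_reads
bases_per_sample = TOTAL_BASES / TOTAL_ACCESSIONS

print(f"core fraction: {core_fraction:.1%}")
print(f"approx. bases per counted read: {bases_per_read:.0f}")
print(f"approx. megabases per sample: {bases_per_sample / 1e6:.0f}")
```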



Citations: 0