Pub Date: 2025-09-29 DOI: 10.1186/s12863-025-01366-7
Hongyu Chen, Ye Yang, Bo Wang, Ying Yu, Qingwen Sun
Background: Leaf development represents a crucial stage in the plant life cycle, involving complex morphogenetic and physiological processes governed by evolving molecular mechanisms and metabolite profiles. The growth and maturation of Angiopteris fokiensis Hieron, a species used in traditional Chinese medicine, are characterized by fluctuating metabolite accumulation patterns regulated by largely unknown molecular pathways.
Results: To uncover these pathways, we employed next-generation sequencing to construct the A. fokiensis leaf transcriptome at two distinct developmental stages, allowing for a comprehensive analysis of gene expression dynamics while emphasizing the identification of genes that regulate leaf development and metabolite synthesis. The de novo assembly of high-quality sequencing reads generated 117,627 unigenes averaging 1,308 base pairs in length. FPKM analysis uncovered significant transcriptomic alterations during leaf development. Additionally, non-targeted metabolomics identified 1,494 distinct analytes, with lipids representing the most abundant metabolite class in both A. fokiensis samples. In the 'phenylalanine, tyrosine and tryptophan biosynthesis' pathway, two downregulated arogenate dehydrogenase (NADP+) genes (Unigene23378-S4 and Unigene47537-S2) in Stage1 correlated with reduced L-tyrosine levels. In the 'galactose metabolism' pathway, the upregulation of three beta-galactosidase genes (Unigene43641-S6, Unigene43648-S6, Unigene47074-S1) and the downregulation of one (Unigene28294-S2) corresponded to decreased alpha-lactose levels.
Conclusions: This study provides an in-depth examination of the dynamic transcriptomic and metabolomic changes occurring during A. fokiensis leaf development, revealing key regulatory networks and enhancing the annotation of the A. fokiensis genome. These findings lay a crucial groundwork for future research on this medicinal plant.
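The abstract reports an FPKM analysis without giving the formula. As a reminder of what that unit means, here is a minimal sketch of the standard FPKM definition (fragments per kilobase of transcript per million mapped fragments). The function name and the example numbers are hypothetical and are not taken from the study; only the 1,308 bp length echoes the reported average unigene length.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Standard FPKM: fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical values for illustration only (not data from the paper):
# 500 fragments on a 1,308 bp unigene in a library of 20 million mapped fragments.
example_fpkm = fpkm(500, 1308, 20_000_000)
print(f"{example_fpkm:.2f}")
```

Because FPKM divides by both transcript length and library size, values from the two developmental stages can be compared even when the sequencing depths differ.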
{"title":"Transcriptome characterization and metabolite accumulation: novel insights into metabolite biosynthesis during Angiopteris fokiensis leaf development.","authors":"Hongyu Chen, Ye Yang, Bo Wang, Ying Yu, Qingwen Sun","doi":"10.1186/s12863-025-01366-7","DOIUrl":"10.1186/s12863-025-01366-7","url":null,"abstract":"<p><strong>Background: </strong>Leaf development represents a crucial stage in the plant life cycle, involving complex morphogenetic and physiological processes governed by evolving molecular mechanisms and metabolite profiles. The growth and maturation of Angiopteris fokiensis Hieron, a species used in traditional Chinese medicine, are characterized by fluctuating metabolite accumulation patterns regulated by largely unknown molecular pathways.</p><p><strong>Results: </strong>To uncover these pathways, we employed next-generation sequencing to construct the A. fokiensis leaf transcriptome at two distinct developmental stages, allowing for a comprehensive analysis of gene expression dynamics while emphasizing the identification of genes that regulate leaf development and metabolite synthesis. The de novo assembly of high-quality sequencing reads generated 117,627 unigenes averaging 1,308 base pairs in length. FPKM analysis uncovered significant transcriptomic alterations during leaf development. Additionally, non-targeted metabolomics identified 1,494 distinct analytes, with lipids representing the most abundant metabolite class in both A. fokiensis samples. In the 'phenylalanine, tyrosine and tryptophan biosynthesis' pathway, two downregulated arogenate dehydrogenase (NADP+) genes (Unigene23378-S4 and Unigene47537-S2) in Stage1 correlated with reduced L-tyrosine levels. 
In the 'galactose metabolism' pathway, the upregulation of three beta-galactosidase genes (Unigene43641-S6, Unigene43648-S6, Unigene47074-S1) and the downregulation of one (Unigene28294-S2) corresponded to decreased alpha-lactose levels.</p><p><strong>Conclusions: </strong>This study provides an in-depth examination of the dynamic transcriptomic and metabolomic changes occurring during A. fokiensis leaf development, revealing key regulatory networks and enhancing the annotation of the A. fokiensis genome. These findings lay a crucial groundwork for future research on this medicinal plant.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"26 1","pages":"70"},"PeriodicalIF":2.5,"publicationDate":"2025-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12482733/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145194041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: Sorghum, a diploid C4 cereal (2n = 2x = 20) with a 750 Mbp genome, is widely adaptable to tropical and temperate climates. As its center of origin and diversity, Ethiopia holds valuable genetic variation for improving yield and nutritional traits. This study aimed to identify and functionally characterize quantitative trait nucleotides (QTNs) linked to key agronomic and yield-related traits and their associated candidate genes.
Methods: Two hundred sixteen sorghum genotypes were evaluated over two seasons in northwestern Ethiopia using an alpha lattice design. Agronomic traits assessed included days to flowering, days to maturity, plant height, seed number per plant, seed yield, and thousand-seed weight. Genotyping-by-sequencing (GBS) generated 351,692 SNPs, with 50,165 high-quality markers retained. Candidate gene identification and functional characterization were carried out using a combination of bioinformatics tools and publicly available databases. Data normalization and analysis were conducted using META-R and SAS JMP. Linkage disequilibrium was assessed via TASSEL 5.0, and multi-locus genome-wide association study (ML-GWAS) identified significant QTNs (LOD ≥ 4.0) associated with phenotypic traits.
Result: This study investigates the genetic basis of key agronomic and yield related traits in sorghum by identifying QTNs associated with phenotypic variation. Descriptive statistics revealed notable variability in traits such as days to flowering (101 days), days to maturity (145.77 days), plant height (357.47 cm), seed number per plant (1808.92 count), seed yield (45.07 g), and thousand-seed weight (23.44 g). Correlation analysis showed strong relationships, particularly between days to flowering and maturity (r = 0.7058). ML-GWAS detected 176 QTNs across all 10 chromosomes, with 34 considered reliable due to their consistent identification across multiple models. 117 candidate genes were mapped to these QTNs, associated with six major traits: 20 for flowering time, 16 for maturity, 16 for plant height, 17 for seed number per plant, 38 for seed yield, and 10 for seed weight. Key genes included Sobic.001G196700 (flowering time) and Sobic.005G176100 (stress responses). Two important regulatory genes, SbMADS1 and SbFT, were highlighted for their roles in flowering regulation. SbMADS1 influences days to flowering, while SbFT acts as a mobile signal integrating photoperiod cues. These genes are involved in starch and sucrose metabolism pathways, essential for energy storage and mobilization, thereby supporting improved growth and yield in sorghum.
Conclusion: This study highlights the complexity of trait inheritance shaped by diverse genetic factors and underscores the significance of major, stable, and unique QTNs for marker-assisted selection. Functional genome annotation revealed that candidate genes are involved in key biological processes and
{"title":"Genome-wide identification of QTNs and candidate genes in Ethiopian sorghum (Sorghum bicolor (L.) moench) landraces using SNP-based approaches.","authors":"Addisu Getahun, Habte Nida, Adugna Abdi Woldesemayat","doi":"10.1186/s12863-025-01350-1","DOIUrl":"10.1186/s12863-025-01350-1","url":null,"abstract":"<p><strong>Background: </strong>Sorghum, a diploid C4 cereal (2n = 2x = 20) with a 750 Mbp genome, is widely adaptable to tropical and temperate climates. As its center of origin and diversity, Ethiopia holds valuable genetic variation for improving yield and nutritional traits. This study aimed to identify and functionally characterize quantitative trait nucleotides (QTNs) linked to key agronomic and yield-related traits and their associated candidate genes.</p><p><strong>Methods: </strong>Two hundred sixteen sorghum genotypes were evaluated over two seasons in northwestern Ethiopia using an alpha lattice design. Agronomic traits assessed included days to flowering, days to maturity, plant height, seed number per plant, seed yield, and thousand-seed weight. Genotyping-by-sequencing (GBS) generated 351,692 SNPs, with 50,165 high-quality markers retained. Candidate gene identification and functional characterization were carried out using a combination of bioinformatics tools and publicly available databases. Data normalization and analysis were conducted using META-R and SAS JMP. Linkage disequilibrium was assessed via TASSEL 5.0, and multi-locus genome-wide association study (ML-GWAS) identified significant QTNs (LOD ≥ 4.0) associated with phenotypic traits.</p><p><strong>Result: </strong>This study investigates the genetic basis of key agronomic and yield related traits in sorghum by identifying QTNs associated with phenotypic variation. 
Descriptive statistics revealed notable variability in traits such as days to flowering (101 days), days to maturity (145.77 days), plant height (357.47 cm), seed number per plant (1808.92 count), seed yield (45.07 g), and thousand-seed weight (23.44 g). Correlation analysis showed strong relationships, particularly between days to flowering and maturity (r = 0.7058). ML-GWAS detected 176 QTNs across all 10 chromosomes, with 34 considered reliable due to their consistent identification across multiple models. 117 candidate genes were mapped to these QTNs, associated with six major traits: 20 for flowering time, 16 for maturity, 16 for plant height, 17 for seed number per plant, 38 for seed yield, and 10 for seed weight. Key genes included Sobic.001G196700 (flowering time) and Sobic.005G176100 (stress responses). Two important regulatory genes, SbMADS1 and SbFT, were highlighted for their roles in flowering regulation. SbMADS1 influences days to flowering, while SbFT acts as a mobile signal integrating photoperiod cues. These genes are involved in starch and sucrose metabolism pathways, essential for energy storage and mobilization, thereby supporting improved growth and yield in sorghum.</p><p><strong>Conclusion: </strong>This study highlights the complexity of trait inheritance shaped by diverse genetic factors and underscores the significance of major, stable, and unique QTNs for marker-assisted selection. 
Functional genome annotation revealed that candidate genes are involved in key biological processes and ","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"26 1","pages":"67"},"PeriodicalIF":2.5,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12465425/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145180591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-09-26 DOI: 10.1186/s12863-025-01363-w
Andrey Shelenkov, Anna Slavokhotova, Mariyam Yunusova, Vladimir Kulikov, Yulia Mikhaylova, Vasiliy Akimkin
Background: Bacterial infections pose a global health threat across clinical and community settings. Over the past decade, the alarming expansion of antimicrobial resistance (AMR) has progressively narrowed therapeutic options, particularly for healthcare-associated infections. This critical situation has been formally recognized by the World Health Organization as a major public health concern. Epidemiological studies have demonstrated that the dissemination of AMR is frequently mediated by specific high-risk bacterial lineages, often designated as "global clones" or "clonal complexes." Consequently, surveillance of these epidemic clones and elucidation of their pathogenic mechanisms and AMR acquisition pathways have become essential research priorities. The advent of whole genome sequencing has revolutionized these investigations, enabling comprehensive epidemiological tracking and detailed analysis of mobile genetic elements responsible for resistance gene transfer. However, despite the exponential increase in available bacterial genome sequences, significant challenges persist. Current genomic datasets often suffer from uneven representation of clinically relevant strains and inconsistent availability of accompanying metadata. These limitations create substantial obstacles for large-scale comparative studies and hinder effective surveillance efforts.
Description: This database represents a comprehensive genomic analysis of 98,950 Staphylococcus aureus isolates, a high-priority bacterial pathogen of global clinical significance. We provide detailed isolate characterization through several established typing schemes including multilocus sequence typing (MLST), clonal complex (CC) assignments, spa typing results, and core genome MLST (cgMLST) profiles. The dataset also documents the presence of CRISPR-Cas systems in these isolates. Beyond fundamental typing data, our resource incorporates the distribution of antimicrobial resistance determinants, virulence factors, and plasmid replicons. These systematically curated genomic features offer researchers valuable insights into isolate epidemiology, resistance mechanisms, and horizontal gene transfer patterns in this highly concerning pathogen.
Conclusion: This database is freely available under CC BY-NC-SA at https://doi.org/10.5281/zenodo.14833440 . The data provided enables researchers to identify optimal reference isolates for various genomic studies, supporting critical investigations into S. aureus epidemiology and antimicrobial resistance evolution. This resource will ultimately inform the development of more effective prevention and control measures against this high-priority pathogen.
{"title":"Genomic typing, antimicrobial resistance gene, virulence factor and plasmid replicon database for the important pathogenic bacteria Staphylococcus aureus.","authors":"Andrey Shelenkov, Anna Slavokhotova, Mariyam Yunusova, Vladimir Kulikov, Yulia Mikhaylova, Vasiliy Akimkin","doi":"10.1186/s12863-025-01363-w","DOIUrl":"10.1186/s12863-025-01363-w","url":null,"abstract":"<p><strong>Background: </strong>Bacterial infections pose a global health threat across clinical and community settings. Over the past decade, the alarming expansion of antimicrobial resistance (AMR) has progressively narrowed therapeutic options, particularly for healthcare-associated infections. This critical situation has been formally recognized by the World Health Organization as a major public health concern. Epidemiological studies have demonstrated that the dissemination of AMR is frequently mediated by specific high-risk bacterial lineages, often designated as \"global clones\" or \"clonal complexes.\" Consequently, surveillance of these epidemic clones and elucidation of their pathogenic mechanisms and AMR acquisition pathways have become essential research priorities. The advent of whole genome sequencing has revolutionized these investigations, enabling comprehensive epidemiological tracking and detailed analysis of mobile genetic elements responsible for resistance gene transfer. However, despite the exponential increase in available bacterial genome sequences, significant challenges persist. Current genomic datasets often suffer from uneven representation of clinically relevant strains and inconsistent availability of accompanying metadata. These limitations create substantial obstacles for large-scale comparative studies and hinder effective surveillance efforts.</p><p><strong>Description: </strong>This database represents a comprehensive genomic analysis of 98,950 Staphylococcus aureus isolates, a high-priority bacterial pathogen of global clinical significance. 
We provide detailed isolate characterization through several established typing schemes including multilocus sequence typing (MLST), clonal complex (CC) assignments, spa typing results, and core genome MLST (cgMLST) profiles. The dataset also documents the presence of CRISPR-Cas systems in these isolates. Beyond fundamental typing data, our resource incorporates the distribution of antimicrobial resistance determinants, virulence factors, and plasmid replicons. These systematically curated genomic features offer researchers valuable insights into isolate epidemiology, resistance mechanisms, and horizontal gene transfer patterns in this highly concerning pathogen.</p><p><strong>Conclusion: </strong>This database is freely available under CC BY-NC-SA at https://doi.org/10.5281/zenodo.14833440 . The data provided enables researchers to identify optimal reference isolates for various genomic studies, supporting critical investigations into S. aureus epidemiology and antimicrobial resistance evolution. This resource will ultimately inform the development of more effective prevention and control measures against this high-priority pathogen.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"26 1","pages":"65"},"PeriodicalIF":2.5,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12465433/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145180607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-09-26 DOI: 10.1186/s12863-025-01355-w
Nathan P Gill, Alan Kuang, Denise M Scholtens
Background: A great deal of previous research describes the impact of the maternal metabolic and genetic milieu on newborn adiposity outcomes. However, much of this research does not focus on all aspects of the problem simultaneously. Studies focusing on metabolic factors may not distinguish between maternal and fetal genetic pathways, while studies that do focus on these different genetic pathways may not incorporate metabolic information into effect estimates or variant classifications. In this paper, we introduce a novel multi-omics pipeline for maternal genetic variant selection and mediation effect testing that can handle all these pathways, and use it to investigate broad patterns in the effects of maternal genetic variants on newborn adiposity outcomes.
Results: A Bayesian network model is used to incorporate both metabolomic and genomic data into an initial filter for maternal variants likely to affect newborn adiposity outcomes through a direct maternal genetic effect, an indirect fetal genetic effect, a maternal metabolic effect, or some combination of these pathways. A mediation model is then fit to these candidate variants and associated outcomes to identify which of these pathways, if any, mediate the total effect. We then group maternal genetic variants according to the relative magnitudes of these three effect pathways. In an application to existing mother-newborn data from the HAPO study, we find that of 78 candidate variants, the majority influence newborn birthweight solely through either a direct maternal or indirect fetal genetic effect (37% and 40%, respectively), a smaller number through both of these (14%), relatively few exclusively through the maternal metabolic pathway (6%), and almost none through a combination of the maternal metabolic pathway with either of the two genetic pathways (3%). We also find that these overall patterns of mediation effects are similar across outcomes.
Conclusions: Our results reveal broad patterns in the effects of maternal genetic variants on newborn adiposity, and identify both new genetic loci and loci known from previous literature to influence newborn adiposity. These results demonstrate the potential for scientific discovery enabled by our multi-omics mediation pipeline, and the approach is broadly applicable for untangling path-specific contributions in the modern integrated multi-omics landscape.
{"title":"Multi-omics mediation pipeline reveals differential pathways of maternal SNPs affecting newborn adiposity outcomes.","authors":"Nathan P Gill, Alan Kuang, Denise M Scholtens","doi":"10.1186/s12863-025-01355-w","DOIUrl":"10.1186/s12863-025-01355-w","url":null,"abstract":"<p><strong>Background: </strong>A great deal of previous research describes the impact of the maternal metabolic and genetic milieu on newborn adiposity outcomes. However, much of this research does not focus on all aspects of the problem simultaneously. Studies focusing on metabolic factors may not distinguish between maternal and fetal genetic pathways, while studies that do focus on these different genetic pathways may not incorporate metabolic information into effect estimates or variant classifications. In this paper, we introduce a novel multi-omics pipeline for maternal genetic variant selection and mediation effect testing that can handle all these pathways, and use it to investigate broad patterns in the effects of maternal genetic variants on newborn adiposity outcomes.</p><p><strong>Results: </strong>A Bayesian network model is used to incorporate both metabolomic and genomic data into an initial filter for maternal variants likely to affect newborn adiposity outcomes through a direct maternal genetic effect, an indirect fetal genetic effect, a maternal metabolic effect, or some combination of these pathways. A mediation model is then fit to these candidate variants and associated outcomes to identify which of these pathways, if any, mediate the total effect. We then group maternal genetic variants according to the relative magnitudes of these three effect pathways. 
In an application to existing mother-newborn data from the HAPO study, we find that of 78 candidate variants, the majority influence newborn birthweight solely through either a direct maternal or indirect fetal genetic effect (37% and 40%, respectively), a smaller number through both of these (14%), relatively few exclusively through the maternal metabolic pathway (6%), and almost none through a combination of the maternal metabolic pathway with either of the two genetic pathways (3%). We also find that these overall patterns of mediation effects are similar across outcomes.</p><p><strong>Conclusions: </strong>Our results reveal broad patterns in the effects of maternal genetic variants on newborn adiposity, and identify both new genetic loci and loci known from previous literature to influence newborn adiposity. These results demonstrate the potential for scientific discovery enabled by our multi-omics mediation pipeline, and the approach is broadly applicable for untangling path-specific contributions in the modern integrated multi-omics landscape.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"26 1","pages":"66"},"PeriodicalIF":2.5,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12466079/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145180589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-09-17 DOI: 10.1186/s12863-025-01360-z
Mohammed Aslam Imtiaz, Konstantinos Melas, Adrienne Tin, Valentina Talevi, Honglei Chen, Myriam Fornage, Srishti Shrestha, Martin Gögele, David Emmert, Cristian Pattaro, Peter Pramstaller, Franz Förster, Katrin Horn, Thomas H Mosley, Christian Fuchsberger, Markus Scholz, Monique M B Breteler, N Ahmad Aziz
Background: Olfactory dysfunction is among the earliest signs of many age-related neurodegenerative diseases and has been associated with increased mortality in older adults; however, its genetic basis remains largely unknown. Therefore, here we aimed to elucidate its genetic architecture through a genome-wide association study meta-analysis (GWMA).
Methods: This GWMA included participants of European ancestry (N = 22,730) enrolled in four large population-based studies, followed by a multi-ancestry GWMA that added participants of African ancestry (N = 1,030). Olfactory dysfunction was assessed using a 12-item smell identification test.
Results: GWMA revealed a novel genome-wide significant locus (tagged by single nucleotide polymorphism rs11228623 at the 11q12 locus) associated with olfactory dysfunction. Gene-based analysis revealed a high enrichment for olfactory receptor genes in this region. Phenome-wide association studies demonstrated associations between genetic variants related to olfactory dysfunction and blood cell counts, kidney function, skeletal muscle mass, cholesterol levels and cardiovascular disease. Using individual-level data, we also confirmed and quantified the strength of these associations on a phenotypic level. Moreover, employing two-sample Mendelian Randomization analyses, we found evidence for causal associations between olfactory dysfunction and these phenotypes.
Conclusions: Our findings provide novel insights into the genetic architecture of the sense of smell and highlight its importance for many aspects of human health. Moreover, these findings could facilitate the identification and monitoring of individuals at increased risk of olfactory dysfunction and associated diseases.
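The abstract mentions two-sample Mendelian Randomization without naming an estimator. One widely used choice is the fixed-effect inverse-variance weighted (IVW) estimate, sketched below purely as an illustration. Assuming IVW here is our own choice and is not confirmed by the source; the function name and the toy summary statistics are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-variant summary statistics.

    beta_exposure: variant effects on the exposure
    beta_outcome:  variant effects on the outcome
    se_outcome:    standard errors of the outcome effects
    """
    if not beta_exposure or not (
        len(beta_exposure) == len(beta_outcome) == len(se_outcome)
    ):
        raise ValueError("inputs must be non-empty and of equal length")
    weights = [1.0 / (se * se) for se in se_outcome]
    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    if denominator == 0:
        raise ValueError("all exposure effects are zero")
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


# Toy summary statistics (hypothetical, not from the study):
# each variant's outcome effect is exactly half its exposure effect.
estimate, standard_error = ivw_estimate([0.1, 0.2], [0.05, 0.1], [0.01, 0.01])
print(estimate, standard_error)
```

The estimate is a weighted regression of outcome effects on exposure effects through the origin, so variants with more precisely measured outcome effects contribute more.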
{"title":"Genome-wide association study meta-analysis uncovers novel genetic variants associated with olfactory dysfunction.","authors":"Mohammed Aslam Imtiaz, Konstantinos Melas, Adrienne Tin, Valentina Talevi, Honglei Chen, Myriam Fornage, Srishti Shrestha, Martin Gögele, David Emmert, Cristian Pattaro, Peter Pramstaller, Franz Förster, Katrin Horn, Thomas H Mosley, Christian Fuchsberger, Markus Scholz, Monique M B Breteler, N Ahmad Aziz","doi":"10.1186/s12863-025-01360-z","DOIUrl":"10.1186/s12863-025-01360-z","url":null,"abstract":"<p><strong>Background: </strong>Olfactory dysfunction is among the earliest signs of many age-related neurodegenerative diseases and has been associated with increased mortality in older adults; however, its genetic basis remains largely unknown. Therefore, here we aimed to elucidate its genetic architecture through a genome-wide association study meta-analysis (GWMA).</p><p><strong>Methods: </strong>This GWMA included the participants of European ancestry (N = 22,730) enrolled in four different large population-based studies followed by a multi-ancestry GWMA including participants of African ancestry (N = 1,030). Olfactory dysfunction was assessed using a 12-item smell identification test.</p><p><strong>Results: </strong>GWMA revealed a novel genome-wide significant locus (tagged by single nucleotide polymorphism rs11228623 at the 11q12 locus) associated with olfactory dysfunction. Gene-based analysis revealed a high enrichment for olfactory receptor genes in this region. Phenome-wide association studies demonstrated associations between genetic variants related to olfactory dysfunction and blood cell counts, kidney function, skeletal muscle mass, cholesterol levels and cardiovascular disease. Using individual-level data, we also confirmed and quantified the strength of these associations on a phenotypic level. 
Moreover, employing two-sample Mendelian Randomization analyses, we found evidence for causal associations between olfactory dysfunction and these phenotypes.</p><p><strong>Conclusions: </strong>Our findings provide novel insights into the genetic architecture of the sense of smell and highlight its importance for many aspects of human health. Moreover, these findings could facilitate the identification and monitoring of individuals at increased risk of olfactory dysfunction and associated diseases.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"26 1","pages":"64"},"PeriodicalIF":2.5,"publicationDate":"2025-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12445039/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145082371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-09-03 DOI: 10.1186/s12863-025-01356-9
Bernardo Reyes-Tur, Zeyuan Chen, Mario Juan Gordillo-Pérez, Alexander Ben Hamadou, Charlotte Gerheim, Carola Greve, Julia D Sigwart
Objective: The Cuban Painted Landsnail is an iconic endemic tree snail species with distinctive colourful shells used in traditional handicrafts. This species won the International Mollusc of the Year 2022 competition in an open public vote. As the competition prize, we have assembled the draft genome of this species.
Data description: Genomic DNA from Polymita picta (Born, 1778) was sequenced using PacBio HiFi sequencing with a yield of 5.3 million reads (41.4 Gb) and an N50 of 8.1 Kb. The genome size of P. picta was estimated to be 2.9 Gb, and the final assembly was 1.85 Gb, with a total of 22,619 contigs and a contig N50 of 124.2 Kb. BUSCO analysis of the genome assembly indicated a genome completeness of 88.4%, with 7% complete duplicated BUSCOs in metazoa_odb10. The draft genome will be a valuable resource for work on the endangered Cuban Painted Landsnail including monitoring genetic diversity and establishing captive breeding for conservation.
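The data description quotes a read N50 and a contig N50. For readers unfamiliar with the metric, the sketch below computes N50 under its usual definition: the length L such that sequences of length at least L together cover at least half of the total bases. The function name and toy lengths are hypothetical. The mean read length is derived only from the yield and read count reported above.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical): total 30, so N50 is reached at the second-longest contig.
toy_n50 = n50([2, 4, 6, 8, 10])

# Mean read length implied by the reported yield (41.4 Gb) and read count (5.3 million).
mean_read_length_bp = 41.4e9 / 5.3e6
print(toy_n50, round(mean_read_length_bp))
```

The implied mean read length of roughly 7.8 kb sits close to the reported read N50 of 8.1 Kb, as expected for a fairly tight HiFi read-length distribution.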
{"title":"Draft genome of the Cuban Painted Landsnail Polymita picta, International Mollusc of the year 2022.","authors":"Bernardo Reyes-Tur, Zeyuan Chen, Mario Juan Gordillo-Pérez, Alexander Ben Hamadou, Charlotte Gerheim, Carola Greve, Julia D Sigwart","doi":"10.1186/s12863-025-01356-9","DOIUrl":"10.1186/s12863-025-01356-9","url":null,"abstract":"<p><strong>Objective: </strong>The Cuban Painted Landsnail is an iconic endemic tree snail species with distinctive colourful shells used in traditional handicrafts. This species won the International Mollusc of the Year 2022 competition in an open public vote. As the competition prize, we have assembled the draft genome of this species.</p><p><strong>Data description: </strong>Genomic DNA from Polymita picta (Born, 1778) was sequenced using PacBio HiFi sequencing with a yield of 5.3 million reads (41.4 Gb) and an N50 of 8.1 Kb. The genome size of P. picta was estimated to be 2.9 Gb, and the final assembly was 1.85 Gb, with a total of 22,619 contigs and a contig N50 of 124.2 Kb. BUSCO analysis of the genome assembly indicated a genome completeness of 88.4%, with 7% complete duplicated BUSCOs in metazoa_odb10. The draft genome will be a valuable resource for work on the endangered Cuban Painted Landsnail including monitoring genetic diversity and establishing captive breeding for conservation.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"26 1","pages":"63"},"PeriodicalIF":2.5,"publicationDate":"2025-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12409939/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144994581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-09-02 DOI: 10.1186/s12863-025-01357-8
Yeonkyeong Lee, Jin-Ju Nah, Hyun-Ok Ku, Il Jang
Title: High-quality genome assembly and annotation of live animal vaccine bacteria strains in South Korea. Journal: BMC genomic data, vol. 26, no. 1, article 62. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12406338/pdf/
Pub Date: 2025-08-28 DOI: 10.1186/s12863-025-01344-z
Gwi-Deuk Jin, Ho-Youn Kim, Eun Bae Kim, Bokyung Lee
Objectives: Lacticaseibacillus rhamnosus is a widely recognized probiotic bacterium with therapeutic applications in human and animal health. The L. rhamnosus B3421 strain, isolated from Panax ginseng, has been reported to have antioxidant and anti-inflammatory properties, supporting its functional potential. We sequenced and analyzed the genome of L. rhamnosus B3421 to evaluate its probiotic potential for human healthcare and animal applications, focusing on genomic features related to safety and functionality.
Data description: In this study, we isolated L. rhamnosus B3421 from Panax ginseng C. A. Meyer (ginseng) and performed whole-genome sequencing. The genome of L. rhamnosus B3421 consists of 3,000,051 base pairs (bp) with a guanine + cytosine (G + C) content of 46.70%. It encodes 59 transfer RNAs, 15 ribosomal RNAs, and 2,807 coding sequences (CDSs). Of these CDSs, 99.13% (2,758 proteins) were assigned to functional categories in the Clusters of Orthologous Groups (COG) classification system, while 49 proteins remained uncharacterized. Our genome analysis identified no antibiotic resistance (ABR) or antimicrobial resistance (AMR) genes, indicating that L. rhamnosus B3421 is a safe probiotic bacterium with minimal risk of contributing to the horizontal transfer of antibiotic resistance within the gut microbiome. Additionally, BAGEL4 analysis identified genes associated with the ggmotif (PF10439), Enterocin X chain beta, and Carnocin CP52, and the genome contains 24 other genes related to reductase or peroxidase activities. These genes may confer competitive advantages against pathogenic bacteria and oxidative stress. Our findings highlight the probiotic potential of L. rhamnosus B3421 and its prospective applications in promoting human and animal health.
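The G + C content quoted above is the share of guanine and cytosine among the called bases of the genome. A minimal sketch of that calculation follows, assuming the sequence is available as a plain string. The function name `gc_content` and the toy sequence are illustrative assumptions and are not taken from the B3421 genome.

```python
def gc_content(sequence):
    """Return the G + C percentage of a DNA sequence.

    Symbols other than A, C, G and T (for example N) are ignored.
    """
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    called = sum(counts.values())
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (counts["G"] + counts["C"]) / called


# Hypothetical toy sequence: 2 each of A, T, G and C plus two ambiguous N bases.
toy_sequence = "ATGCGCNNTA"
print(f"{gc_content(toy_sequence):.2f}%")  # 50.00%
```

For a full genome, the same function could be applied to the concatenated replicon sequences read from a FASTA file.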
Title: Complete genome sequence of the probiotic candidate strain Lacticaseibacillus rhamnosus B3421 isolated from Panax ginseng C. A. Meyer in South Korea. Journal: BMC genomic data, vol. 26, no. 1, article 61. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12395871/pdf/
Objectives: This amplicon metagenomic study examines the relative abundance, taxonomic profiles and community structure of bacterial and fungal communities associated with the roots of parsley (Petroselinum crispum) and celery (Apium graveolens) under monocropping and intercropping systems. The study aims to provide a baseline understanding of how intercropping influences rhizosphere microbial dynamics.
Data description: The dataset provides insight into the effects of a parsley-celery intercropping system on soil microbial richness, diversity and community structure. Amplicon metagenomic sequencing was performed on the DNA samples, targeting the 16S rRNA gene (V3-V4 region) for bacterial communities and the ITS region for fungal communities. The quantified libraries were pooled and sequenced on Illumina platforms, and the raw sequences were analyzed using the Quantitative Insights Into Microbial Ecology (QIIME 2, version 2019.1) pipeline. The resulting Amplicon Sequence Variant (ASV) profiles revealed Actinobacteria and Proteobacteria as the predominant bacterial phyla, followed by Bacteroidota, Gemmatimonadota and Acidobacteriota. The predominant fungal phyla were Ascomycota and Mortierellomycota. The dataset includes raw sequence reads in FASTQ format (.fastq.gz), which have been deposited in the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (NCBI) under the BioProject accession numbers SRP540554 (16S rRNA) and SRP540675 (ITS).
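Phylum-level rankings such as the one described above are typically obtained by summing ASV read counts per phylum and dividing by the sample total. A minimal sketch of that step is shown below. The function name, the ASV identifiers and the counts are hypothetical and are not drawn from the deposited dataset, and the sketch does not reproduce the QIIME 2 workflow the authors ran.

```python
from collections import defaultdict


def phylum_relative_abundance(asv_counts, asv_phylum):
    """Collapse per-ASV read counts to phylum level and return relative abundances.

    asv_counts maps ASV id -> read count for one sample.
    asv_phylum maps ASV id -> phylum name; missing ids are reported as "Unassigned".
    The result is ordered from most to least abundant.
    """
    totals = defaultdict(int)
    for asv, count in asv_counts.items():
        totals[asv_phylum.get(asv, "Unassigned")] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("sample has no reads")
    shares = {phylum: count / grand_total for phylum, count in totals.items()}
    return dict(sorted(shares.items(), key=lambda item: item[1], reverse=True))


# Hypothetical toy sample: four ASVs, one without a taxonomic assignment.
toy_counts = {"ASV1": 50, "ASV2": 30, "ASV3": 15, "ASV4": 5}
toy_taxonomy = {"ASV1": "Actinobacteria", "ASV2": "Bacteroidota", "ASV3": "Actinobacteria"}

for phylum, share in phylum_relative_abundance(toy_counts, toy_taxonomy).items():
    print(f"{phylum}: {share:.0%}")
```

In this toy sample Actinobacteria ranks first because two of its ASVs are pooled before the shares are computed.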
Title: Dataset of 16S rRNA and ITS gene amplicon sequencing of celery and parsley rhizosphere soils. Authors: Olubukola Oluranti Babalola, Florence Oluwayemisi Ogundeji, Akinlolu Olalekan Akanmu. Journal: BMC genomic data, vol. 26, no. 1, article 60. Publication date: 2025-08-25. DOI: 10.1186/s12863-025-01351-0. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12376418/pdf/
Pub Date: 2025-08-19 DOI: 10.1186/s12863-025-01354-x
Juha Kantanen, Melak Weldenegodguad, Kisun Pokharel
Objectives: Finnsheep, a highly prolific breed of sheep, has been exported worldwide to improve fertility traits in many other sheep breeds. Published genomic studies of Finnsheep have been based on Texel and Rambouillet reference genomes, which may not capture its unique genetic features. Our main objective was to generate a high-quality Finnsheep genome assembly and annotation that could serve as a breed-specific reference for studying fertility and adaptation in Finnsheep and other short-tailed northern European sheep breeds.
Data description: We generated a 2.53 Gb assembly using PacBio HiFi long reads and Hi-C sequencing from a highly fertile Finnsheep ewe. The assembly, scaffolded with Hi-C data, has a contig N50 of 35.5 Mb and a scaffold N50 of 100.6 Mb. Gene annotation identified 42,533 genes spanning 46.5 Mb of coding region. BUSCO completeness for the assembly and annotation was 94.9% and 84.3%, respectively. These data, including raw reads, assembly, and annotations, support genomic studies of Finnsheep and other prolific sheep breeds that are particularly adapted to northern European environments.
Title: Oar_Finn: the genome assembly and annotation of an exceptionally fertile Finnsheep (Ovis aries) Ewe. Journal: BMC genomic data, vol. 26, no. 1, article 59. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12362895/pdf/