Pub Date: 2025-10-28; DOI: 10.1186/s12863-025-01368-5
Viviana Floridia, Arianna Bionda, Katherine D Arias, Annalisa Amato, Carmelo Cavallo, Vincenzo Chiofalo, Matteo Cortellari, Vincenzo Lopreiato, Antonino N Virga, Paola Crepaldi, Luigi Liotta, Mario Barbato
Background: The Capra Comune di Sicilia (CCS), also known as Mascaruna, is a Sicilian local goat population first described in 1870 and is currently the focus of a recovery project aimed at its characterization and formal ethnic recognition. To elucidate the ancestral genetic components and selection trajectories of the CCS population, we genotyped 78 CCS goats using the Goat 60 K SNP BeadChip, integrated with genotype data from 1,920 individuals representing 66 goat breeds of Mediterranean and African origin.
Results: CCS exhibited relatively high heterozygosity (0.408), moderate inbreeding (0.04), and an estimated effective population size of 185. Genetic ancestry analysis revealed gene flow from Maltese, Girgentana, Rossa Mediterranea and Saanen populations, alongside evidence of putative Greek ancestry shared with most Mediterranean breeds in our dataset, reflecting Sicily's profound historical and cultural ties with Greece. To better understand the evolutionary trajectories of the CCS population and to explore the potential contribution of Greek goat ancestry, we investigated selection signatures using iHS and ROH analyses. These identified 76 and 31 SNPs, intercepting 38 and 12 genes respectively, under putative selection. Subsequently, we applied XP-nSL and ROH analyses using Greek populations as ancestral references, identifying 21 and 431 SNPs associated with four and 157 genes, respectively, under putative selection. Overall, these selection signature analyses highlighted genes under positive selection related to traits such as milk and meat production, body size and growth, fertility, coat colour, fat deposition, and ear and horn development.
Conclusion: Our findings shed light on the historical and genetic distinctiveness of the CCS population, emphasizing its uniqueness and providing critical insights into its genetic background. This information is essential for supporting informed efforts to formally recognize CCS as a distinct and valuable breed.
Title: Genomic insights on the history and selection trajectories of the Comune di Sicilia goat. Journal: BMC genomic data 26(1):78. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12570788/pdf/
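The heterozygosity value reported for CCS can be illustrated with a minimal sketch. The abstract does not say whether 0.408 is observed or expected heterozygosity, nor which software produced it, so both estimators are shown here under assumptions: genotypes are coded as the count of one allele (0, 1, 2), missing calls are coded -1, and the toy matrix `TOY_GENOTYPES` is invented purely for illustration.

```python
from typing import List, Sequence

MISSING = -1  # assumed code for a missing genotype call


def observed_heterozygosity(genotypes: Sequence[Sequence[int]]) -> float:
    """Fraction of non-missing calls that are heterozygous (coded 1)."""
    het = 0
    called = 0
    for individual in genotypes:
        for g in individual:
            if g == MISSING:
                continue
            called += 1
            if g == 1:
                het += 1
    return het / called if called else float("nan")


def expected_heterozygosity(genotypes: Sequence[Sequence[int]]) -> float:
    """Mean of 2pq across SNPs, with p estimated from non-missing calls."""
    n_snps = len(genotypes[0])
    per_snp: List[float] = []
    for j in range(n_snps):
        calls = [ind[j] for ind in genotypes if ind[j] != MISSING]
        if not calls:
            continue  # SNP with no calls contributes nothing
        p = sum(calls) / (2 * len(calls))  # frequency of the counted allele
        per_snp.append(2 * p * (1 - p))
    return sum(per_snp) / len(per_snp) if per_snp else float("nan")


# Toy data: 3 individuals x 4 SNPs (illustrative only, not from the study).
TOY_GENOTYPES = [
    [0, 1, 2, 1],
    [1, 1, 0, MISSING],
    [2, 1, 1, 0],
]

ho = observed_heterozygosity(TOY_GENOTYPES)
he = expected_heterozygosity(TOY_GENOTYPES)
```

On real SNP-chip data the same quantities would normally come from a dedicated population-genetics tool; the sketch only makes the definitions concrete.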
Pub Date: 2025-10-28; DOI: 10.1186/s12863-025-01365-8
Olivia Marcuzzi, Francisco Calcaterra, Leónidas H Olivera, Analía Arizmendi, Marco R J M Henry, Danielle Cunha Cardoso, Ana M Loaiza Echeverri, Juan P Liron, María E Fernández, Denise A Andrade de Oliveira, Guillermo Giovambattista
The combined use of NGS technologies with bioinformatics tools has significantly advanced research by enabling comprehensive analyses of entire genomes, specific genomic regions of interest, and transcriptomes. Targeted NGS methods, which focus on smaller genome fractions, are widely used to study genetic diseases, epigenetic modifications, microbiomes, and environmental DNA, among other applications. This study aimed to develop a roadmap for detecting and selecting polymorphisms in candidate genes by integrating amplicon NGS-Target techniques with bioinformatics analyses. Sixty-eight genes associated with the hypothalamic-pituitary-gonadal (HPG) axis were selected to develop the amplicon NGS assay, comprising 730 regions that cover a total of 136,274 bp. This method was used to sequence 75 Guzerat cattle, a dual-purpose breed from Brazil renowned for its high rusticity and adaptability. These Zebu cattle exhibit certain limitations, such as delayed puberty onset, which can reduce reproductive efficiency. Using the GATK protocol, a total of 2,600 SNPs and 1,615 indels were detected. A series of consecutive filtering steps (minor allele frequency (MAF), detection of non-synonymous substitutions, phylogenetic amino acid conservation, and biochemical properties) was applied, resulting in a subset of 30 candidate SNPs. These polymorphisms were then analysed using bioinformatic tools (SIFT, PANTHER, PolyPhen2, and MutPred), identifying 5 SNPs with a high predicted effect on the protein. The structure and stability of the corresponding proteins were estimated using AlphaFold and DDMut. Finally, 3 candidate polymorphisms (in IGF1R, LHCGR, and TAC3R) with potentially significant effects on the protein remain to be validated through dynamic simulations or in vitro and in vivo experimental assays.
Title: A framework for identifying and prioritizing SNPs in genes of the hypothalamic-pituitary-gonadal axis in Guzerat cattle using amplicon-based NGS. Journal: BMC genomic data 26(1):79. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12570696/pdf/
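The first two steps of the consecutive filtering described in this abstract (MAF, then non-synonymous substitutions) can be sketched as a simple pipeline. This is an illustration under assumptions, not the authors' code: the `Variant` record, the 0.05 MAF threshold, the `"missense"` consequence label, and the four toy variants are all hypothetical. The only figure taken from the abstract is the cohort size of 75 animals, which gives 150 alleles per biallelic SNP.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    name: str           # hypothetical identifier
    alt_count: int      # number of alternate alleles observed
    total_alleles: int  # 2 x number of genotyped animals
    consequence: str    # hypothetical annotation label


def minor_allele_frequency(v: Variant) -> float:
    """MAF is the frequency of the rarer of the two alleles."""
    freq = v.alt_count / v.total_alleles
    return min(freq, 1 - freq)


def prioritize(variants: List[Variant], min_maf: float = 0.05) -> List[Variant]:
    """Apply the filters one after another, as in a filtering cascade."""
    passed_maf = [v for v in variants if minor_allele_frequency(v) >= min_maf]
    non_synonymous = [v for v in passed_maf if v.consequence == "missense"]
    return non_synonymous


N_ALLELES = 2 * 75  # 75 cattle, diploid

TOY_VARIANTS = [
    Variant("snp_a", 30, N_ALLELES, "missense"),     # MAF 0.20, kept
    Variant("snp_b", 3, N_ALLELES, "missense"),      # MAF 0.02, dropped
    Variant("snp_c", 140, N_ALLELES, "missense"),    # MAF ~0.067, kept
    Variant("snp_d", 60, N_ALLELES, "synonymous"),   # dropped at step 2
]

kept = [v.name for v in prioritize(TOY_VARIANTS)]
```

The later steps named in the abstract (conservation, biochemical properties, effect predictors) would slot in as further list filters of the same shape.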
Pub Date: 2025-10-21; DOI: 10.1186/s12863-025-01372-9
Yang Chen, Yunfang Chen, Yanfeng Deng, Zhenchuan Mao, Yan Li, Jianlong Zhao, Guohua Chen, Jian Ling
Objectives: Verticillium albo-atrum is one of the most dangerous quarantine pathogens. It is a soil-borne fungus that causes verticillium wilt, a disease affecting a wide range of plants, including many economically important crops. However, the lack of a high-quality genome resource has greatly limited research into the molecular and evolutionary mechanisms of Verticillium albo-atrum. The high-quality genome presented here provides a valuable resource for better understanding its biological characteristics.
Data description: We sequenced and assembled the genome of Verticillium albo-atrum using ONT long reads combined with DNBSEQ-T7 paired-end short reads, and anchored 11 contigs into 8 chromosomes using Hi-C chromatin contact information, yielding a 35.95 Mb chromosome-level genome assembly with an N50 of 4.20 Mb. In addition, transcript-based annotation identified 9,967 protein-coding genes, of which 84.17% were functionally annotated. BUSCO analysis indicated a high level of completeness for this assembly (96.47%).
Title: A chromosome-level genome assembly of Verticillium albo-atrum, an dangerous quarantine pathogen known for causing verticillium wilt. Journal: BMC genomic data 26(1):76. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12542403/pdf/
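The N50 statistic quoted for this assembly has a simple definition: the length of the sequence at which the cumulative length, taken from longest to shortest, first reaches half of the total assembly size. A minimal sketch follows; the toy lengths are invented, since the abstract does not list the individual contig or chromosome sizes.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L cover at least half the total."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 requires at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: cumulative sum always reaches half")


# Toy contig lengths (illustrative only): total 30, half 15.
TOY_LENGTHS = [2, 4, 6, 8, 10]
toy_n50 = n50(TOY_LENGTHS)
```

Because N50 depends only on the length distribution, it says nothing about correctness of the assembly, which is why it is usually reported alongside a completeness measure such as BUSCO.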
Pub Date: 2025-10-21; DOI: 10.1186/s12863-025-01370-x
Gwi-Deuk Jin, Ho-Youn Kim, Eun Bae Kim, Bokyung Lee
Objectives: Leuconostoc mesenteroides is a species of lactic acid bacteria with established safety, frequently isolated from fermented foods such as kimchi, and is Generally Recognized as Safe (GRAS). This species is used as a starter culture for kimchi fermentation in South Korea. In this study, we aimed to evaluate the probiotic potential of Leuconostoc mesenteroides B22051, isolated from Panax ginseng C. A. Meyer (ginseng), through genome analysis by assessing probiotic traits including antibiotic resistance, bacteriocin, and redox genes relevant to both human and livestock applications.
Data description: We isolated a candidate probiotic strain from Panax ginseng Meyer and decoded the complete genome of L. mesenteroides B22051. The complete genome is 1,994,797 bp in size with a guanine + cytosine (G + C) content of 37.7%, and is composed of one chromosome (1,943,350 bp) and two plasmids (37,364 bp and 14,083 bp). Genome annotation revealed 71 transfer RNAs, 24 ribosomal RNAs, and 1,987 coding sequences (CDSs). Furthermore, 98.09% and 64.92% of these 1,987 CDSs were assigned to the COG and Gene Ontology classification systems, respectively. Two partial sequences of vanT (32.61% identity) and vanY (34.36% identity) were detected by CARD v3.2.0 analysis. Although in vitro assays confirmed vancomycin resistance, the low sequence identity and the absence of a complete van operon indicate that the resistance is intrinsic rather than acquired. An entero_X_chain_beta gene, associated with Enterocin within Bacteriocin Class IIC (chromosomally encoded), along with 28 reductase genes and one oxidase gene identified through Gene Ontology analysis, is present in L. mesenteroides B22051. These genomic findings confirm the probiotic properties of L. mesenteroides B22051.
Title: Complete genome of Leuconostoc mesenteroides B22051 from Panax ginseng Meyer C. A. in South Korea. Journal: BMC genomic data 26(1):75. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12541927/pdf/
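The genome-size figures in this abstract are internally consistent, which a short check makes explicit. The replicon lengths and percentages below are taken from the abstract; the CDS counts derived from the percentages are approximations computed here, not numbers stated by the authors, and `gc_content` is a generic helper shown with an invented toy sequence.

```python
# Replicon lengths in bp, as reported in the abstract.
REPLICON_LENGTHS_BP = {
    "chromosome": 1_943_350,
    "plasmid_1": 37_364,
    "plasmid_2": 14_083,
}
REPORTED_GENOME_SIZE_BP = 1_994_797
TOTAL_CDS = 1_987

total_bp = sum(REPLICON_LENGTHS_BP.values())

# Approximate CDS counts implied by the reported percentages (derived here).
approx_cog_cds = round(TOTAL_CDS * 98.09 / 100)
approx_go_cds = round(TOTAL_CDS * 64.92 / 100)


def gc_content(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100 * gc / acgt


toy_gc = gc_content("ATGC")  # toy sequence, illustrative only
```

Here `total_bp` equals the reported genome size, so the chromosome and the two plasmids together account for the whole assembly.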
Pub Date: 2025-10-14; DOI: 10.1186/s12863-025-01361-y
Ramsha Azhar, Muhammad Faizan Malik, Rozeena Arif, Muhammad Waqas Khokhar, Yasir Mehmood Abbasi, Fatima Batool, Muhammad Haseeb Jalalzai, Yiming Bao, Amir Ali Abbasi
Title: PAHG: the database of human multi-gene families. Journal: BMC genomic data 26(1):74. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12522930/pdf/
Pub Date: 2025-10-03; DOI: 10.1186/s12863-025-01359-6
Larmande Pierre, Pittolat Bertrand, Tando Ndomassi, Pomie Yann, Happi Happi Bill Gates, Guignon Valentin, Ruiz Manuel
Background: The demand for food is expected to grow substantially in the coming years. To address this challenge, especially in the context of climate change, a deeper understanding of genotype-phenotype relationships is crucial for improving crop yields. Recent advances in high-throughput technologies have transformed the landscape of plant science research. However, there is an urgent need to integrate and consolidate complementary data to understand the biological system.
Results: We introduce AgroLD, a knowledge graph that uses Semantic Web technologies to seamlessly integrate plant science data. AgroLD is designed to facilitate hypothesis formulation and validation within the scientific community. With approximately 1.08 billion triples, it integrates and annotates data from more than 151 datasets across 19 distinct sources.
Conclusion: The overarching goal is to provide a specialized knowledge platform addressing complex biological questions in the plant sciences, including gene participation in plant disease resistance and adaptive responses to climate change.
Title: AgroLD: a knowledge graph for the plant sciences. Journal: BMC genomic data 26(Suppl 1):73. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12495601/pdf/
Pub Date: 2025-10-01; DOI: 10.1186/s12863-025-01358-7
Anjum Shahzad, Tahir Mehmood, Sheeraz Akram
Title: Gradient responsive regularization: a deep learning framework for codon frequency based classification of evolutionarily conserved genes. Journal: BMC genomic data 26(1):72. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490169/pdf/
Pub Date: 2025-09-29; DOI: 10.1186/s12863-025-01364-9
Huiping Huang, Xinnan Yang, Zerui Yang
Background: Although the metabolic composition of Cinnamomum camphora has been well studied, flavonoid distribution across its tissues remains poorly understood. This study combined transcriptomic and metabolomic analyses of leaf, stem, and root tissues to uncover the flavonoid synthesis pathway and to identify key regulatory genes.
Results: Metabolomic analysis revealed 2,893 metabolites: 1,213 secondary metabolites (41.93%), 622 primary metabolites (21.50%), and 1,058 others (36.57%). Among the secondary metabolites, flavonoids were the most abundant (28%), followed by terpenoids (27%) and phenolic acids (12%). Differential metabolites were identified using the criteria VIP > 1, |log2 fold change| ≥ 1, and p < 0.05, and showed tissue-specific flavonoid distribution. For example, rutin, quercetin 3-O-alpha-L-rhamnoside, and quercetin were abundant in leaves and stems, while 2-hydroxyisoflavanone naringenin, fustin, and catechin were predominant in roots. Transcriptome analysis identified a total of 2,043 differentially expressed genes (DEGs), with the largest number found in the leaf-to-root comparison. KEGG enrichment analysis of the DEGs showed significant changes in pathways related to flavonoid and phenylpropanoid biosynthesis. Correlation analysis indicated that key enzyme genes including CcPAL_1, CcF3H_1, CcF3_H, CcCHS_1, CcC4H_2, CcANR_1, Cc4CL_9, Cc4CL_7 and Cc4CL_1 play positive regulatory roles in the accumulation of downstream metabolites, whereas CcPAL_4, CcPAL_2 and CcC4H_1 exert negative regulation on downstream metabolites. In addition, we identified several bHLH and MYB transcription factors that may regulate flavonoid biosynthesis. Finally, qRT-PCR validation confirmed the RNA sequencing results.
Conclusions: This research elucidates the spatial variation in the accumulation profiles of flavonoid metabolites across different tissues and offers crucial insights into the regulatory mechanisms of flavonoid metabolism in C. camphora. It thereby lays a foundation for further research on the flavonoid biosynthetic pathway of C. camphora.
Title: Integration of transcriptome and metabolome analysis reveals the genes and pathways regulating flavonoids biosynthesis in Cinnamomum camphora. Journal: BMC genomic data 26(1):71. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12482126/pdf/
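The three-part criterion for differential metabolites in this abstract (VIP > 1, |log2 fold change| ≥ 1, p < 0.05) is a conjunction of thresholds, and the class percentages follow directly from the reported counts. The sketch below illustrates both. The `MetaboliteStat` record and the five toy metabolites are hypothetical; only the thresholds and the class counts come from the abstract.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class MetaboliteStat:
    name: str           # hypothetical identifier
    vip: float          # variable importance in projection
    fold_change: float  # ratio between two tissues (linear scale, > 0)
    p_value: float


def is_differential(m: MetaboliteStat) -> bool:
    """All three thresholds from the abstract must hold at once."""
    return (
        m.vip > 1
        and abs(math.log2(m.fold_change)) >= 1
        and m.p_value < 0.05
    )


TOY_STATS: List[MetaboliteStat] = [
    MetaboliteStat("met_a", 1.5, 4.0, 0.01),  # passes all three
    MetaboliteStat("met_b", 0.8, 4.0, 0.01),  # fails VIP
    MetaboliteStat("met_c", 1.2, 0.4, 0.03),  # passes (down-regulated)
    MetaboliteStat("met_d", 2.0, 1.5, 0.01),  # fails fold change
    MetaboliteStat("met_e", 1.3, 0.5, 0.20),  # fails p-value
]
differential = [m.name for m in TOY_STATS if is_differential(m)]

# Class counts reported in the abstract, and the percentages they imply.
CLASS_COUNTS = {"secondary": 1213, "primary": 622, "others": 1058}
total_metabolites = sum(CLASS_COUNTS.values())
class_percent = {
    name: round(100 * count / total_metabolites, 2)
    for name, count in CLASS_COUNTS.items()
}
```

Taking the absolute value of the log2 fold change is what lets one threshold catch both two-fold increases and two-fold decreases.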
Pub Date: 2025-09-29; DOI: 10.1186/s12863-025-01352-z
Caren Weinhouse, Luiza Perez, Ian Ryde, Jaclyn M Goodrich, J Jaime Miranda, Heileen Hsu-Kim, Susan K Murphy, Joel N Meyer, William K Pan
Background: Epigenome-wide association studies (EWAS) are a highly promising approach that can inform precision environmental health. However, current EWAS are underpowered and increasing sample sizes will require substantial resources. Therefore, alternative approaches for identifying candidate biomarkers through EWAS are critical. Here, we provide proof of principle that maximizing exposure variance in EWAS enables effective candidate biomarker detection, even in small sample sizes.
Methods: We profiled genome-wide DNA methylation in whole blood from individuals from Madre de Dios, Peru, with either high methylmercury (MeHg) exposure (> 10 µg/g total hair mercury; N = 16) or low MeHg exposure (< 1 µg/g total hair mercury; N = 16).
Results: We identified nine differentially methylated CpG sites (FDR < 0.05), which is comparable to the number identified by much larger EWAS. The most significantly different CpG site was in an intronic enhancer of the SLC5A7 gene, which encodes the L-type amino acid transporter 1 (LAT1) that facilitates MeHg transport. Our Gene Ontology and transcription factor motif enrichment analyses identified genes involved in outcomes linked to MeHg toxicity, including immune response, neurotoxicity, and type 2 diabetes (T2D).
Conclusions: Similar EWAS in global populations with known high exposure variance can be leveraged to develop targeted, custom sequencing panels and microarrays limited to replicated, validated biomarkers of a given exposure.
Title: High exposure variance enables candidate biomarker detection in a small EWAS of methylmercury-exposed Peruvian adults. Journal: BMC genomic data 26(1):68. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12482037/pdf/
Pub Date : 2025-09-29 DOI: 10.1186/s12863-025-01367-6
Seok Bin Yang, Doyun Ku, Ji-Hoi Moon, Jae-Hyung Lee, Sang Wook Kang, Hak Kyun Kim, Kyu Hwan Kwack
Objective: Streptococcus hominis is a recently described species within the genus Streptococcus, yet its genomic characteristics remain poorly understood, particularly in the context of the oral microbiome. Previously, only two complete genomes from non-oral sources were available. To address this gap, we sequenced and analyzed S. hominis strain KHUD_010, isolated from the subgingival biofilm of a healthy Korean adult.
Data description: Genomic DNA from KHUD_010 was extracted and confirmed as S. hominis by 16S rRNA gene sequencing. Whole-genome sequencing using the PacBio Sequel II platform generated 135,974 HiFi reads (N50: 10,345 bp). De novo assembly with SMRT Link v11.0 produced a single circular chromosome of 1,883,665 bp with 39.04% GC content. Annotation via the NCBI Prokaryotic Genome Annotation Pipeline predicted 1,793 protein-coding genes, four rRNA operons (5S, 16S, 23S), and 120 tRNAs. BUSCO analysis showed 99.1% completeness. Comparative genomics with NSJ-17 and UMB6992B revealed 1,416 core, 223 dispensable, and 398 strain-specific gene clusters. KHUD_010 harbored 18 unique gene clusters comprising 20 genes, mostly assigned to COG category L (replication, recombination, repair). This high-quality genome expands the genomic landscape of S. hominis and provides a valuable reference for future studies on oral microbiome diversity and host adaptation.
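The read N50 and GC content quoted above are standard summary statistics. The following minimal sketch shows how such values are conventionally computed; it is not the authors' pipeline, and the function names `n50` and `gc_percent` and the toy inputs are hypothetical.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


def gc_percent(sequence: str) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy inputs for illustration only; these are not data from KHUD_010.
toy_read_lengths = [2, 3, 4, 5, 6]
toy_sequence = "ATGCGCNNAT"
toy_n50 = n50(toy_read_lengths)
toy_gc = gc_percent(toy_sequence)
```

Ambiguous bases such as N are excluded from the GC denominator in this sketch; other tools may count them differently, so small discrepancies between programs are expected.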
{"title":"Complete genome sequence of Streptococcus hominis isolated from subgingival biofilm.","authors":"Seok Bin Yang, Doyun Ku, Ji-Hoi Moon, Jae-Hyung Lee, Sang Wook Kang, Hak Kyun Kim, Kyu Hwan Kwack","doi":"10.1186/s12863-025-01367-6","DOIUrl":"10.1186/s12863-025-01367-6","url":null,"abstract":"<p><strong>Objective: </strong>Streptococcus hominis is a recently described species within the genus Streptococcus, yet its genomic characteristics remain poorly understood, particularly in the context of the oral microbiome. Previously, only two complete genomes from non-oral sources were available. To address this gap, we sequenced and analyzed S. hominis strain KHUD_010, isolated from the subgingival biofilm of a healthy Korean adult.</p><p><strong>Data description: </strong>Genomic DNA from KHUD_010 was extracted and confirmed as S. hominis by 16 S rRNA gene sequencing. Whole-genome sequencing using the PacBio Sequel II platform generated 135,974 HiFi reads (N50: 10,345 bp). De novo assembly with SMRT Link v11.0 produced a single circular chromosome of 1,883,665 bp with 39.04% GC content. Annotation via the NCBI Prokaryotic Genome Annotation Pipeline predicted 1,793 protein-coding genes, four rRNA operons (5 S, 16 S, 23 S), and 120 tRNAs. BUSCO analysis showed 99.1% completeness. Comparative genomics with NSJ-17 and UMB6992B revealed 1,416 core, 223 dispensable, and 398 strain-specific gene clusters. KHUD_010 harbored 18 unique gene clusters comprising 20 genes, mostly assigned to COG category L (replication, recombination, repair). This high-quality genome expands the genomic landscape of S. 
hominis and provides a valuable reference for future studies on oral microbiome diversity and host adaptation.</p>","PeriodicalId":72427,"journal":{"name":"BMC genomic data","volume":"26 1","pages":"69"},"PeriodicalIF":2.5,"publicationDate":"2025-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12482221/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145193983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}