Pub Date: 2026-03-24 DOI: 10.1093/gigascience/giag033
Haoke Deng, Xiaolian Ning, Xun Lin, Liang Zong, Shanqiao Zheng, Yun Zhao, Jing Wang, Lingyun Chen, Jin Zi, Zhanlong Mei
Current tools for spatial omics analysis often face challenges in performing integrated transcriptomics and metabolomics analysis, in-depth biological interpretation, and user-friendly operation. To address this, we developed SMIntegration, the first web-based graphical platform designed specifically for integrated spatial metabolomics and transcriptomics analysis. Built with R/Shiny and deployed using Docker containerization, the platform provides a complete integration workflow, starting from pre-processed spatial features through to functional annotation. Its core functions include: (1) automated and interactive spatial registration; (2) cross-modal spatial pattern recognition; (3) flexible differential analysis of genes and mass features based on clustering results, user-defined regions, or cell type annotations; and (4) group-specific gene-metabolite network construction and interactive visualization. Using adjacent mouse brain coronal sections (Stereo-seq transcriptomics and AFADESI-MS metabolomics) as an example, SMIntegration successfully identified both the periaqueductal gray and subcommissural organ, which were missed by single-modality clustering. Cell type analysis revealed an association between astrocyte-enriched GABA metabolism and Slc6a11, while a comparison between the cornu ammonis region and the midbrain periaqueductal gray dissected glutamatergic and endogenous cannabinoid signaling pathway modules. With a zero-code interface, SMIntegration enables a wide range of researchers to deeply explore gene-metabolite interaction mechanisms within microenvironments during development, homeostasis, and disease.
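The abstract does not say how the gene-metabolite networks are built. As a rough illustration only, one common approach is to correlate gene expression with metabolite intensity across co-registered spots and keep strongly correlated pairs as edges. The sketch below assumes Spearman correlation and a hypothetical cutoff of 0.8; the numbers are toy values, and `GeneX` and `MetaboliteY` are made-up names.

```python
import math

def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def build_edges(genes, metabolites, threshold=0.8):
    """Return (gene, metabolite, rho) for pairs with |rho| >= threshold (assumed cutoff)."""
    edges = []
    for g, g_values in genes.items():
        for m, m_values in metabolites.items():
            rho = spearman(g_values, m_values)
            if abs(rho) >= threshold:
                edges.append((g, m, rho))
    return edges

# Toy per-spot values for six co-registered spots (not real data).
genes = {"Slc6a11": [1, 2, 3, 4, 5, 6], "GeneX": [3, 1, 4, 1, 5, 2]}
metabolites = {"GABA": [2, 4, 5, 8, 9, 12], "MetaboliteY": [6, 5, 4, 3, 2, 1]}
edges = build_edges(genes, metabolites)
```

Restricting the spot list to one cluster, region, or cell type before calling `build_edges` would give a group-specific network in the sense the abstract describes.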
Title: SMIntegration: A Web Tool for Comprehensive Spatial Metabolomics and Transcriptomics Integrated Analysis and Visualization. Journal: GigaScience.
Pub Date: 2026-03-24 DOI: 10.1093/gigascience/giag031
Stephen A Schlebusch, Vladimir Trifonov, Zuzana Halenková, Marharyta Klianitskaya, Dmitrij Dedukh, Aurora Ruiz-Herrera, Lucia Álvarez-González, Gala Pujol, Eva Hřibová, Lucija Andjel, Oldřich Bartoš, Petr Pajer, Tomáš Tichopád, Daniel Kulik, Jan Kotusz, Marie Kaštánková Doležálková, Astrid Böhne, Anatolie Marta, Patrik Horna, Radka Reifová, Yann Guiguen, Heiner Kuhl, Jan Pačes, Karel Janko
Background: Hybridisation between divergent species can result in meiotic aberrations and the emergence of asexual reproduction. Yet, it remains poorly understood to what extent such outcomes arise from genome-wide incompatibilities versus more specific conflicts among individual chromosomes inherited from parental species, including their ability to pair during meiosis in hybrids. It is also unclear how interspecific hybrids cope with differences in sex determination systems, particularly in the context of increased ploidy. Addressing these questions requires high-quality, chromosome-level reference genomes of the parental species involved in hybrid formation.
Findings: Here, we present the first chromosome-level genome assemblies for three hybridising Cobitis species (C. elongatoides, C. taenia, and C. tanaitica), providing a comprehensive framework for investigating the genomic and cytogenetic basis of hybrid sterility and the transition to asexuality. By integrating genome scaffolding, male/female pooled sequencing (Pool-Seq), and molecular cytogenetics, we uncover extensive structural variation among homologous chromosomes of the three species, despite overall karyotype conservation. Population-level analyses revealed that each species possesses distinct, non-homologous sex chromosomes, highlighting rapid sex chromosome turnover in this recently diverged lineage. Finally, we designed chromosome-specific painting probes and applied them to meiotic metaphase I spreads of diploid hybrids. This approach revealed striking differences in the pairing success of orthologous chromosomes.
Conclusions: Our results demonstrate that individual orthologous chromosomes differ markedly in their ability to form bivalents during meiosis in hybrids, indicating that hybrid meiotic behaviour is shaped by chromosome-specific incompatibilities rather than uniform genome-wide failure. We also found that even closely related parental species possess distinct, non-homologous sex chromosomes, highlighting rapid turnover of sex determination systems in hybridising lineages. Together, these findings provide a high-resolution genomic and cytogenetic framework to explore how the architecture of inherited parental genomes influences sex-specific reproductive outcomes in hybrids, ranging from male sterility to the establishment of fertile, clonally reproducing female lineages, and how such asymmetries may contribute to the emergence of asexuality in vertebrates.
Title: Sex Chromosome Turnover and Structural Genome Divergence Shapes Meiotic Outcomes in Hybridising Cobitis. Journal: GigaScience.
Pub Date: 2026-03-23 DOI: 10.1093/gigascience/giag030
Josipa Lipovac, Mile Šikić, Riccardo Vicedomini, Krešimir Križanović
Strain-level metagenomic classification is essential for understanding microbial diversity and functional potential, yet remains challenging, particularly when sample composition is unknown and reference databases are large and redundant. Here we present MADRe, a modular and scalable pipeline for long-read strain-level metagenomic classification based on Metagenome Assembly-Driven Database Reduction. Beyond system-level integration, MADRe introduces statistical strategies that leverage assembly-derived genomic context to guide database reduction and probabilistic read reassignment. Specifically, it combines long-read metagenome assembly, contig-to-reference reassignment using an expectation-maximization framework for reference reduction, and probabilistic read mapping reassignment on a reduced database to achieve sensitive and precise strain-level classification. We extensively evaluated MADRe on simulated datasets, mock communities, and a real anaerobic digester sludge metagenome. Across diverse similarity and coverage conditions, MADRe consistently improves precision by reducing false-positive strain detections. MADRe's design allows users to apply either the database reduction or the read classification step individually. Used on its own, the read classification step performs on par with other tested tools. MADRe is open source and publicly available at https://github.com/lbcb-sci/MADRe.
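To make the expectation-maximization idea concrete, here is a minimal sketch of how ambiguous contig-to-reference hits might be resolved and the database reduced. It is not MADRe's implementation: the score matrix, the function name `em_reference_weights`, and the 0.01 retention cutoff are all assumptions for illustration.

```python
def em_reference_weights(scores, n_iter=200, tol=1e-9):
    """Estimate reference weights from contig-to-reference scores.

    scores[i][r] is a non-negative score of contig i against reference r
    (0 means no hit). Assumes every contig has at least one non-zero score.
    """
    n_refs = len(scores[0])
    weights = [1.0 / n_refs] * n_refs
    for _ in range(n_iter):
        # E-step: split each contig across references in proportion
        # to (current reference weight x alignment score).
        resp = []
        for row in scores:
            joint = [w * s for w, s in zip(weights, row)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: a reference's new weight is its share of all contigs.
        col_sums = [sum(r[k] for r in resp) for k in range(n_refs)]
        norm = sum(col_sums)
        new_weights = [c / norm for c in col_sums]
        converged = max(abs(a - b) for a, b in zip(weights, new_weights)) < tol
        weights = new_weights
        if converged:
            break
    return weights

# Toy example: refA has one unique contig, refB only shares contigs with refA,
# refC has its own unique contig.
names = ["refA", "refB", "refC"]
scores = [
    [1.0, 0.0, 0.0],  # contig 1: unique to refA
    [1.0, 1.0, 0.0],  # contig 2: ambiguous between refA and refB
    [1.0, 1.0, 0.0],  # contig 3: ambiguous between refA and refB
    [0.0, 0.0, 1.0],  # contig 4: unique to refC
]
weights = em_reference_weights(scores)
kept = [n for n, w in zip(names, weights) if w > 0.01]  # hypothetical cutoff
```

In this toy case the unique contig pulls the ambiguous ones toward `refA`, so the weight of `refB` shrinks toward zero and it drops out of the reduced database. This is the kind of false-positive strain the abstract says the reduction step is meant to suppress.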
Title: MADRe: Strain-level Metagenomic Classification Through Assembly-Driven Database Reduction. Journal: GigaScience.
Pub Date: 2026-03-17 DOI: 10.1093/gigascience/giag028
David R Hemprich-Bennett, Ezekiel Donkor, Bernard Adams, Naana Afua Acquaah, Eva D Ofori, Samuel Anie-Amoah, Abigail Bailey, H Charles J Godfray, Owen T Lewis, Fred Aboagye-Antwi, Talya D Hackett
Background: West Africa has high biodiversity that is relatively understudied, especially for insects. Studies of West African arthropod diversity can therefore help address important questions regarding conservation, ecosystem services, and insecticide use and other species-control interventions in agriculture and disease management. We intensively sampled arthropods in Ghana using complementary trapping methods, generated DNA barcodes, and classified sequences by Barcode Index Numbers (BINs, a species proxy). Using this dataset, we investigate assemblage composition, temporal activity patterns, and the state of regional biodiversity sampling.
Results: Sequencing DNA from 95,996 individuals captured using Malaise, yellow pan, pitfall, Heath and Centre for Disease Control (CDC) traps, we identified 10,120 unique BINs. The rate of species accumulation did not approach an asymptote for any taxonomic group or trap type, indicating high biodiversity. The different trap types sampled different subsets of the local community, with greatest similarity between yellow pan and pitfall traps. More insects and species (BINs) were trapped during the day than at night. Our dataset shared more BINs in the Barcode of Life Database with South Africa than with any other country, although this predominantly reflects the limited sampling and DNA sequencing campaigns in Africa.
Conclusions: This study more than doubles the published BINs for West Africa, offering insights into the biodiversity of an ecologically important but understudied taxon and region. Using multiple trap types allowed a more complete assessment of the local arthropod assemblage. The public release of these data will support and stimulate further taxonomic and ecological work in the region.
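The statement that species accumulation did not approach an asymptote can be illustrated with a sample-based accumulation curve. The sketch below averages the cumulative number of distinct BINs over random orderings of the samples. The BIN labels are invented, and the number of permutations and the seed are arbitrary choices rather than settings from the study.

```python
import random

def accumulation_curve(samples, n_perm=200, seed=0):
    """Mean cumulative count of distinct BINs after 1..n samples.

    samples: list of sets, one set of BIN identifiers per trap sample.
    The curve is averaged over n_perm random sample orderings.
    """
    rng = random.Random(seed)
    n = len(samples)
    totals = [0] * n
    for _ in range(n_perm):
        order = samples[:]
        rng.shuffle(order)
        seen = set()
        for i, sample in enumerate(order):
            seen |= sample
            totals[i] += len(seen)
    return [t / n_perm for t in totals]

# Hypothetical BIN identifiers for four trap samples.
samples = [
    {"BIN1", "BIN2"},
    {"BIN2", "BIN3"},
    {"BIN4"},
    {"BIN1", "BIN5"},
]
curve = accumulation_curve(samples)
```

A curve whose last few increments are still clearly positive, as here, has not levelled off. A flattening curve would suggest that most of the local BINs had already been sampled.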
Title: Characterising a species-rich and understudied tropical insect fauna using DNA barcoding. Journal: GigaScience.
Pub Date: 2026-03-09 DOI: 10.1093/gigascience/giag027
Jakub Savara, Tomas Novosad, Petr Gajdos, Anna Petrackova, Marek Behalek, Jirina Manakova, Filip Ctvrtlik, Jiri Minarik, Tomas Papajik, Eva Kriegova
Background: Structural variants (SVs) are increasingly recognised as key contributors to human diseases. However, our understanding of SVs in health and disease is limited, mainly due to their structural complexity and variable length in individuals as well as limitations inherent to the available genomic technologies and reference genome used.
Results: To systematically evaluate SVs across human whole-genome samples using hg38/GRCh38 and gapless T2T-CHM13 references, we introduced an innovative multiplatform approach, LongReadChecker (LoReC), which advances SV comparison and annotation based on distance variance, intersection, gene overlap and the closest SV in the clinical database. Comparison of the performance in detecting SVs from public and our own whole-genome datasets from short-read sequencing (SRS), available long-read sequencing (LRS) platforms and optical genome mapping (OGM) revealed that most SVs detected by SRS were confirmed by LRS, but LRS can identify twice as many SVs (25,000 SVs/genome) with greater read mapping accuracy. Our LoReC analysis further highlights the utility of the T2T-CHM13 reference in SV detection, as 20% more deletions and 20% fewer insertions were detected compared with hg38/GRCh38, which was particularly evident in long-read datasets. Because 80% of the SVs detected by LRS/SRS are smaller than 0.5 kbp, they were not detected by OGM.
Conclusions: Our study revealed that introducing distance variance, intersection, gene overlap and the closest SV in the clinical database may help compare and annotate SVs in diagnostics. Our data showed that LRS together with T2T-CHM13 gapless sequences can improve the diagnostics of patients with human diseases when SRS fails to identify the cause.
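Comparing SV calls across platforms usually comes down to deciding when two calls describe the same event. The sketch below is not LoReC's algorithm. It uses two widely used generic criteria, reciprocal overlap and breakpoint distance, as stand-ins for the intersection and distance measures the abstract mentions. The `SV` class and both thresholds are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SV:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    svtype: str  # e.g. "DEL", "DUP"

def overlap_bp(a, b):
    """Number of reference bases shared by two calls (0 if on different chromosomes)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))

def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions; 1.0 means identical intervals."""
    ov = overlap_bp(a, b)
    len_a, len_b = a.end - a.start, b.end - b.start
    if ov == 0 or len_a <= 0 or len_b <= 0:
        return 0.0
    return min(ov / len_a, ov / len_b)

def breakpoint_distance(a, b):
    """Largest shift between matching breakpoints, or None across chromosomes."""
    if a.chrom != b.chrom:
        return None
    return max(abs(a.start - b.start), abs(a.end - b.end))

def same_event(a, b, min_ro=0.5, max_dist=500):
    """Assumed matching rule: same type, close breakpoints, enough reciprocal overlap."""
    dist = breakpoint_distance(a, b)
    if dist is None or a.svtype != b.svtype:
        return False
    return dist <= max_dist and reciprocal_overlap(a, b) >= min_ro

srs_call = SV("chr1", 1000, 2000, "DEL")
lrs_call = SV("chr1", 1100, 2100, "DEL")
other = SV("chr2", 1000, 2000, "DEL")
```

Insertions occupy almost no reference span, so an overlap-based rule like this one suits deletions and duplications better. For insertions, a distance-only comparison is the more natural choice.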
Title: Multiplatform comparisons and annotation of structural variants highlight the utility of the T2T reference genome in human diagnostics. Journal: GigaScience.
Pub Date: 2026-03-05 DOI: 10.1093/gigascience/giag026
Stefano Porrelli, Alice Fornasiero, Hong Phuong Le, Wenzhe Yin, Maria Navarrete Rodriguez, Nahed Mohammed, Axel Himmelbach, Andrew C Clarke, Nils Stein, Paul J Kersey, Rod A Wing, Rafal M Gutaker
Herbarium collections are a vast but underutilized resource for ancient DNA research, containing over 400 million specimens with detailed metadata and spanning centuries of global biodiversity. Understanding patterns of DNA preservation in natural collections is crucial for optimizing ancient DNA studies and informing future curation practices. We analysed genomic data for 573 herbarium specimens from six plant species from the genera Hordeum and Oryza collected from the Americas and Eurasia over 220 years. Using standardized laboratory protocols and shotgun sequencing, we quantified DNA degradation and elucidated factors that accelerate it. We find significant age-dependent DNA fragmentation rates, indicating temporal degradation processes not detected in prehistoric samples. In our analysis, DNA decay rates in herbarium specimens were almost eight times faster than in moa bones, reflecting fundamental differences in tissue composition and preservation environments. Environmental conditions at the time of specimen collection emerged as the major determinants of post-mortem damage rates, with the interaction term between temperature and genus being the dominant driver of cytosine deamination. We find no effect of sample storage on DNA damage and degradation. These findings provide insights into how climatic origin, preservation environment, taxonomic identity and age influence DNA preservation while highlighting opportunities for improving institutional preservation practices. Due to standardised preservation conditions, museum collections can provide better insights into DNA damage and degradation over time than archaeological and paleontological samples.
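The abstract does not spell out how fragmentation rates were estimated. A commonly used model in ancient DNA work treats fragment lengths as roughly exponentially distributed, so the per-base break probability is approximately λ = 1 / (mean fragment length), and a per-year decay rate can be read from the slope of λ against specimen age. The sketch below assumes that model and uses made-up ages and mean lengths.

```python
def fragmentation_lambda(mean_length):
    """Per-base break probability under an exponential fragment-length model."""
    return 1.0 / mean_length

def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Toy values only: specimen age in years and mean fragment length in bp.
ages = [50, 100, 150, 200]
mean_lengths = [100.0, 83.3, 71.4, 62.5]

lambdas = [fragmentation_lambda(m) for m in mean_lengths]
decay_per_year = least_squares_slope(ages, lambdas)  # breaks per base per year
```

Under this model, the ratio of two such slopes is the kind of quantity behind a statement like "almost eight times faster than in moa bones".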
Title: Patterns of aDNA Damage Through Time and Environments - lessons from herbarium specimens. Journal: GigaScience.
Background: Zygotic genome activation (ZGA) is a pivotal process during early embryogenesis, marking the maternal-to-zygotic transition. ZGA is regulated by a variety of epigenetic and transcriptional factors. However, the transcriptional regulatory mechanisms underlying ZGA in livestock species remain largely unclear.
Results: By integrating ATAC-seq and RNA-seq, we characterized chromatin accessibility and transcriptional dynamics in goat embryos. Transcriptional inhibition with α-amanitin markedly reduced promoter accessibility and disrupted RNA polymerase II (Pol II)-mediated transcription. Motif enrichment analysis identified ZNF331 as a potential regulator with specific upregulation at the 8-cell stage. Functional knockdown of ZNF331 resulted in impaired embryonic development, reduced blastocyst formation, and widespread transcriptome alterations. Mechanistically, ZNF331 depletion caused abnormal elevation of Pol II Ser5 phosphorylation, excessive transcriptional activity, maternal mRNA retention, and excessive activation of zygotic genes.
Conclusion: Our study identifies ZNF331 as a critical regulator of goat ZGA, functioning through fine-tuning Pol II Ser5 phosphorylation to balance maternal transcript clearance and zygotic gene activation. These findings highlight the essential role of the ZNF331-Pol II axis in goat embryogenesis and suggest a potentially conserved mechanism across mammals.
Title: ZNF331 Modulates Early Embryonic Transcription During Zygotic Genome Activation in Goat. Journal: GigaScience.
Pub Date: 2026-03-05 DOI: 10.1093/gigascience/giag025
Yingnan Yang, Jinhao Zhang, Xiaowei Chen, Haonan Chen, Dongxu Li, Yongjie Wan, Mingtian Deng, Feng Wang
Pub Date: 2026-03-03 DOI: 10.1093/gigascience/giag021
Damian Panas, Marcin Tabaka
The integration of high-throughput single-cell profiling technologies with RNA velocity analysis has enabled the reconstruction of dynamic cellular differentiation trajectories at unprecedented resolution. Despite these advances, current visualization techniques for RNA velocity are predominantly confined to two-dimensional representations, typically employing arrows or streamlines. While effective for depicting simple cellular trajectories, these approaches are insufficient for capturing the complex topologies of multipartite cellular transitions. This limitation highlights the need for advanced three-dimensional visualization tools that can more accurately convey the structure and dynamics of velocity-inferred transitions in single-cell data. Here, we present Cell Journey, an interactive visualization platform specifically developed for three-dimensional analysis and representation of RNA velocity trajectories derived from single-cell datasets. The platform features an intuitive graphical interface supporting both unimodal and multimodal data, accommodates multiple input formats, and provides extensive customization capabilities for trajectory visualization. Cell Journey computes RNA velocity vector fields on a user-defined three-dimensional grid and constructs velocity trajectories using either Euler integration or the fourth-order Runge-Kutta method. The platform enables dynamic exploration of cellular dynamics through interactive visual elements, including streamlines, streamlets, cones, and volumetric plots. Furthermore, it allows users to investigate changes in feature activity along selected paths, facilitating deeper insights into cellular state transitions within complex multimodal single-cell datasets.
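The two integration schemes named in the abstract can be compared with a minimal sketch. It does not reproduce Cell Journey's code: the analytic field `velocity` is a hypothetical stand-in for a vector field interpolated from a 3D grid, and the helper names `euler_step`, `rk4_step`, and `trace` are assumptions made for illustration.

```python
import math

def velocity(p):
    """Toy 3D field (assumption): rotation about the z-axis plus a slow upward drift."""
    x, y, z = p
    return (-y, x, 0.1)

def euler_step(f, p, h):
    """One explicit Euler step: p + h * f(p)."""
    v = f(p)
    return tuple(pi + h * vi for pi, vi in zip(p, v))

def rk4_step(f, p, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(p)
    k2 = f(tuple(pi + 0.5 * h * ki for pi, ki in zip(p, k1)))
    k3 = f(tuple(pi + 0.5 * h * ki for pi, ki in zip(p, k2)))
    k4 = f(tuple(pi + h * ki for pi, ki in zip(p, k3)))
    return tuple(
        pi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for pi, a, b, c, d in zip(p, k1, k2, k3, k4)
    )

def trace(step, f, start, h, n_steps):
    """Build a trajectory by repeatedly applying the chosen step function."""
    path = [start]
    for _ in range(n_steps):
        path.append(step(f, path[-1], h))
    return path

def radius(p):
    """Distance from the z-axis; the exact solution of the toy field keeps it constant."""
    return math.hypot(p[0], p[1])

start, h, n_steps = (1.0, 0.0, 0.0), 0.1, 100
euler_path = trace(euler_step, velocity, start, h, n_steps)
rk4_path = trace(rk4_step, velocity, start, h, n_steps)

print("Euler radius after 100 steps:", radius(euler_path[-1]))
print("RK4 radius after 100 steps:  ", radius(rk4_path[-1]))
```

For this toy field the exact trajectory is a helix of radius 1. With the same step size, the Euler path spirals outward while the RK4 path stays close to the helix, which is the usual reason to offer a higher-order method for curved trajectories.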
{"title":"Interactive analysis of single-cell trajectories in 3D space with Cell Journey.","authors":"Damian Panas, Marcin Tabaka","doi":"10.1093/gigascience/giag021","DOIUrl":"https://doi.org/10.1093/gigascience/giag021","url":null,"abstract":"<p><p>The integration of high-throughput single-cell profiling technologies with RNA velocity analysis has enabled the reconstruction of dynamic cellular differentiation trajectories at unprecedented resolution. Despite these advances, current visualization techniques for RNA velocity are predominantly confined to two-dimensional representations, typically employing arrows or streamlines. While effective for depicting simple cellular trajectories, these approaches are insufficient for capturing the complex topologies of multipartite cellular transitions. This limitation highlights the need for advanced three-dimensional visualization tools that can more accurately convey the structure and dynamics of velocity-inferred transitions in single-cell data. Here, we present Cell Journey, an interactive visualization platform specifically developed for three-dimensional analysis and representation of RNA velocity trajectories derived from single-cell datasets. The platform features an intuitive graphical interface supporting both unimodal and multimodal data, accommodates multiple input formats, and provides extensive customization capabilities for trajectory visualization. Cell Journey computes RNA velocity vector fields on a user-defined three-dimensional grid and constructs velocity trajectories using either Euler integration or the fourth-order Runge-Kutta method. The platform enables dynamic exploration of cellular dynamics through interactive visual elements, including streamlines, streamlets, cones, and volumetric plots. 
Furthermore, it allows users to investigate changes in feature activity along selected paths, facilitating deeper insights into cellular state transitions within complex multimodal single-cell datasets.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":" ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2026-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147343982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-03 DOI: 10.1093/gigascience/giag019
Sowmya S Sundaram, Rafael S Gonçalves, Mark A Musen
Scientific metadata often suffer from incompleteness, inconsistency, and formatting errors, which hinder effective discovery and reuse of the associated datasets. We present a method that combines Generative Pre-trained Transformer 4 (GPT-4) with structured metadata templates from the Center for Expanded Data Annotation and Retrieval (CEDAR) knowledge base to automatically standardize metadata and to ensure compliance with established standards. A CEDAR template specifies the expected fields of a metadata submission and their permissible values. Our standardization process involves using CEDAR templates to guide GPT-4 in accurately correcting and refining metadata entries in bulk, resulting in significant improvements in metadata retrieval performance, especially in recall (the proportion of relevant datasets retrieved out of all relevant datasets available). Using the BioSample and Gene Expression Omnibus (GEO) repositories maintained by the National Center for Biotechnology Information (NCBI), we demonstrate that retrieval of datasets whose metadata are altered by GPT-4 when provided with CEDAR templates (GPT-4+CEDAR) is substantially better than retrieval of datasets whose metadata are in their original state and that of datasets whose metadata are altered using GPT-4 with only data-dictionary guidance (GPT-4+DD). The average recall increases dramatically, from 17.65% with baseline raw metadata to 62.87% with GPT-4+CEDAR. Furthermore, we evaluate the robustness of our approach by comparing GPT-4 against other large language models, including LLaMA-3 and MedLLaMA2, demonstrating consistent performance advantages for GPT-4+CEDAR. These results underscore the transformative potential of combining advanced language models with symbolic models of standardized metadata structures for more effective and reliable data retrieval, thus accelerating scientific discoveries and data-driven research.
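The recall metric used in the abstract can be made concrete with a small sketch. The dataset identifiers and query results below are invented for illustration and are not taken from the study; the function names `recall` and `average_recall` are likewise assumptions.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant datasets that appear among the retrieved ones."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant datasets")
    return len(relevant & set(retrieved)) / len(relevant)

def average_recall(results):
    """Mean recall over a list of (retrieved, relevant) pairs, one pair per query."""
    scores = [recall(retrieved, relevant) for retrieved, relevant in results]
    return sum(scores) / len(scores)

# Hypothetical query: four datasets are truly relevant.
relevant = ["DS1", "DS2", "DS3", "DS4"]

# Hypothetical search hits before and after metadata standardization.
hits_raw = ["DS1", "DS9"]
hits_standardized = ["DS1", "DS2", "DS3", "DS9"]

recall_raw = recall(hits_raw, relevant)
recall_standardized = recall(hits_standardized, relevant)
mean_recall = average_recall([(hits_raw, relevant), (hits_standardized, relevant)])

print(recall_raw, recall_standardized, mean_recall)
```

Irrelevant hits such as `DS9` do not lower recall; they would lower precision, which is a separate metric.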
{"title":"Toward total recall: Enhancing data FAIRness through AI-driven metadata standardization.","authors":"Sowmya S Sundaram, Rafael S Gonçalves, Mark A Musen","doi":"10.1093/gigascience/giag019","DOIUrl":"https://doi.org/10.1093/gigascience/giag019","url":null,"abstract":"<p><p>Scientific metadata often suffer from incompleteness, inconsistency, and formatting errors, which hinder effective discovery and reuse of the associated datasets. We present a method that combines Generative Pre-trained Transformer 4 (GPT-4) with structured metadata templates from the Center for Expanded Data Annotation and Retrieval (CEDAR) knowledge base to automatically standardize metadata and to ensure compliance with established standards. A CEDAR template specifies the expected fields of a metadata submission and their permissible values. Our standardization process involves using CEDAR templates to guide the GPT-4 in accurately correcting and refining metadata entries in bulk, resulting in significant improvements in metadata retrieval performance, especially in recall-the proportion of relevant datasets retrieved from the total relevant datasets available. Using the BioSample and Gene Expression Omnibus (GEO) repositories maintained by the National Center for Biotechnology Information (NCBI), we demonstrate that retrieval of datasets whose metadata are altered by GPT-4 when provided with CEDAR templates (GPT-4+CEDAR) is substantially better than retrieval of datasets whose metadata are in their original state and that of datasets whose metadata are altered using GPT-4 with only data-dictionary guidance (GPT-4+DD). The average recall increases dramatically, from 17.65% with baseline raw metadata to 62.87% with GPT-4+CEDAR. Furthermore, we evaluate the robustness of our approach by comparing GPT-4 against other large language models, including LLaMA-3 and MedLLaMA2, demonstrating consistent performance advantages for GPT-4+CEDAR. 
These results underscore the transformative potential of combining advanced language models with symbolic models of standardized metadata structures for more effective and reliable data retrieval, thus accelerating scientific discoveries and data-driven research.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":" ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2026-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147344007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-02 DOI: 10.1093/gigascience/giag023
Yongqing Yang, Jiahui Luo, Yier Cai, Pengcheng Guo, Qiang Guo, Hong Wu, Longqi Liu, Shijie Hao
A characteristic feature of the liver is the presence of numerous polyploid hepatocytes. However, the functional distinctions among diploid, tetraploid, and octoploid hepatocytes remain poorly understood. In this study, we employed the spatially resolved single-cell sequencing technology, Stereo-cell, to dissect the transcriptomic and functional heterogeneity across hepatocyte ploidy subtypes. We detail the development of Stereo-cell Imaging-based ploidy Identification (SCIPI), a technical pipeline that integrates bright-field cell contour recognition, DAPI-based nuclear area and number quantification, and UMI-barcoded single-cell transcriptomics. This approach enables precise identification of six core hepatocyte subtypes: mononucleated diploid (2n×1), mononucleated tetraploid (4n×1), binucleated tetraploid (2n×2), mononucleated octoploid (8n×1), binucleated octoploid (4n×2), and binucleated hexadecaploid (8n×2) hepatocytes. Single-cell transcriptomic analysis based on ploidy annotation revealed that gene expression levels scale positively with increasing ploidy and nuclear number. Metabolic pathway-associated genes were significantly upregulated in polyploid cells, suggesting that cellular polyploidization enhances the metabolic activity of hepatocytes. Furthermore, the SCIPI strategy is broadly applicable to the study of various polyploid tissues, offering a novel and versatile framework for ploidy-resolved research across diverse areas of biological research.
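The six subtype labels follow a simple naming rule: per-nucleus ploidy times the number of nuclei gives the total cellular ploidy. The sketch below only encodes that rule. It takes the per-nucleus ploidy and nuclear count as given inputs, since the abstract does not describe how SCIPI derives them from DAPI nuclear area; the function name `hepatocyte_subtype` is hypothetical.

```python
PLOIDY_NAMES = {2: "diploid", 4: "tetraploid", 8: "octoploid", 16: "hexadecaploid"}
NUCLEATION_NAMES = {1: "mononucleated", 2: "binucleated"}

def hepatocyte_subtype(per_nucleus_ploidy, n_nuclei):
    """Return the subtype label for a cell, e.g. 'binucleated tetraploid (2n×2)'."""
    if per_nucleus_ploidy not in (2, 4, 8):
        raise ValueError("per-nucleus ploidy must be 2, 4, or 8")
    if n_nuclei not in NUCLEATION_NAMES:
        raise ValueError("number of nuclei must be 1 or 2")
    # Total cellular ploidy is the per-nucleus ploidy summed over all nuclei.
    total_ploidy = per_nucleus_ploidy * n_nuclei
    return (
        f"{NUCLEATION_NAMES[n_nuclei]} {PLOIDY_NAMES[total_ploidy]} "
        f"({per_nucleus_ploidy}n×{n_nuclei})"
    )

# Enumerate all six combinations named in the abstract.
subtypes = [hepatocyte_subtype(p, n) for p in (2, 4, 8) for n in (1, 2)]
for label in subtypes:
    print(label)
```

Note that two pairs of subtypes share the same total ploidy (4n×1 and 2n×2; 8n×1 and 4n×2), so the nuclear count is needed to tell them apart.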
{"title":"Stereo-cell deciphers the spatial and functional heterogeneity of polyploid hepatocytes.","authors":"Yongqing Yang, Jiahui Luo, Yier Cai, Pengcheng Guo, Qiang Guo, Hong Wu, Longqi Liu, Shijie Hao","doi":"10.1093/gigascience/giag023","DOIUrl":"https://doi.org/10.1093/gigascience/giag023","url":null,"abstract":"<p><p>A characteristic feature of the liver is the presence of numerous polyploid hepatocytes. However, the functional distinctions among diploid, tetraploid, and octoploid hepatocytes remain poorly understood. In this study, we employed the spatially resolved single-cell sequencing technology, Stereo-cell, to dissect the transcriptomic and functional heterogeneity across hepatocyte ploidy subtypes. We detail the development of Stereo-cell Imaging-based ploidy Identification (SCIPI), a technical pipeline that integrates bright-field cell contour recognition, DAPI-based nuclear area and number quantification, and UMI-barcoded single-cell transcriptomics. This approach enables precise identification of six core hepatocyte subtypes: mononucleated diploid (2n×1), mononucleated tetraploid (4n×1), binucleated tetraploid (2n×2), mononucleated octoploid (8n×1), binucleated octoploid (4n×2), and binucleated hexadecaploid (8n×2) hepatocytes. Single-cell transcriptomic analysis based on ploidy annotation revealed that gene expression levels scale positively with increasing ploidy and nuclear number. Metabolic pathway-associated genes were significantly upregulated in polyploid cells, suggesting that cellular polyploidization enhances the metabolic activity of hepatocytes. 
Furthermore, this SCIPI strategy is broadly applicable to the study of various polyploid tissues, offering a novel and versatile framework for innovative ploidy-resolved research across diverse biological researches.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":" ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2026-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147325628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}