
GigaScience: latest articles

Giant chromosomes of a tiny plant-the complete telomere-to-telomere genome assembly of the simple thalloid liverwort Apopellia endiviifolia (Jungermanniopsida, Marchantiophyta).
IF 11.8 Zone 2 Biology Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2026-01-21 DOI: 10.1093/gigascience/giaf145
Joanna Szablińska-Piernik, Paweł Sulima, Jakub Sawicki

Background: The liverwort Apopellia endiviifolia, a dioicous, simple thalloid species, is notable for its cryptic diversity, habitat adaptability, and genomic innovation, and it represents a clade that is sister to all other Jungermanniopsida. These features make A. endiviifolia an essential model for exploring speciation mechanisms and the evolution of genome structures within liverworts.

Findings: We present the genome assembly of a haploid A. endiviifolia isolate with a total size of 2,914,960,273 bp and an N50 of 468,157,909 bp, demonstrating high completeness (99.2% BUSCO) and a high consensus quality (quality value 47.6). The assembly consisted of 9 chromosomes, which included 18 telomeres and 9 centromeres (ranging from 1.9 to 5 Mbp in length). RNA sequencing-based annotation identified 34,615 genes, predominantly protein coding. The transposable elements comprised 12.16% long terminal repeat elements and 57 Helitrons. Among the retroelements, the Copia and Gypsy superfamilies comprised 8.94% and 2.95% of the genome, respectively. The Ty3/Gypsy superfamily was significantly enriched in centromeric regions. The average GC content ranged from 38.8% to 39.6%, with gene density varying between 5.52 and 9.78 genes per 500 kbp. Synteny analysis of related liverwort species has revealed complex chromosomal relationships, indicating extensive genome rearrangements among species.

Conclusions: This study provides the first high-quality reference genome assembly of the haploid liverwort A. endiviifolia. Assembly and annotation offer valuable resources for investigating liverwort evolution, centromere biology, and genome expansion in simple thalloid liverworts.
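The N50 quoted in the Findings is a standard contiguity statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The sketch below shows one common way to compute it. The function name `n50` and the toy lengths are illustrative assumptions; the individual chromosome lengths of the A. endiviifolia assembly are not given in the abstract and are not reproduced here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length at which the running total first reaches half of the
    combined length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy lengths (not the real chromosome sizes).
toy_lengths = [500, 400, 300, 200, 100]
print(n50(toy_lengths))  # prints 400
```

With the toy input, the combined length is 1,500, and the two longest sequences (500 + 400) are the first to cover at least half of it, so the N50 is 400.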

Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12885004/pdf/
Citations: 0
nf-core/proteinfamilies: A scalable pipeline for the generation of protein families.
IF 11.8 Zone 2 Biology Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2026-01-21 DOI: 10.1093/gigascience/giag009
Evangelos Karatzas, Martin Beracochea, Fotis A Baltoumas, Eleni Aplakidou, Lorna Richardson, James A Fellows Yates, Daniel Lundin

The growth of metagenomics-derived amino acid sequence data has transformed our understanding of protein function, microbial diversity, and evolutionary relationships. However, the vast majority of these proteins remain functionally uncharacterized. Grouping the millions of such uncharacterised sequences with the few experimentally characterised ones allows the transfer of annotations, while the inspection of conserved residues with multiple sequence alignments can provide clues to function, even in the absence of existing functional information. To address the challenges associated with this data surge and the need to group sequences, we present a scalable, open-source, parametrizable Nextflow pipeline (nf-core/proteinfamilies) that generates nascent protein families or assigns new proteins to existing families. The computational benchmarks demonstrated that resource usage scales approximately linearly with input size, and the biological benchmarks showed that the generated protein families closely resemble manually curated families in widely used databases.

Citations: 0
pyRootHair: Machine learning accelerated software for high-throughput phenotyping of plant root hair traits.
IF 11.8 Zone 2 Biology Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2026-01-21 DOI: 10.1093/gigascience/giaf141
Ian Tsang, Lawrence Percival-Alwyn, Stephen Rawsthorne, James Cockram, Fiona Leigh, Jonathan A Atkinson

Background: Root hairs play a key role in plant nutrient and water uptake. Historically, root hair traits have largely been quantified manually. As such, this process has been laborious and low-throughput. However, given their importance for plant health and development, high-throughput quantification of root hair morphology could help underpin rapid advances in the genetic understanding of these traits. With recent increases in the accessibility and availability of artificial intelligence (AI) and machine learning techniques, the development of tools to automate plant phenotyping processes has been greatly accelerated.

Results: We present pyRootHair, a high-throughput, AI-powered software application to automate root hair trait extraction from microscope images of plant roots grown on agar plates. pyRootHair is capable of batch processing over 600 images per hour without manual input from the end user. In this study, we deploy pyRootHair on a panel of 24 diverse wheat (Triticum aestivum and Triticum turgidum ssp. durum) cultivars and uncover a large, previously unresolved amount of variation in many root hair traits. We show that the overall root hair profile falls under 2 distinct shape categories and that different root hair traits often correlate with each other. We also demonstrate that pyRootHair can be deployed on a range of plant species, including oat (Avena sativa), rice (Oryza sativa), teff (Eragrostis tef), and tomato (Solanum lycopersicum).

Conclusions: The application of pyRootHair enables users to rapidly screen a large number of plant germplasm resources for variation in root hair morphology, supporting high-resolution measurements and high-throughput data analysis. This facilitates downstream investigation of the impacts of root hair genetic control and morphological variation on plant performance. pyRootHair is installable via PyPI (https://pypi.org/project/pyRootHair/) and can be accessed on GitHub at https://github.com/iantsang779/pyRootHair.

Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12824728/pdf/
Citations: 0
Harnessing artificial intelligence for genomic variant prediction: advances, challenges, and future directions.
IF 11.8 Zone 2 Biology Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2026-01-21 DOI: 10.1093/gigascience/giag004
Indah Pakpahan, Mentari Sihombing, Haohan Liu, Mengyao Wang, Zheng Su, Mingyan Fang

Accurate genetic variant interpretation is crucial for disease research and the development of targeted therapies. Artificial intelligence is transforming this field by integrating computational methodologies across structural biology, evolutionary analysis, and multimodal genomic data. This review examines the evolution from traditional rule-based systems and statistical models to contemporary machine learning, deep learning, and protein language models, while addressing critical challenges in variant classification. Key obstacles include data heterogeneity, interpretability, and the persistence of variants of uncertain significance, emphasizing the critical need for explainable artificial intelligence frameworks and more inclusive genomic databases to improve predictive accuracy across diverse populations. Based on the assessment of current variant impact predictors, we propose strategies for enhanced predictor selection, effective multi-omics data integration, and optimized computational workflows. These recommendations aim to enhance variant interpretation accuracy in both research settings and clinical practice, ultimately contributing to advances in personalized medicine.

Citations: 0
Charting immune variation through genetics and single-cell genomics.
IF 11.8 Zone 2 Biology Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2026-01-21 DOI: 10.1093/gigascience/giaf161
Joseph E Powell

Large-scale single-cell genomics projects have revolutionized our understanding of human immune variation. Yet most studies to date have been Eurocentric, limited in cell-type resolution, or restricted to a single data modality. The newly published Chinese Immune Multi-Omics Atlas helps address these gaps by profiling 428 healthy Chinese adults using a multiomics single-cell approach that combines single-cell RNA sequencing and single-cell chromatin accessibility sequencing across over 10 million immune cells. This integrated strategy enabled the identification of 73 distinct immune cell subsets and the construction of cell-type-specific gene regulatory networks linking noncoding enhancers to target genes. The atlas delineated hundreds of enhancer modules (eRegulons), highlighting both established and novel regulators of immune cell identity. By aligning transcriptomic and epigenomic maps, Yin et al. show how expanding both the ancestral diversity and data modalities of immune cell genomics can reveal new biology and provide a valuable addition to global reference cell atlases.

Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12821369/pdf/
Citations: 0
RNA-SeqEZPZ: a point-and-click pipeline for comprehensive transcriptomics analysis with interactive visualizations.
IF 11.8 Zone 2 Biology Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2026-01-21 DOI: 10.1093/gigascience/giaf133
Cenny Taslim, Yuan Zhang, Galen Rask, Genevieve C Kendall, Emily R Theisen

Background: RNA sequencing (RNA-seq) analysis has become a routine task in numerous genomic research labs, driven by the reduced cost of bulk RNA sequencing experiments. These studies generate billions of reads that require easy-to-run, comprehensive, and reproducible analysis. However, many labs rely on in-house scripts, which can be challenging for bench scientists to use and hinder standardization and reproducibility. While existing RNA-seq pipelines attempt to address these challenges, they often lack a complete end-to-end user interface.

Findings: To bridge this gap, we developed RNA-SeqEZPZ, an automated pipeline with a user-friendly point-and-click interface, enabling rigorous and reproducible RNA-seq analysis without requiring programming or bioinformatics expertise. For advanced users, the pipeline can also be executed from the command line, allowing customization of steps to suit specific applications. The innovation of this pipeline lies in the combination of 3 key features: (i) all software is packaged within a Singularity container, eliminating installation issues; (ii) it offers a graphical, point-and-click interface from raw FASTQ files through differential expression and pathway analysis; and (iii) it includes a Nextflow implementation, enabling scalability and portability for seamless execution across various platforms, including job submission in the cloud and cluster computing. Additionally, RNA-SeqEZPZ generates a comprehensive statistical report and offers an option for batch adjustment to minimize effects of noise due to technical variation across replicates. Reports can also be reviewed by a bioinformatician to ensure the overall quality of the analysis.

Conclusions: RNA-SeqEZPZ is a robust, accessible, and scalable solution for comprehensive RNA-seq analysis, enabling researchers to focus on biological insights rather than computational challenges.

Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12857227/pdf/
Citations: 0
MBGC2: Boosting compression via efficient encoding of approximate matches in genome collections.
IF 11.8 Zone 2 Biology Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2026-01-21 DOI: 10.1093/gigascience/giag008
Tomasz M Kowalski

Background: FASTA is the primary format for representing DNA, RNA and protein sequences. While progress has been made in specialized FASTA collection compressors, they still struggle with practical limitations and inconsistent performance across different datasets, hindering effective storage and transfer of large genomic datasets.

Results: We present an enhanced version of the Multiple Bacteria Genome Compressor (MBGC), a high-throughput, in-memory algorithm for compressing genome collections. It relies on information about maximum exact matches in the compressed set to identify possibly long approximate matches. It encodes them even when they partially overlap, boosting the compression ratio by an average of 14% across bacterial datasets, while the reengineered multi-threaded decoding speeds up decompression compared to its predecessor by around 40%. The compression ratio improvement is even more pronounced on other collections, for H. sapiens reaching 18%, and up to 55% for S. paradoxus. MBGC2 performs consistently across diverse datasets and introduces practical features to ease data management such as archive appending, repacking, fast content listing and flexible decompression options. Benchmark tests covering nucleotide-based bacterial, viral, and human genome collections show that MBGC2 combines compression efficiency and processing speed. The tool supports working with single genomes or amino acid collections, but does not guarantee such high performance in these cases.

Conclusions: MBGC2 addresses critical limitations in genome collection compression by delivering reliable performance, improved compression ratios, and enhanced usability features. The consistent efficiency across diverse genomic datasets makes it a versatile tool for managing the growing volume of genomic data in research and clinical settings. The balance between compression ratio and speed positions MBGC2 as a practical solution for the storage and transfer of large genomic collections.

Citations: 0
Improved reference assembly and core collection resequencing to facilitate exploration of important agronomical traits for the improvement of oilseed crop, Carthamus tinctorius L.
IF 11.8 Zone 2 (Biology) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2026-01-21 DOI: 10.1093/gigascience/giaf151
Megha Sharma, Varun Bhardwaj, Praveen Kumar Oraon, Shivani Choudhary, Heena Ambreen, Rohit Nandan Shukla, Harsha Rayudu Jamedar, Ajitha Vijjeswarapu, Vandana Jaiswal, Palchamy Kadirvel, Arun Jagannath, Shailendra Goel

Background: Safflower (Carthamus tinctorius L.) is a drought-resilient oilseed crop. Besides producing edible oil rich in oleic and linoleic acids, it is also used in biofuels, cosmetics, coloring dyes, pharmaceuticals, and nutraceuticals. Despite its significant economic uses, the availability of genetic and genomic resources in safflower is limited.

Results: We report an improved de novo genome assembly of safflower (Safflower_A2). A chromosome-level assembly of 1.15 Gb with telomeres and centromeric repeats was constructed using PacBio HiFi reads, optical maps, Illumina short reads, and Hi-C sequencing. Safflower_A2 shows better contiguity, completeness, and high-quality annotation than previous assemblies. The assembly was further validated with the help of a single-nucleotide polymorphism (SNP)-based linkage map. A genome-wide survey identified genes for comprehensive exploration of disease resistance in the safflower. Employing the de novo genome assembly as a reference, we used resequencing data of a global core collection of 123 accessions to carry out an SNP-based genome-wide association study, which identified significant associations for several traits and their haplotypes of agronomic value, including seed oil content. Resequencing data were also applied for a pan-genome analysis, which provided critical insights into genome diversity, identifying an additional ~11,000 genes and their functional enrichment that will be useful for region-specific breeding lines.

Conclusion: Our study provides insights into the genomic architecture of safflower by leveraging an improved genome assembly and annotation. Additionally, the resources developed in this study, including a high-density linkage map, marker-trait associations, and a pan-genome, are valuable for breeding and crop improvement programs by the global research community.

Citations: 0
Beyond Volume and Toward Coherence: A research parasite's perspective.
IF 11.8 Zone 2 (Biology) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2026-01-20 DOI: 10.1093/gigascience/giag001
Gina Turco

The Pacific Symposium on Biocomputing (PSB) recognized my work with the 2024 Junior Research Parasite Award, an honor established to highlight the scientific value of reanalyzing, integrating, and reinterpreting existing datasets. The award invites recipients to reflect on the role of research parasites within the broader ecosystem of computational biology and data reuse. For me, this perspective is rooted in years of working across diverse -omics datasets, where I've seen firsthand how the structure, resolution, and context of a dataset shape the biological insight it can support. Rather than focusing on data volume alone, meaningful discovery often emerges from understanding what each dataset can, and cannot, reveal. Here, I outline how different modes of secondary analysis, from integrating complementary datasets to deeply mining a single omics layer.

Citations: 0
SpaceBF: Spatial coexpression analysis using Bayesian Fused approaches in spatial omics datasets.
IF 11.8 Zone 2 (Biology) Q1 MULTIDISCIPLINARY SCIENCES Pub Date: 2026-01-20 DOI: 10.1093/gigascience/giag006
Souvik Seal, Brian Neelon

Advances in spatial omics enable measurement of genes (spatial transcriptomics) and peptides, lipids, or N-glycans (mass spectrometry imaging) across thousands of locations within a tissue. While detecting spatially variable molecules is a well-studied problem, robust methods for identifying spatially varying co-expression between molecule pairs remain limited. We introduce SpaceBF, a Bayesian fused modeling framework that estimates co-expression at both local (location-specific) and global (tissue-wide) levels. SpaceBF enforces spatial smoothness via a fused horseshoe prior on the edges of a predefined spatial adjacency graph, allowing large, edge-specific differences to escape shrinkage while preserving overall structure. In extensive simulations, SpaceBF achieves higher specificity and power than commonly used methods that leverage geospatial metrics, including bivariate Moran's I and Lee's L. We also benchmark the proposed prior against standard alternatives, such as intrinsic conditional autoregressive (ICAR) and Matérn priors. Applied to spatial transcriptomics and proteomics datasets, SpaceBF reveals cancer-relevant molecular interactions and patterns of cell-cell communication (e.g., ligand-receptor signaling), demonstrating its utility for principled, uncertainty-aware co-expression analysis of spatial omics data.
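
For readers unfamiliar with the geospatial baselines named above, the following is a minimal sketch of a global bivariate Moran's I, one of the comparison metrics mentioned in the abstract. It is not SpaceBF and not the authors' code. The function name `bivariate_morans_i`, the binary adjacency matrix, and the four-location chain graph are hypothetical choices made for illustration, and the normalization shown (z-scored values divided by the sum of weights) is only one common convention.

```python
import numpy as np


def bivariate_morans_i(x, y, w):
    """Global bivariate Moran's I for two molecules measured at n locations.

    x, y : length-n expression vectors (must not be constant).
    w    : n-by-n spatial weight matrix, w[i, j] > 0 if i and j are neighbors.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    # Standardize each variable using the population standard deviation.
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    # Weighted cross-product of x at each location with y at its neighbors,
    # normalized by the total weight.
    return float(zx @ w @ zy / w.sum())


# Hypothetical toy example: four locations arranged in a chain 0-1-2-3.
adjacency = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
)
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [4.0, 3.0, 2.0, 1.0]
print(bivariate_morans_i(gene_a, gene_b, adjacency))
```

A single global statistic like this summarizes co-expression across the whole tissue; the abstract's point is that SpaceBF additionally estimates location-specific co-expression, which this sketch does not attempt.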

Citations: 0