Genomics, proteomics & bioinformatics: Latest Articles

Single-cell Spatial Multiomics: Technologies, Methods, and Biological Applications.

IF 7.9

Genomics, proteomics & bioinformatics

Pub Date : 2026-02-09 DOI: 10.1093/gpbjnl/qzag012

Dong Xing, Rong Fan, Fangqing Zhao

Citations: 0

Forensic Transcriptomics: Research Progress of the Past Two Decades.

IF 7.9

Genomics, proteomics & bioinformatics

Pub Date : 2026-02-04 DOI: 10.1093/gpbjnl/qzag007

Fanzhang Lei, Xi Yuan, Qiong Lan, Ruonan Shen, Yiman Wu, Xin Shi, Bofeng Zhu, Bin Cong

Over the last two decades, advancements in sequencing technology and data science have significantly deepened the study of transcriptomics, especially non-coding transcriptomics, leading to substantial developments in forensic applications. During the 2000s, forensic transcriptomics analysis technology evolved from targeted messenger ribonucleic acid (mRNA) typing to massive parallel sequencing and deoxyribonucleic acid (DNA) microarray. This progression facilitated the source tracing and degradation dynamics of biomaterials from crime scenes, as well as transcriptomic changes associated with cadavers, injuries and toxicology, thereby providing additional clues for solving forensic cases. In the next decade, the development of high-throughput sequencing technology further expanded the research frontiers of forensic transcriptomics from mRNA to non-coding RNAs (ncRNAs). These molecules have been demonstrated to exhibit unique functions in expression regulation and epigenetic modifications, showing great potential in forensic practices such as forensic polymorphism studies, tissue and body fluid tracing, forensic RNA molecular clock, death & wound analyses, as well as forensic toxicology. Modern transcriptomics combined with deep learning and multimodal analysis through multidisciplinary integration can potentially characterize the dynamic spatiotemporal panoramic features of forensic biological samples. However, these technologies will face bottlenecks such as standardization, sample collection and processing, ethics, and evidence interpretation in forensic practice. Breaking through these obstacles will be the core task of forensic transcriptomics in the next ten years. This integrative review, building on bibliometric analysis, details the new paradigms and latest advances in forensic transcriptomics across multiple forensic fields, demonstrating its wide-ranging prospects in practical applications.

Citations: 0

Approaches to Studying Virus Pangenome Variation Graphs.

IF 7.9

Genomics, proteomics & bioinformatics

Pub Date : 2026-02-03 DOI: 10.1093/gpbjnl/qzag003

Tim Downing

Pangenome variation graphs (PVGs) allow for the representation of genetic diversity in a more nuanced way than traditional reference-based approaches. Here I focus on how PVGs are a powerful tool for studying genetic variation in viruses, offering insights into the complexities of viral quasispecies, mutation rates, and population dynamics. PVGs originated in human genomics and hold great promise for viral genomics. Previous work has been constrained by small sample sizes and gene-centric methods, whereas PVGs enable a more comprehensive approach to studying viral diversity. Large viral genome collections should be used to make PVGs, which offer significant advantages. Here, I outline accessible tools to achieve their construction. This spans PVG construction, PVG file formats, PVG manipulation and analysis, PVG visualisation, measuring PVG openness, and mapping reads to PVGs. Additionally, the development of PVG-specific formats for mutation representation and personalised PVGs that reflect specific research questions will further enhance PVG applications. Challenges remain, particularly in managing nested variants, optimising error detection, optimising k-mer/minimizer-based approaches for AT-rich genomes, incorporating long read sequencing data, and scalable visualisation approaches. Nevertheless, PVGs offer a new opportunity for viral population genomics, and a testing ground for tool development prior to application to larger eukaryotic genomes. These advances will enable more accurate and comprehensive detection of viral mutations, contributing to a deeper understanding of viral evolution and genotype-phenotype associations.
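
One of the analysis steps listed above, measuring PVG openness, can be illustrated with a toy calculation. The sketch below is not taken from the review: it assumes each genome is available as an ordered list of node (segment) identifiers, as in the path lines of a GFA file, and the isolate names and node IDs are invented. It tracks how many previously unseen nodes each added genome contributes and which nodes are shared by all genomes.

```python
def pangenome_growth(paths):
    """Return (genome, new_nodes, total_nodes) for each genome in input order.

    `paths` is a list of (genome_name, [node_id, ...]) pairs (hypothetical layout).
    """
    seen = set()
    growth = []
    for name, nodes in paths:
        new = set(nodes) - seen      # nodes not observed in earlier genomes
        seen |= new
        growth.append((name, len(new), len(seen)))
    return growth


def core_nodes(paths):
    """Return the set of nodes traversed by every genome."""
    node_sets = [set(nodes) for _, nodes in paths]
    return set.intersection(*node_sets) if node_sets else set()


# Invented example: three isolates walking through five graph nodes.
paths = [
    ("isolate1", ["s1", "s2", "s4"]),
    ("isolate2", ["s1", "s3", "s4"]),
    ("isolate3", ["s1", "s2", "s4", "s5"]),
]
growth = pangenome_growth(paths)
core = core_nodes(paths)
for name, new, total in growth:
    print(f"{name}: {new} new nodes, {total} total")
print("core nodes:", sorted(core))
```

A pangenome whose count of new nodes keeps growing as genomes are added is usually called open. Published analyses typically average over many genome orderings and fit a growth curve, which this sketch omits.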

Citations: 0

The Gap-free Petunia Genome Assemblies Reveal the Evolutionary Dynamics of the S-locus Supergene.

IF 7.9

Genomics, proteomics & bioinformatics

Pub Date : 2026-02-03 DOI: 10.1093/gpbjnl/qzag011

Chen Wang, Hong Zhao, He Wu, Sijie Sun, Hongkui Zhang, Yongbiao Xue

Petunia hybrida is a key genetic model for investigating self-incompatibility (SI), a reproductive barrier governed by the multi-allelic S-locus, which encodes a pistil-specific S-RNase and multiple S-locus F-box (SLF) genes. Due to high heterozygosity and abundant repetitive sequences, previous S-locus assemblies in reference genomes have been fragmented and collapsed. Here, we present the telomere-to-telomere (T2T), haplotype-resolved genomes of two homozygous SI lines (P. hybrida S3LS3L and SVSV), enabling the complete reconstruction of both S-loci. Population genomic analyses delineated their boundaries, spanning approximately 14.01 Mb and 20.83 Mb, respectively. Remarkably, both S-loci exhibited extremely low nucleotide polymorphism and structural variation compared with the remainder of the genome. In addition to the S-RNase and the complete repertoire of SLF genes, we identified two pollen-specific genes, ubiquitin-like and MYB, which may contribute to SI regulation. Our results demonstrate that the genomic architecture of the Petunia S-locus continues to evolve dynamically while retaining the core genetic components essential for SI. Furthermore, we propose six evolutionary scenarios, providing new insights into the processes driving the generation, diversification, loss, functional maintenance, and structural reorganization of SLF genes in Petunia. Overall, the T2T genomes reported here establish P. hybrida as a premier model for comparative genomics and SI research in the Solanaceae family.

Citations: 0

SAFAARI: Contrastive Adversarial Open-set Domain Adaptation for Single-cell Integration & Annotation.

IF 7.9

Genomics, proteomics & bioinformatics

Pub Date : 2026-01-30 DOI: 10.1093/gpbjnl/qzag008

Fatemeh Aminzadeh, Jun Wu, Jingrui He, Morteza Saberi, Fatemeh Vafaee

Single-cell sequencing technologies have enabled in-depth analysis of cellular heterogeneity across tissues and disease contexts. However, as datasets increase in size and complexity, characterizing diverse cellular populations, integrating data across multiple modalities, and correcting batch effects remain challenges. We present SAFAARI (Single-cell Annotation and Fusion with Adversarial Open-set Domain Adaptation Reliable for Data Integration), a unified deep learning framework designed for cell annotation, batch correction, and multi-omics integration. SAFAARI leverages supervised contrastive learning and adversarial domain adaptation to achieve domain-invariant embeddings and enables label transfer across datasets, addressing challenges posed by batch effects, biological domain shifts, and multi-omics modalities. SAFAARI identifies novel cell types and mitigates class imbalance to enhance the detection of rare cell types. Through comprehensive benchmarking, we evaluated SAFAARI against existing annotation and integration methods across real-world datasets exhibiting batch effects and domain shifts, as well as simulated and multi-omics data. SAFAARI demonstrated scalability and robust performance in cell annotation via label transfer across heterogeneous datasets, detection of unknown cell types, correction of batch effects, and cross-omics data integration while leveraging available annotations for improved integration. SAFAARI's innovative approach outperformed competing methods in both qualitative and quantitative metrics, offering a flexible, accurate, and scalable solution for single-cell analysis with broad applicability to diverse biological and clinical research questions. An open-source implementation of the SAFAARI algorithm is available at https://github.com/VafaeeLab/SAFAARI.
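
The open-set idea in this abstract, transferring labels from an annotated reference while flagging cells that match no known type, can be illustrated without any deep learning. The following is a deliberately simplified stand-in, not SAFAARI's implementation: it assumes cells have already been embedded in a shared space, assigns each query cell to the nearest reference centroid, and labels it "unknown" when the distance exceeds a threshold. The function names, the toy 2-D embeddings, and the `max_dist` value are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroids(embeddings, labels):
    """Mean embedding per reference label."""
    sums, counts = {}, {}
    for vec, lab in zip(embeddings, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [x / counts[lab] for x in acc] for lab, acc in sums.items()}


def transfer_labels(cents, queries, max_dist):
    """Nearest-centroid label transfer with an open-set rejection threshold."""
    predictions = []
    for vec in queries:
        label, dist = min(
            ((lab, euclidean(vec, c)) for lab, c in cents.items()),
            key=lambda pair: pair[1],
        )
        predictions.append(label if dist <= max_dist else "unknown")
    return predictions


# Invented reference embeddings for two annotated cell types.
ref = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.0, 5.2)]
ref_labels = ["T cell", "T cell", "B cell", "B cell"]
cents = centroids(ref, ref_labels)

# The third query cell lies far from both centroids and is rejected.
pred = transfer_labels(cents, [(0.1, 0.1), (4.9, 5.0), (10.0, -3.0)], max_dist=1.0)
print(pred)
```

The threshold controls the trade-off between mislabeling novel cells and rejecting known ones; a learned approach would replace both the fixed embedding and the fixed cut-off.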

Citations: 0

The Evolution of Spatial Omics Technologies Introduces A Novel Avenue for Lung Cancer Research.

IF 7.9

Genomics, proteomics & bioinformatics

Pub Date : 2026-01-30 DOI: 10.1093/gpbjnl/qzag010

Yue He, Zifan Li, Wenxiang Wang, Xu Liu, Shanshan Lu, Jing Bai, Lin Weng, Qingna Zhang, Jun Wang, Kezhong Chen

Lung cancer is a highly malignant disease, posing a significant threat to global health. The presence of tumor heterogeneity results in substantial variations in prognosis and therapeutic responses among patients. Advances in bulk RNA sequencing and single-cell RNA sequencing have facilitated the identification of driver gene mutations and the exploration of cellular diversity within tumors. However, tumors are complex ecosystems comprising both tumor cells and their microenvironment, where interactions among different cell types give rise to specific functional structural units that collectively drive tumorigenesis and progression. The emergence of spatial omics technologies has allowed for the analysis of tumor ecosystems, providing unprecedented insights into tumor heterogeneity. This review aims to present updates on spatial omics technologies and data analysis algorithms, discuss current technical limitations, and explore potential future developments. Furthermore, we summarize the latest applications of spatial omics in elucidating lung cancer heterogeneity, investigating mechanisms of lung cancer progression and drug resistance, and identifying novel biomarkers. Based on these findings, we propose strategies for integrating spatial omics into lung cancer research, offering new perspectives for precision medicine.

Citations: 0

Amplification Optimized and Unique Molecular Identifier Guided High Accuracy Full-length CircRNA Sequencing.

IF 7.9

Genomics, proteomics & bioinformatics

Pub Date : 2026-01-28 DOI: 10.1093/gpbjnl/qzag009

Yueqi Jin, Xueyan Hu, Yun Zhang, Jianqi She, Changyu Tao, Ence Yang

As an emerging important regulatory noncoding RNA, circular RNAs (circRNAs) present significant spatiotemporal expression patterns in a variety of physiological processes and diseases. Thus, accurate identification and quantification of circRNA is crucial to understanding its functions and clinical significance. However, obvious inconsistencies exist between mainstream high-throughput circRNA identification workflows based on next-generation sequencing and third-generation sequencing technologies, likely due to uncertainties inherent to each workflow. In the current study, we first confirmed that sequencing error introduced in the library preparation is a considerable contributor to the observed inconsistencies. To assess this challenge, we established a UMI-based full-length circRNA sequencing method, ucircFL-seq. By employing UMI and optimizing signal amplification procedures, ucircFL-seq achieved a substantial improvement in the accuracy of both circRNA detection and quantification, leading to stronger cross-platform concordance. Furthermore, our study revealed that the two platforms identify distinct pools of circRNAs, which exhibited differences in length and secondary structure, suggesting the complementary nature of the two platforms in circRNA identification. Overall, our study presents a UMI-guided workflow, ucircFL-seq, which enhances full-length circRNA identification and quantification accuracy, facilitating further functional exploration of circRNAs.
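
The role of UMIs described above can be sketched in a few lines. This is an illustrative toy, not the ucircFL-seq pipeline: it assumes reads have already been assigned to a circRNA and carry an error-free UMI, and the tuple layout, function name, and sequences are made up for the example. Reads sharing a circRNA and UMI are treated as copies of one original molecule, so each group counts once and a per-position majority vote gives a consensus sequence.

```python
from collections import Counter, defaultdict


def umi_consensus_counts(reads):
    """Collapse reads by (circRNA, UMI) and build a consensus per group.

    `reads` is an iterable of (circrna_id, umi, sequence) tuples (hypothetical layout).
    Returns (molecule counts per circRNA, consensus sequence per group).
    """
    groups = defaultdict(list)
    for circ_id, umi, seq in reads:
        groups[(circ_id, umi)].append(seq)

    counts = Counter()
    consensus = {}
    for (circ_id, umi), seqs in groups.items():
        counts[circ_id] += 1  # one UMI group = one original molecule
        # Vote only among reads of the most common length (no alignment here).
        length = Counter(len(s) for s in seqs).most_common(1)[0][0]
        same_len = [s for s in seqs if len(s) == length]
        consensus[(circ_id, umi)] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*same_len)
        )
    return counts, consensus


# Invented reads: the third read carries an amplification or sequencing error.
reads = [
    ("circA", "AACG", "ACGTAC"),
    ("circA", "AACG", "ACGTAC"),
    ("circA", "AACG", "ACCTAC"),
    ("circA", "TTGC", "ACGTAC"),
    ("circB", "AACG", "GGGTTT"),
]
counts, consensus = umi_consensus_counts(reads)
print(dict(counts))
print(consensus[("circA", "AACG")])
```

Real pipelines also have to tolerate errors within the UMI itself and align reads before voting, both of which are skipped here.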

Citations: 0

FAST: Scalable Factor Analysis for Spatial Dimension Reduction of Multi-section Spatial Transcriptomics.

IF 7.9

Genomics, proteomics & bioinformatics

Pub Date : 2026-01-24 DOI: 10.1093/gpbjnl/qzag006

Wei Liu, Xiao Zhang, Xiaoran Chai, Zhenqian Fan, Huazhen Lin, Jinmiao Chen, Lei Sun, Tianwei Yu, Joe Yeong, Jin Liu

Biological techniques for spatially resolved transcriptomics (SRT) have advanced rapidly in both throughput and spatial resolution. This progress calls for efficient and scalable spatial dimension reduction methods capable of handling large-scale SRT data from multiple tissue sections. Here, we developed FAST, a fast and efficient generalized probabilistic factor analysis model for spatially aware dimension reduction. FAST simultaneously accounts for the count-based nature of SRT data and extracts low-dimensional representations across multiple sections, while preserving biological signals and incorporating spatial smoothness among neighboring locations. Unlike existing methods, FAST explicitly models count data across sections and leverages local spatial dependencies with scalable computational complexity. Using both simulated and real datasets, we demonstrated that embeddings estimated by FAST show improved correlation with annotated cell and domain types. Notably, FAST was the only method capable of analyzing a mouse embryo Stereo-seq dataset with > 2.3 million spatial locations in just 2 hours. FAST also identified differential activity of immune-related transcription factors between tumor and non-tumor clusters and predicted the carcinogenesis factor CCNH as an upstream regulator of differentially expressed genes in a breast cancer Xenium dataset. FAST is available for non-commercial use at https://github.com/feiyoung/ProFAST.

Citations: 0

Ψ-Atlas: An Integrated Atlas for Pseudouridine Epitranscriptome.

IF 7.9

Genomics, proteomics & bioinformatics

Pub Date : 2026-01-23 DOI: 10.1093/gpbjnl/qzag004

Xiaochen Wang, Jinjing Luo, Xiaoqiang Lang, Yongqing Ling, Yiming Zhou, Guoxian Liu, Xiangye Chen, Yibo Chen, Yingshun Zhou, Yi Cao, Zhonghui Zhang, Changjun Ding, Demeng Chen, Qi Liu

Pseudouridine (Ψ) is a C5-glycoside isomer of uridine, formed by breaking the N1 glycosidic bond and rotating the base by 180°. It is one of the most widespread post-transcriptional RNA modifications and occurs across diverse RNA species. The modification enhances RNA structural integrity, facilitates additional hydrogen bonding, and confers distinct structural and functional properties on the RNA molecules that carry it. However, no convenient, integrated database with intuitive visualization yet covers all currently reported species and RNA types. Here, we present Ψ-Atlas, an extensive, curated database for the comprehensive collection and annotation of RNA pseudouridine sites. The database contains 554,895 Ψ modification sites across RNA categories including mRNA, ncRNA, tRNA, rRNA, and others, in 78 species reported in the current literature. The underlying sequencing methods include next-generation sequencing techniques such as Ψ-Seq, Pseudo-Seq, CeU-Seq, PSI-Seq, RBS-Seq, HydraPsiSeq, BID-Seq, and PRAISE-Seq, as well as third-generation methods such as direct RNA sequencing. Ψ-Atlas is the most comprehensive and integrated resource for RNA pseudouridine modifications to date. The database offers an intuitive interface for information display and a range of analytical tools, including PsiVar and PsiFinder. The platform serves as a robust search and visualization tool for the study of pseudouridylation in epitranscriptomics. Ψ-Atlas is available at https://rnainformatics.org.cn/PsiAtlas.

Citations: 0

ClusterGVis: An Advanced Visualization and Clustering Tool for Gene Expression Analysis.

IF 7.9

Genomics, proteomics & bioinformatics

Pub Date : 2026-01-22 DOI: 10.1093/gpbjnl/qzag005

Jun Zhang, Hongyuan Li, Wenjun Tao, Jun Zhou

Single-cell RNA sequencing and bulk RNA sequencing data provide valuable insights into both physiological and pathological processes. The effective interpretation of these data relies on the availability of sophisticated analysis and visualization tools. Here, we introduce ClusterGVis, an advanced bioinformatics software package designed to simplify the analysis and visualization of gene expression data. ClusterGVis provides a user-friendly interface that allows researchers to perform fuzzy c-means and k-means clustering on transcriptomic datasets, helping them uncover patterns and relationships within complex gene expression profiles. The integrated heatmap visualization features support intuitive exploration of co-expression networks and the identification of differentially expressed genes across diverse experimental conditions. ClusterGVis serves a dual purpose by aiding in the identification of potential biomarkers and enriching the understanding of gene function and regulatory mechanisms. The tutorials, manual, source code, and demo data of ClusterGVis are publicly available at https://github.com/junjunlab/ClusterGVis and https://bioconductor.org/packages/ClusterGVis. The ClusterGVis Shiny App has been deployed on shinyapps.io and is accessible at https://laojunjun.shinyapps.io/clustergvis_app_v0/. The Shiny App source code is hosted on GitHub at https://github.com/junjunlab/ClusterGvis-app.
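
As a rough illustration of the k-means step named in the abstract, the sketch below clusters a toy gene-by-sample matrix in plain Python. It does not use or reproduce the ClusterGVis interface. The gene names, expression values, row-wise z-scoring, and farthest-point initialization are all assumptions made for the example.

```python
import math

def zscore_rows(matrix):
    """Standardize each gene (row) to mean 0 and unit variance across samples."""
    scaled = []
    for row in matrix:
        mean = sum(row) / len(row)
        sd = math.sqrt(sum((x - mean) ** 2 for x in row) / len(row))
        scaled.append([(x - mean) / sd if sd > 0 else 0.0 for x in row])
    return scaled

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, n_iter=100):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [list(rows[0])]
    while len(centers) < k:
        # Next center: the row farthest from its nearest existing center.
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = [-1] * len(rows)
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(r, centers[j])) for r in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical toy data: 6 genes measured at 4 time points.
expression = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.1, 8.0, 15.9],
    "geneC": [10.0, 20.0, 41.0, 79.0],
    "geneD": [8.0, 4.0, 2.0, 1.0],
    "geneE": [16.0, 8.1, 4.0, 2.0],
    "geneF": [80.0, 39.0, 20.0, 10.0],
}
genes = list(expression)
scaled = zscore_rows([expression[g] for g in genes])
labels, centers = kmeans(scaled, k=2)
clusters = dict(zip(genes, labels))
print(clusters)
```

Z-scoring each gene first means that genes are grouped by the shape of their profile (rising or falling) rather than by absolute expression level, which is why geneC and geneF, despite their large values, join the clusters of genes with the same trend.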

Citations: 0