Fanzhang Lei, Xi Yuan, Qiong Lan, Ruonan Shen, Yiman Wu, Xin Shi, Bofeng Zhu, Bin Cong
Over the last two decades, advancements in sequencing technology and data science have significantly deepened the study of transcriptomics, especially non-coding transcriptomics, leading to substantial developments in forensic applications. During the 2000s, forensic transcriptomics analysis technology evolved from targeted messenger ribonucleic acid (mRNA) typing to massively parallel sequencing and deoxyribonucleic acid (DNA) microarrays. This progression facilitated source tracing and degradation-dynamics studies of biomaterials from crime scenes, as well as the analysis of transcriptomic changes associated with cadavers, injuries, and toxicology, thereby providing additional clues for solving forensic cases. In the following decade, the development of high-throughput sequencing technology further expanded the research frontiers of forensic transcriptomics from mRNA to non-coding RNAs (ncRNAs). These molecules have been shown to have unique functions in expression regulation and epigenetic modification, and show great potential in forensic practices such as forensic polymorphism studies, tissue and body fluid tracing, forensic RNA molecular clocks, death and wound analyses, and forensic toxicology. Modern transcriptomics, combined with deep learning and multimodal analysis through multidisciplinary integration, can potentially characterize the dynamic spatiotemporal features of forensic biological samples. However, these technologies will face bottlenecks in forensic practice, such as standardization, sample collection and processing, ethics, and evidence interpretation. Breaking through these obstacles will be the core task of forensic transcriptomics in the next ten years. This integrative review, building on bibliometric analysis, details the new paradigms and latest advances in forensic transcriptomics across multiple forensic fields, demonstrating its wide-ranging prospects in practical applications.
Forensic Transcriptomics: Research Progress of the Past Two Decades. Genomics, Proteomics & Bioinformatics. Published 2026-02-04. https://doi.org/10.1093/gpbjnl/qzag007
Pangenome variation graphs (PVGs) allow for the representation of genetic diversity in a more nuanced way than traditional reference-based approaches. Here, I focus on how PVGs are a powerful tool for studying genetic variation in viruses, offering insights into the complexities of viral quasispecies, mutation rates, and population dynamics. PVGs originated in human genomics and hold great promise for viral genomics. Previous work has been constrained by small sample sizes and gene-centric methods, whereas PVGs enable a more comprehensive approach to studying viral diversity. Large viral genome collections should be used to construct PVGs, which offer significant advantages. Here, I outline accessible tools for this purpose, spanning PVG construction, PVG file formats, PVG manipulation and analysis, PVG visualisation, measuring PVG openness, and mapping reads to PVGs. Additionally, the development of PVG-specific formats for mutation representation and personalised PVGs that reflect specific research questions will further enhance PVG applications. Challenges remain, particularly in managing nested variants, optimising error detection, optimising k-mer/minimizer-based approaches for AT-rich genomes, incorporating long-read sequencing data, and scaling visualisation approaches. Nevertheless, PVGs offer a new opportunity for viral population genomics, and a testing ground for tool development prior to application to larger eukaryotic genomes. These advances will enable more accurate and comprehensive detection of viral mutations, contributing to a deeper understanding of viral evolution and genotype-phenotype associations.
Approaches to Studying Virus Pangenome Variation Graphs. Tim Downing. Genomics, Proteomics & Bioinformatics. Published 2026-02-03. https://doi.org/10.1093/gpbjnl/qzag003
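As a rough illustration of the PVG file formats and openness ideas mentioned in the abstract above, the following is a minimal sketch that reads segment (`S`), link (`L`), and path (`P`) records from GFA v1 text and separates segments shared by every genome path from those found in only some. The toy graph, the genome names, and the helper names `parse_gfa` and `core_and_accessory` are all hypothetical; this is not taken from the review or from any PVG tool, and it ignores optional tags, overlaps, and reverse-strand handling.

```python
from collections import Counter


def parse_gfa(text):
    """Parse S (segment), L (link) and P (path) records from GFA v1 text."""
    segments, links, paths = {}, [], {}
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S":
            # S <name> <sequence>
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            # L <from> <from_orient> <to> <to_orient> <overlap>
            links.append((fields[1], fields[2], fields[3], fields[4]))
        elif fields[0] == "P":
            # Path steps look like "1+,2+,4+"; drop the trailing orientation sign.
            paths[fields[1]] = [step[:-1] for step in fields[2].split(",")]
    return segments, links, paths


def core_and_accessory(segments, paths):
    """Split segments into those on every path (core) and all others (accessory)."""
    counts = Counter()
    for steps in paths.values():
        counts.update(set(steps))  # count each segment once per path
    core = sorted(s for s in segments if counts[s] == len(paths))
    accessory = sorted(s for s in segments if counts[s] < len(paths))
    return core, accessory


# Hypothetical toy graph: three genomes that differ at a single bubble (segments 2 vs 3).
TOY_GFA = "\n".join([
    "H\tVN:Z:1.0",
    "S\t1\tACGT",
    "S\t2\tG",
    "S\t3\tA",
    "S\t4\tTTCA",
    "L\t1\t+\t2\t+\t0M",
    "L\t1\t+\t3\t+\t0M",
    "L\t2\t+\t4\t+\t0M",
    "L\t3\t+\t4\t+\t0M",
    "P\tgenomeA\t1+,2+,4+\t*",
    "P\tgenomeB\t1+,3+,4+\t*",
    "P\tgenomeC\t1+,2+,4+\t*",
])

segments, links, paths = parse_gfa(TOY_GFA)
core, accessory = core_and_accessory(segments, paths)
print("core segments:", core)
print("accessory segments:", accessory)
```

Tracking how the accessory set grows as more genome paths are added is one simple way to think about openness; dedicated PVG software would be needed for real collections.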
Chen Wang, Hong Zhao, He Wu, Sijie Sun, Hongkui Zhang, Yongbiao Xue
Petunia hybrida is a key genetic model for investigating self-incompatibility (SI), a reproductive barrier governed by the multi-allelic S-locus, which encodes a pistil-specific S-RNase and multiple S-locus F-box (SLF) genes. Due to high heterozygosity and abundant repetitive sequences, previous S-locus assemblies in reference genomes have been fragmented and collapsed. Here, we present the telomere-to-telomere (T2T), haplotype-resolved genomes of two homozygous SI lines (P. hybrida S3LS3L and SVSV), enabling the complete reconstruction of both S-loci. Population genomic analyses delineated their boundaries, spanning approximately 14.01 Mb and 20.83 Mb, respectively. Remarkably, both S-loci exhibited extremely low nucleotide polymorphism and structural variation compared with the remainder of the genome. In addition to the S-RNase and the complete repertoire of SLF genes, we identified two pollen-specific genes, ubiquitin-like and MYB, which may contribute to SI regulation. Our results demonstrate that the genomic architecture of the Petunia S-locus continues to evolve dynamically while retaining the core genetic components essential for SI. Furthermore, we propose six evolutionary scenarios, providing new insights into the processes driving the generation, diversification, loss, functional maintenance, and structural reorganization of SLF genes in Petunia. Overall, the T2T genomes reported here establish P. hybrida as a premier model for comparative genomics and SI research in the Solanaceae family.
The Gap-free Petunia Genome Assemblies Reveal the Evolutionary Dynamics of the S-locus Supergene. Genomics, Proteomics & Bioinformatics. Published 2026-02-03. https://doi.org/10.1093/gpbjnl/qzag011
Fatemeh Aminzadeh, Jun Wu, Jingrui He, Morteza Saberi, Fatemeh Vafaee
Single-cell sequencing technologies have enabled in-depth analysis of cellular heterogeneity across tissues and disease contexts. However, as datasets increase in size and complexity, characterizing diverse cellular populations, integrating data across multiple modalities, and correcting batch effects remain challenges. We present SAFAARI (Single-cell Annotation and Fusion with Adversarial Open-set Domain Adaptation Reliable for Data Integration), a unified deep learning framework designed for cell annotation, batch correction, and multi-omics integration. SAFAARI leverages supervised contrastive learning and adversarial domain adaptation to achieve domain-invariant embeddings and enables label transfer across datasets, addressing challenges posed by batch effects, biological domain shifts, and multi-omics modalities. SAFAARI identifies novel cell types and mitigates class imbalance to enhance the detection of rare cell types. Through comprehensive benchmarking, we evaluated SAFAARI against existing annotation and integration methods across real-world datasets exhibiting batch effects and domain shifts, as well as simulated and multi-omics data. SAFAARI demonstrated scalability and robust performance in cell annotation via label transfer across heterogeneous datasets, detection of unknown cell types, correction of batch effects, and cross-omics data integration while leveraging available annotations for improved integration. SAFAARI's innovative approach outperformed competing methods in both qualitative and quantitative metrics, offering a flexible, accurate, and scalable solution for single-cell analysis with broad applicability to diverse biological and clinical research questions. An open-source implementation of the SAFAARI algorithm is available at https://github.com/VafaeeLab/SAFAARI.
SAFAARI: Contrastive Adversarial Open-set Domain Adaptation for Single-cell Integration & Annotation. Genomics, Proteomics & Bioinformatics. Published 2026-01-30. https://doi.org/10.1093/gpbjnl/qzag008
Yue He, Zifan Li, Wenxiang Wang, Xu Liu, Shanshan Lu, Jing Bai, Lin Weng, Qingna Zhang, Jun Wang, Kezhong Chen
Lung cancer is a highly malignant disease, posing a significant threat to global health. The presence of tumor heterogeneity results in substantial variations in prognosis and therapeutic responses among patients. Advances in bulk RNA sequencing and single-cell RNA sequencing have facilitated the identification of driver gene mutations and the exploration of cellular diversity within tumors. However, tumors are complex ecosystems comprising both tumor cells and their microenvironment, where interactions among different cell types give rise to specific functional structural units that collectively drive tumorigenesis and progression. The emergence of spatial omics technologies has allowed for the analysis of tumor ecosystems, providing unprecedented insights into tumor heterogeneity. This review aims to present updates on spatial omics technologies and data analysis algorithms, discuss current technical limitations, and explore potential future developments. Furthermore, we summarize the latest applications of spatial omics in elucidating lung cancer heterogeneity, investigating mechanisms of lung cancer progression and drug resistance, and identifying novel biomarkers. Based on these findings, we propose strategies for integrating spatial omics into lung cancer research, offering new perspectives for precision medicine.
The Evolution of Spatial Omics Technologies Introduces A Novel Avenue for Lung Cancer Research. Genomics, Proteomics & Bioinformatics. Published 2026-01-30. https://doi.org/10.1093/gpbjnl/qzag010
As an emerging class of important regulatory noncoding RNAs, circular RNAs (circRNAs) show significant spatiotemporal expression patterns in a variety of physiological processes and diseases. Thus, accurate identification and quantification of circRNAs is crucial to understanding their functions and clinical significance. However, obvious inconsistencies exist between mainstream high-throughput circRNA identification workflows based on next-generation sequencing and third-generation sequencing technologies, likely due to uncertainties inherent to each workflow. In the current study, we first confirmed that sequencing errors introduced during library preparation are a considerable contributor to the observed inconsistencies. To address this challenge, we established a unique molecular identifier (UMI)-based full-length circRNA sequencing method, ucircFL-seq. By employing UMIs and optimizing signal amplification procedures, ucircFL-seq achieved a substantial improvement in the accuracy of both circRNA detection and quantification, leading to stronger cross-platform concordance. Furthermore, our study revealed that the two platforms identify distinct pools of circRNAs, which differ in length and secondary structure, suggesting that the two platforms are complementary in circRNA identification. Overall, our study presents a UMI-guided workflow, ucircFL-seq, which enhances full-length circRNA identification and quantification accuracy, facilitating further functional exploration of circRNAs.
Amplification Optimized and Unique Molecular Identifier Guided High Accuracy Full-length CircRNA Sequencing. Yueqi Jin, Xueyan Hu, Yun Zhang, Jianqi She, Changyu Tao, Ence Yang. Genomics, Proteomics & Bioinformatics. Published 2026-01-28. https://doi.org/10.1093/gpbjnl/qzag009
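The general idea of using UMIs to suppress errors introduced during library preparation, as described in the circRNA sequencing abstract above, can be sketched as grouping reads by UMI and taking a per-position majority vote. This is only a generic illustration under simplifying assumptions (reads already aligned to each other, exact UMI matching, no quality scores); it is not the ucircFL-seq pipeline, and the function name `umi_consensus` and the toy reads are hypothetical.

```python
from collections import Counter, defaultdict


def umi_consensus(reads):
    """Collapse (umi, sequence) pairs into one consensus sequence per UMI."""
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        # Keep only reads of the most common length so that columns line up.
        length = Counter(len(s) for s in seqs).most_common(1)[0][0]
        same_len = [s for s in seqs if len(s) == length]
        # Majority vote at each position across reads sharing this UMI.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*same_len)
        )
    return consensus


# Hypothetical toy input: three copies of one molecule (one with an error)
# and a single copy of a second molecule.
toy_reads = [
    ("AACG", "ACGTAC"),
    ("AACG", "ACGTAC"),
    ("AACG", "ACCTAC"),  # third base differs from the other two copies
    ("TTGA", "GGATCC"),
]

collapsed = umi_consensus(toy_reads)
print(collapsed)
print("molecules counted:", len(collapsed))
```

Counting distinct UMIs rather than raw reads is what lets quantification ignore amplification duplicates, while the vote removes errors that appear in only a minority of copies.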
Wei Liu, Xiao Zhang, Xiaoran Chai, Zhenqian Fan, Huazhen Lin, Jinmiao Chen, Lei Sun, Tianwei Yu, Joe Yeong, Jin Liu
Biological techniques for spatially resolved transcriptomics (SRT) have advanced rapidly in both throughput and spatial resolution. This progress calls for efficient and scalable spatial dimension reduction methods capable of handling large-scale SRT data from multiple tissue sections. Here, we developed FAST, a fast and efficient generalized probabilistic factor analysis model for spatially aware dimension reduction. FAST simultaneously accounts for the count-based nature of SRT data and extracts low-dimensional representations across multiple sections, while preserving biological signals and incorporating spatial smoothness among neighboring locations. Unlike existing methods, FAST explicitly models count data across sections and leverages local spatial dependencies with scalable computational complexity. Using both simulated and real datasets, we demonstrated that embeddings estimated by FAST show improved correlation with annotated cell and domain types. Notably, FAST was the only method capable of analyzing a mouse embryo Stereo-seq dataset with > 2.3 million spatial locations in just 2 hours. FAST also identified differential activity of immune-related transcription factors between tumor and non-tumor clusters and predicted the carcinogenesis factor CCNH as an upstream regulator of differentially expressed genes in a breast cancer Xenium dataset. FAST is available for non-commercial use at https://github.com/feiyoung/ProFAST.
FAST: Scalable Factor Analysis for Spatial Dimension Reduction of Multi-section Spatial Transcriptomics. Genomics, Proteomics & Bioinformatics. Published 2026-01-24. https://doi.org/10.1093/gpbjnl/qzag006
Pseudouridine (Ψ) is a C5 glycoside isomer of uridine, formed by breaking the N1 glycosyl bond and undergoing a 180° base rotation. This modification is one of the most widespread post-transcriptional alterations in RNA and is universally distributed among diverse RNA species. It enhances RNA structural integrity, confers unique structural and functional attributes on the RNA molecules that carry it, and facilitates additional hydrogen bonding. However, a convenient, integrated, and intuitive visualization database covering all currently reported species and RNA types is still lacking. Here, we present Ψ-Atlas, an extensive curated database for the comprehensive collection and annotation of RNA pseudouridine. This database encompasses 554,895 Ψ modification sites across various RNA categories, including mRNA, ncRNA, tRNA, rRNA, and others, in 78 distinct species reported in the current literature. The sequencing methodologies covered comprise next-generation sequencing techniques such as Ψ-Seq, Pseudo-Seq, CeU-Seq, PSI-Seq, RBS-Seq, HydraPsiSeq, BID-Seq, and PRAISE-Seq, as well as third-generation sequencing methods such as direct RNA sequencing. Ψ-Atlas is the most comprehensive and integrated resource for RNA pseudouridine modifications to date. The Ψ-Atlas database offers an intuitive interface for information display and a range of analytical tools, including PsiVar and PsiFinder. In essence, this platform serves as a robust search and visualization tool for the study of pseudouridylation in epitranscriptomics. Ψ-Atlas is available at https://rnainformatics.org.cn/PsiAtlas.
Ψ-Atlas: An Integrated Atlas for Pseudouridine Epitranscriptome. Xiaochen Wang, Jinjing Luo, Xiaoqiang Lang, Yongqing Ling, Yiming Zhou, Guoxian Liu, Xiangye Chen, Yibo Chen, Yingshun Zhou, Yi Cao, Zhonghui Zhang, Changjun Ding, Demeng Chen, Qi Liu. Genomics, Proteomics & Bioinformatics. Published 2026-01-23. https://doi.org/10.1093/gpbjnl/qzag004
Single-cell and bulk RNA sequencing data provide valuable insights into both physiological and pathological processes, but interpreting these data effectively depends on capable analysis and visualization tools. Here, we introduce ClusterGVis, bioinformatics software designed to simplify the analysis and visualization of gene expression data. ClusterGVis provides a user-friendly interface for fuzzy c-means and k-means clustering of transcriptomic datasets, helping researchers uncover patterns and relationships within complex gene expression profiles. Its integrated heatmap visualization supports intuitive exploration of co-expression networks and the identification of differentially expressed genes across diverse experimental conditions. ClusterGVis thus aids both the identification of potential biomarkers and the understanding of gene function and regulatory mechanisms. The tutorials, manual, source code, and demo data of ClusterGVis are publicly available at https://github.com/junjunlab/ClusterGVis and https://bioconductor.org/packages/ClusterGVis. The ClusterGVis Shiny App has been deployed on shinyapps.io and is accessible at https://laojunjun.shinyapps.io/clustergvis_app_v0/. The Shiny App source code is hosted on GitHub at https://github.com/junjunlab/ClusterGvis-app.
{"title":"ClusterGVis: An Advanced Visualization and Clustering Tool for Gene Expression Analysis.","authors":"Jun Zhang, Hongyuan Li, Wenjun Tao, Jun Zhou","doi":"10.1093/gpbjnl/qzag005","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzag005","journal":{"name":"Genomics, proteomics & bioinformatics"},"PeriodicalIF":7.9,"publicationDate":"2026-01-22","publicationTypes":"Journal Article"}
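The ClusterGVis abstract above mentions k-means clustering of transcriptomic data. As a rough illustration of that general technique only, the following is a minimal Python sketch of k-means applied to z-scored gene expression profiles. It is not ClusterGVis code and does not use its API (ClusterGVis itself is distributed via Bioconductor). The gene names, expression values, function names, and the farthest-point initialization are all assumptions made for this example, and fuzzy c-means is not shown.

```python
import math

def zscore(row):
    """Standardize one gene's expression across samples (mean 0, sd 1)."""
    mean = sum(row) / len(row)
    sd = math.sqrt(sum((x - mean) ** 2 for x in row) / len(row))
    if sd == 0:
        return [0.0] * len(row)
    return [(x - mean) / sd for x in row]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, n_iter=100):
    """Lloyd's k-means with a deterministic farthest-point initialization
    (an assumption chosen here so the toy example is reproducible)."""
    centers = [rows[0]]
    while len(centers) < k:
        # Add the profile that is farthest from its nearest existing center.
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(far)
    labels = [None] * len(rows)
    for _ in range(n_iter):
        # Assignment step: each profile goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(r, centers[j]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each center becomes the mean of its member profiles.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical toy data: six genes measured at four time points.
expression = {
    "geneA": [1.0, 2.0, 4.0, 8.0],      # rising
    "geneB": [10.0, 21.0, 39.0, 80.0],  # rising, larger magnitude
    "geneC": [8.0, 4.0, 2.0, 1.0],      # falling
    "geneD": [90.0, 40.0, 22.0, 9.0],   # falling, larger magnitude
    "geneE": [2.0, 3.0, 5.0, 9.0],      # rising
    "geneF": [7.0, 5.0, 3.0, 1.0],      # falling
}

genes = list(expression)
# Z-scoring makes genes cluster by expression pattern, not by magnitude.
profiles = [zscore(expression[g]) for g in genes]
labels, centers = kmeans(profiles, k=2)
clusters = dict(zip(genes, labels))

for gene, label in clusters.items():
    print(gene, "cluster", label)
```

Z-scoring each gene before clustering is a common choice when the goal is to group genes by the shape of their expression pattern, so that `geneA` and `geneB` can fall in the same cluster despite a tenfold difference in scale.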