Pub Date: 2026-02-11; eCollection Date: 2026-01-01; DOI: 10.1093/bioadv/vbag043
Orsolya Lapohos, Gregory J Fonseca
Motivation: Inference of candidate upstream regulators via motif enrichment analysis is a common step in the interpretation of genomic data. However, redundancy in motif databases can negatively impact predictive value, especially when relying on regression-based motif enrichment analysis. Although various forms of motif clustering have been used to mitigate problems caused by redundancy, an algorithm optimized for downstream regression-based analysis is needed.
Results: We introduce AmalgaMo, an efficient and flexible command-line tool for merging highly similar motifs. Using publicly available human datasets, we demonstrate that merging motifs with our optimized settings greatly benefits regression-based motif enrichment analysis and provide detailed documentation that can serve as a reference for researchers inferring upstream regulators from genomic data.
Availability and implementation: AmalgaMo is available on GitHub at https://github.com/lapohosorsolya/AmalgaMo.
Title: AmalgaMo: flexible DNA motif merging. Journal: Bioinformatics Advances, 6(1), vbag043. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12947577/pdf/
Pub Date: 2026-02-06; eCollection Date: 2026-01-01; DOI: 10.1093/bioadv/vbag009
Alex Bateman
Title: Amos Bairoch (1957-2025): pioneer of bioinformatics and founder of Swiss-Prot. Journal: Bioinformatics Advances, 6(1), vbag009. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12884847/pdf/
Pub Date: 2026-02-06; eCollection Date: 2026-01-01; DOI: 10.1093/bioadv/vbag040
Aditya Syam, Chris Adonizio, Xinzhu Wei
Motivation: The Genotype Representation Graph (GRG) is a graph representation of whole genome polymorphisms, designed to encode the variant hard-call information in phased whole genomes. It encodes the genotypes as an extremely compact graph that can be traversed efficiently, enabling dynamic programming-style algorithms on applications such as genome-wide association studies that run faster on biobank-scale data than existing alternatives. To facilitate scalable statistical genetics, we present GrgPhenoSim, an extremely fast phenotype simulator for GRGs, suitable for simulating phenotypes on biobank-scale datasets.
Results: GrgPhenoSim contains all the primary functionalities of a phenotype simulator, uses a standardized output, and supports customized simulations. GrgPhenoSim is dozens to hundreds of times faster than tstrait, a fast ancestral recombination graph-based phenotype simulator, when the sample size ranges from thousands to hundreds of thousands of samples.
Availability and implementation: The GrgPhenoSim library and use-case demonstrations are available at https://github.com/aprilweilab/grg_pheno_sim. The documentation for GrgPhenoSim is hosted at https://grgl.readthedocs.io/en/stable/examples_and_applications.html#phenotype-simulation.
Title: Fast phenotype simulation for genotype representation graphs. Journal: Bioinformatics Advances, 6(1), vbag040. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12927419/pdf/
Pub Date: 2026-02-03; eCollection Date: 2026-01-01; DOI: 10.1093/bioadv/vbag038
Seungjun Ahn, Eun Jeong Oh
Motivation: Network theory has proven invaluable in unraveling complex protein interactions. Previous studies have employed statistical methods rooted in network theory, including the Gaussian graphical model, to infer networks among proteins, identifying hub proteins based on key structural properties of networks such as degree centrality. However, there has been limited research examining a prognostic role of hub proteins on outcomes, while adjusting for clinical covariates in the context of high-dimensional data.
Results: To address this gap, we propose a network-guided penalized regression method. First, we construct a network using the Gaussian graphical model to identify hub proteins. Next, we preserve these identified hub proteins along with clinically relevant factors, while applying adaptive Lasso to non-hub proteins for variable selection. Our network-guided estimators are shown to have variable selection consistency and asymptotic normality. Simulation results suggest that our method performs better than existing methods and shows promise for advancing biomarker identification in proteomics research. Lastly, we apply our method to the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data and identify hub proteins that may serve as prognostic biomarkers for various diseases, including rare genetic disorders, and as immune checkpoints for cancer immunotherapy.
Availability and implementation: The R package is freely available from the CRAN repository (https://CRAN.R-project.org/package=NetGreg) and is published under the General Public License version 3.
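The two-step idea in the Results paragraph (pick hub proteins from a network, then leave them unpenalized while shrinking everything else) can be illustrated with a minimal pure-Python sketch. This is not the NetGreg API or the authors' implementation: the function names, the top-k degree rule for choosing hubs, the toy data, and the coordinate-descent solver are all assumptions made for illustration, and the adjacency matrix is taken as given rather than estimated with a Gaussian graphical model.

```python
def hub_indices(adjacency, top_k):
    """Return the top_k nodes by degree centrality (ties broken by index)."""
    degrees = [
        sum(1 for j, edge in enumerate(row) if j != i and edge != 0)
        for i, row in enumerate(adjacency)
    ]
    ranked = sorted(range(len(degrees)), key=lambda i: (-degrees[i], i))
    return sorted(ranked[:top_k])


def penalty_weights(initial_coefs, unpenalized, gamma=1.0, eps=1e-8):
    """Adaptive-lasso weights 1/|b|^gamma, with weight 0 for preserved variables."""
    keep = set(unpenalized)
    return [
        0.0 if j in keep else 1.0 / (abs(b) ** gamma + eps)
        for j, b in enumerate(initial_coefs)
    ]


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso(X, y, weights, lam, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j w_j * |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            scale = sum(v * v for v in col) / n
            if scale == 0.0:
                continue  # constant-zero column carries no information
            rho = sum(v * (r + v * beta[j]) for v, r in zip(col, residual)) / n
            new_beta = soft_threshold(rho, lam * weights[j]) / scale
            delta = new_beta - beta[j]
            residual = [r - v * delta for v, r in zip(col, residual)]
            beta[j] = new_beta
    return beta


# Hypothetical toy network: protein 0 is connected to proteins 1 and 2.
adjacency = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
hubs = hub_indices(adjacency, top_k=1)

# Hypothetical toy data with orthogonal columns; y = 2*x0 + 1*x1.
X = [[1, 1, 1], [-1, 1, -1], [1, -1, -1], [-1, -1, 1]]
y = [3, -1, 1, -3]

# Initial coefficients (assumed here to come from an unpenalized fit).
weights = penalty_weights([2.0, 1.0, 0.0], hubs)
beta = weighted_lasso(X, y, weights, lam=10.0)
```

With a large penalty the non-hub coefficients are shrunk to zero while the hub coefficient keeps its unpenalized value, which is the behaviour the abstract describes as "preserving" hub proteins. Clinical covariates could be kept the same way by adding their column indices to `unpenalized`.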
Title: A network-guided penalized regression with application to proteomics data. Journal: Bioinformatics Advances, 6(1), vbag038. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12949433/pdf/
Pub Date: 2026-02-03; eCollection Date: 2026-01-01; DOI: 10.1093/bioadv/vbag003
Joao Carlos Gomes-Neto, Alexandra Crook, Rachel Hestrin, Guoming Li, Chia-Sin Liew, Guilherme Rosa, Keshav D Singh, Christopher K Tuggle, Katie L Summers, Camilo Valdes, Noah Fahlgren, Jennifer Clarke
Motivation: The world of agriculture is rapidly changing with advances in artificial intelligence and demands for greater feed and food security in light of environmental and sustainability challenges. The 30th Conference on Intelligent Systems for Molecular Biology (ISMB), held in July 2022, featured an invited session on the role of computational biology in Digital and Precision Agriculture. This session featured presentations by experts from various subdisciplines on novel research discoveries and a panel discussion on Digital Agriculture at Scale. Topics discussed during the session included genetics, epigenetics, and genomics of agriculturally relevant species; foodborne pathogen genomics and epidemiology; plant and animal phenomics; AI/machine learning; image analysis; remote sensing; educational innovations; discoveries resulting from public-private partnerships; data sharing and findable, accessible, interoperable, and reusable (FAIR) data standards; biotechnology; and soil microbial ecology and biogeochemistry.
Results: We present several of the current and future challenges and opportunities for computational biology in agriculture including why these challenges are important to address, what barriers exist, and what skills and competencies are required to be successful as a computational biologist in agriculture. We intend this summary to engage the computational biology community and attract them to the opportunities available for interesting and impactful work toward ensuring sustainable food security.
Title: Challenges and opportunities: computational biology and the future of agriculture. Journal: Bioinformatics Advances, 6(1), vbag003. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12916170/pdf/
Motivation: Single-cell RNA sequencing has substantially advanced our understanding of gene expression dynamics and cellular heterogeneity. In recent years, deep learning (DL) has emerged as a promising approach to infer genetic regulation. However, these methods still face challenges in representing complex regulatory mechanisms. Thus, it remains imperative to develop new algorithms to enhance both effectiveness and reliability.
Results: We propose DeepCE, a DL framework for correlation-enhanced gene regulatory network (GRN) inference. DeepCE strengthens the extraction of dynamic regulation by integrating bidirectional gated recurrent units with convolutional neural networks (CNNs). Specifically, the bidirectional gated recurrent units capture dynamic temporal dependencies, while the CNNs focus on local spatial patterns within single-cell data, enabling the model to uncover complex gene-gene interactions and generate high-quality GRNs. This framework improves the accuracy and robustness of GRN inference by smoothing noisy gene expression data, extracting time-lagged regulatory signals, and filtering out spurious correlations. In experiments on mouse and human datasets, DeepCE outperforms existing methods, achieving the highest AUROC and AUPR scores.
Availability and implementation: The code for DeepCE is freely available on GitHub at https://github.com/sxiaodai/DeepCE.
Title: DeepCE: a deep learning framework for correlation-enhanced gene regulatory network inference in single-cell RNA sequencing data. Authors: Qianqian Wu, Xingmiao Dai, Shiyi Lou, Siyuan Wu, Tianhai Tian. Pub Date: 2026-01-30; DOI: 10.1093/bioadv/vbag033. Journal: Bioinformatics Advances, 6(1), vbag033. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12916171/pdf/
Pub Date: 2026-01-29; eCollection Date: 2026-01-01; DOI: 10.1093/bioadv/vbag027
Alexey Kazakov, Adam M Deutschbauer
Summary: GenomeDepot is an open-source web-based platform for annotation, management, and comparative analysis of microbial genomic sequences and associated data including ortholog families, protein domains, operons, regulatory interactions, strain taxonomy, and sample metadata. GenomeDepot supports rapid creation of websites for user-defined genome collections that include bioinformatic tools for interactive genome browsing, Basic Local Alignment Search Tool (BLAST) search, annotation search, comparative genomic neighborhood visualization, and sequence download. Gene function annotations are generated by a customizable annotation pipeline. The pipeline runs annotation tools in Conda environments and can be easily extended with additional user-specified tools.
Availability and implementation: GenomeDepot is open source and distributed under the GNU General Public License via GitHub (https://github.com/aekazakov/genome-depot). GenomeDepot is implemented in Python and was tested on Ubuntu Linux. Full installation instructions and documentation are available at https://aekazakov.github.io/genome-depot/. A GenomeDepot demo server is freely accessible at https://iseq.lbl.gov/demogd/.
Title: GenomeDepot: data management system for microbial comparative genomics. Journal: Bioinformatics Advances, 6(1), vbag027. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12895066/pdf/
Pub Date: 2026-01-29; eCollection Date: 2026-01-01; DOI: 10.1093/bioadv/vbag026
Menghui Chen, Mingrui Li, Ronnie Y Li, Jie Jiang, Zhaohui S Qin
Motivation: Understanding age-related transcriptional changes in human tissues is crucial for elucidating molecular mechanisms of aging and disease. Current genomic analysis tools often require programming expertise, limiting accessibility for comprehensive aging studies. Here, we present Age Effect Explorer, an interactive R Shiny application for systematically analyzing age- and sex-related gene expression pattern changes across 54 human tissues using Genotype-Tissue Expression (GTEx) v10 data.
Results: We obtained gene-level expression profiles from 981 individuals, and fitted ordinary least squares linear models including age, sex, and technical covariates with FDR correction. Pre-calculated results are stored in a cloud database enabling rapid, code-free exploration through an intuitive web interface. Age Effect Explorer validated known aging markers including age-correlated EDA2R. This resource democratizes access to aging transcriptomics, facilitating the discovery of tissue-specific aging mechanisms.
Availability and implementation: The Age Effect Explorer can be accessed using a web browser at https://menghui.shinyapps.io/ageeffectexplorer/. The code used to create the Shiny application, along with a tutorial, can be found on GitHub at https://github.com/ML198/GTEx-Explorer.
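The "FDR correction" applied to the per-gene model p-values can be sketched as follows. The abstract does not say which procedure was used, so this sketch assumes the Benjamini-Hochberg step-up adjustment; the function name and the example p-values are hypothetical and not taken from the application's code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for the age coefficient of four genes in one tissue.
age_p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(age_p_values)

# Genes whose age effect passes a 10% FDR threshold (threshold is illustrative).
significant = [i for i, q in enumerate(q_values) if q <= 0.10]
```

Adjusting within each tissue, as done here, is one possible design; whether the application adjusts per tissue or across all tissues is not stated in the abstract.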
Title: Age effect explorer: a Shiny application to browse and visualize tissue-specific age-related gene expression changes. Journal: Bioinformatics Advances, 6(1), vbag026. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12889165/pdf/
Pub Date: 2026-01-27; eCollection Date: 2026-01-01; DOI: 10.1093/bioadv/vbag031
Eilidh L Ward, Isabel Birds, Mary J O'Connell, David R Westhead, Julie L Aspden
Motivation: The advent of ribosome profiling (an adaptation of RNA sequencing) to determine the translatome has greatly improved our understanding of which parts of the transcriptome are translated. Many alternative open reading frames (ORFs) are now regularly detected, such as out-of-frame, overlapping, upstream or downstream reading frames, and alternative reading frames using non-canonical start codons. Various tools have been developed for the detection of such novel ORFs, but they lack the capacity to visually inspect reads, an important aspect of validation and prediction of translation.
Results: The integrated visualisation of ribosome profiling and RNA sequencing reads enables discrimination between transcriptional and translational signals, facilitating validation of predicted novel open reading frames. Furthermore, the inclusion of complementary evidence such as proteomic and long-read sequencing data enables further validation of predicted novel open reading frames.
Availability and implementation: Here, we present InspectorORF (https://www.github.com/aylz83/inspectorORF), an R package that readily plots ribosome profiling reads alongside RNA sequencing reads across transcripts and/or ORFs. Additionally, custom information can be plotted, including data from additional conditions and samples, proteomic analyses, and reads from long-read sequencing.
Title: InspectorORF: a tool for visualizing Ribo-Seq and additional genomic or transcriptomic data. Journal: Bioinformatics Advances, 6(1), vbag031. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12904772/pdf/
Pub Date: 2026-01-27; eCollection Date: 2026-01-01; DOI: 10.1093/bioadv/vbag030
Jade M Davis, Kristina K Gagalova, Lilian M V P Sanglard, Sabrina Cuellar, Mark R Gibberd, Fatima Naim
Motivation: High-quality genome annotations are essential for transcriptomic analyses investigating plant responses to environmental stress. While nanopore long-read direct RNA sequencing offers a powerful approach for improving genome annotations, studies benchmarking optimal tools for this process have primarily focused on animal models. In this study, we benchmarked five annotation tools: StringTie3, IsoQuant, Bambu, FLAIR, and FLAMES, using direct RNA data from barley infected with Net Form Net Blotch disease.
Results: We observed substantial variation across tools in isoform detection, structural completeness, splicing classification, and handling of 5' read truncation. Several tools successfully identified novel transcripts, with the two top-performing reference-guided approaches both detecting over 700 previously unannotated transcripts, including candidates with predicted roles in disease response. Our results highlight the importance of plant-specific benchmarking of bioinformatic tools and demonstrate the utility of direct RNA sequencing for improving genome annotations, supporting ongoing efforts to enhance reference resources for non-model plant species.
Availability and implementation: Benchmarking code is available at https://github.com/jadedavis5/benchmarking_paper. Datasets are described in the 'Data availability' section.
Title: Benchmarking methods for genome annotation using nanopore direct RNA in a non-model crop plant. Journal: Bioinformatics Advances, 6(1), vbag030. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12967217/pdf/