Bioinformatics Advances: latest articles

AmalgaMo: flexible DNA motif merging.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2026-02-11 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbag043
Orsolya Lapohos, Gregory J Fonseca

Motivation: Inference of candidate upstream regulators via motif enrichment analysis is a common step in the interpretation of genomic data. However, redundancy in motif databases can negatively impact predictive value, especially when relying on regression-based motif enrichment analysis. Although various forms of motif clustering have been used to mitigate problems caused by redundancy, an algorithm optimized for downstream regression-based analysis is needed.

Results: We introduce AmalgaMo, an efficient and flexible command-line tool for merging highly similar motifs. Using publicly available human datasets, we demonstrate that merging motifs with our optimized settings greatly benefits regression-based motif enrichment analysis and provide detailed documentation that can serve as a reference for researchers inferring upstream regulators from genomic data.

Availability and implementation: AmalgaMo is available on GitHub at https://github.com/lapohosorsolya/AmalgaMo.
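To make the idea of merging highly similar motifs concrete, here is a minimal Python sketch. It is not AmalgaMo's algorithm or interface: the similarity measure (mean column-wise Pearson correlation), the 0.9 threshold, the greedy pairwise strategy, and every function name are assumptions chosen for illustration, and motifs are assumed to be equal-length position frequency matrices with no alignment shift.

```python
from math import sqrt
from typing import Dict, List, Optional, Tuple

Motif = List[List[float]]  # one [A, C, G, T] frequency row per position


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def similarity(m1: Motif, m2: Motif) -> float:
    """Mean column-wise Pearson correlation (hypothetical similarity measure)."""
    if len(m1) != len(m2):
        raise ValueError("motifs must have the same length")
    return sum(pearson(a, b) for a, b in zip(m1, m2)) / len(m1)


def merge(m1: Motif, m2: Motif) -> Motif:
    """Average two motifs position by position (unweighted)."""
    return [[(a + b) / 2 for a, b in zip(c1, c2)] for c1, c2 in zip(m1, m2)]


def first_similar_pair(motifs: Dict[str, Motif],
                       threshold: float) -> Optional[Tuple[str, str]]:
    """Return the first pair of motif names whose similarity reaches the threshold."""
    names = list(motifs)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            same_length = len(motifs[a]) == len(motifs[b])
            if same_length and similarity(motifs[a], motifs[b]) >= threshold:
                return a, b
    return None


def merge_similar(motifs: Dict[str, Motif], threshold: float = 0.9) -> Dict[str, Motif]:
    """Greedily merge pairs until no remaining pair reaches the threshold."""
    merged = dict(motifs)
    pair = first_similar_pair(merged, threshold)
    while pair is not None:
        a, b = pair
        merged[f"{a}+{b}"] = merge(merged.pop(a), merged.pop(b))
        pair = first_similar_pair(merged, threshold)
    return merged
```

Collapsing near-duplicate motifs into one representative reduces the number of strongly correlated predictors, which is the kind of redundancy the abstract identifies as harmful for regression-based enrichment.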

Amos Bairoch (1957-2025): pioneer of bioinformatics and founder of Swiss-Prot.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2026-02-06 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbag009
Alex Bateman
Fast phenotype simulation for genotype representation graphs.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2026-02-06 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbag040
Aditya Syam, Chris Adonizio, Xinzhu Wei

Motivation: The Genotype Representation Graph (GRG) is a graph representation of whole-genome polymorphisms, designed to encode the variant hard-call information in phased whole genomes. It encodes the genotypes as an extremely compact graph that can be traversed efficiently, so that dynamic programming-style algorithms for applications such as genome-wide association studies run faster on biobank-scale data than existing alternatives. To facilitate scalable statistical genetics, we present GrgPhenoSim, an extremely fast phenotype simulator for GRGs, suitable for simulating phenotypes on biobank-scale datasets.

Results: GrgPhenoSim contains all the primary functionalities of a phenotype simulator, uses a standardized output, and supports customized simulations. For sample sizes ranging from thousands to hundreds of thousands, GrgPhenoSim is dozens to hundreds of times faster than tstrait, a fast ancestral recombination graph-based phenotype simulator.

Availability and implementation: The GrgPhenoSim library and use-case demonstrations are available at https://github.com/aprilweilab/grg_pheno_sim. The documentation for GrgPhenoSim is hosted at https://grgl.readthedocs.io/en/stable/examples_and_applications.html#phenotype-simulation.
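For readers unfamiliar with what a phenotype simulator computes, the following Python sketch shows a generic additive model under stated assumptions. It does not use GrgPhenoSim, tstrait, or a GRG: it works on a small dense genotype matrix, and the function name, the normally distributed effect sizes, and the heritability parameterisation are all illustrative choices rather than anything taken from the paper.

```python
import random
from statistics import pvariance
from typing import List, Tuple


def simulate_phenotypes(genotypes: List[List[int]], n_causal: int, h2: float,
                        seed: int = 0) -> Tuple[List[float], List[float]]:
    """Simulate phenotype = genetic value + environmental noise.

    genotypes: one row per individual, one 0/1/2 allele count per variant.
    h2: target narrow-sense heritability, Var(genetic) / Var(phenotype).
    Returns (genetic_values, phenotypes).
    """
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must be in (0, 1]")
    rng = random.Random(seed)
    n_variants = len(genotypes[0])

    # Pick causal variants and draw their effect sizes from N(0, 1).
    causal = rng.sample(range(n_variants), n_causal)
    effects = {j: rng.gauss(0.0, 1.0) for j in causal}

    # Genetic value: sum of allele count times effect over causal variants.
    genetic = [sum(row[j] * beta for j, beta in effects.items()) for row in genotypes]

    # Choose the noise variance so that Var(g) / (Var(g) + Var(e)) equals h2.
    var_g = pvariance(genetic)
    var_e = var_g * (1.0 - h2) / h2
    noise = [rng.gauss(0.0, var_e ** 0.5) for _ in genotypes]

    return genetic, [g + e for g, e in zip(genetic, noise)]
```

The dense row-by-row sum over causal variants is the step that becomes expensive at biobank scale; the abstract attributes GrgPhenoSim's speed to performing this kind of computation on the compact graph instead.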

A network-guided penalized regression with application to proteomics data.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2026-02-03 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbag038
Seungjun Ahn, Eun Jeong Oh

Motivation: Network theory has proven invaluable in unraveling complex protein interactions. Previous studies have employed statistical methods rooted in network theory, including the Gaussian graphical model, to infer networks among proteins, identifying hub proteins based on key structural properties of networks such as degree centrality. However, there has been limited research examining the prognostic role of hub proteins in outcomes while adjusting for clinical covariates in the context of high-dimensional data.

Results: To address this gap, we propose a network-guided penalized regression method. First, we construct a network using the Gaussian graphical model to identify hub proteins. Next, we preserve these identified hub proteins along with clinically relevant factors, while applying adaptive Lasso to non-hub proteins for variable selection. Our network-guided estimators are shown to have variable selection consistency and asymptotic normality. Simulation results suggest that our method produces better results compared to existing methods and demonstrates promise for advancing biomarker identification in proteomics research. Lastly, we apply our method to the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data and identify hub proteins that may serve as prognostic biomarkers for various diseases, including rare genetic disorders, and as immune checkpoints for cancer immunotherapy.

Availability and implementation: The R package NetGreg is freely available from the CRAN repository (https://CRAN.R-project.org/package=NetGreg) and is published under the GNU General Public License version 3.
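The two-stage idea in the Results (keep hub proteins unpenalized, apply adaptive Lasso to the rest) can be sketched in plain Python as follows. This is an illustration under assumptions, not the NetGreg implementation: the network is taken as a given adjacency matrix rather than estimated with a Gaussian graphical model, predictors and outcome are assumed centred (no intercept), and all function names, the coordinate-descent solver, and the weight formula 1 / |initial estimate|^gamma are illustrative choices.

```python
from typing import List, Set


def hub_nodes(adjacency: List[List[int]], top_k: int) -> List[int]:
    """Indices of the top_k nodes by degree centrality (diagonal assumed zero)."""
    degree = [sum(1 for v in row if v) for row in adjacency]
    return sorted(range(len(adjacency)), key=lambda j: -degree[j])[:top_k]


def adaptive_weights(initial: List[float], hubs: Set[int],
                     gamma: float = 1.0, eps: float = 1e-8) -> List[float]:
    """Penalty weight 0 for hubs (never shrunk), 1/|b|^gamma for all others."""
    return [0.0 if j in hubs else 1.0 / (abs(b) + eps) ** gamma
            for j, b in enumerate(initial)]


def soft_threshold(z: float, t: float) -> float:
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X: List[List[float]], y: List[float], weights: List[float],
                   lam: float, n_iter: int = 200) -> List[float]:
    """Minimise (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|
    by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, kept up to date after every update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta
```

Setting a feature's weight to zero removes its penalty entirely, so hub proteins (and, in the same way, clinical covariates) always stay in the model while the remaining coefficients can be shrunk to exactly zero.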

Challenges and opportunities: computational biology and the future of agriculture.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2026-02-03 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbag003
Joao Carlos Gomes-Neto, Alexandra Crook, Rachel Hestrin, Guoming Li, Chia-Sin Liew, Guilherme Rosa, Keshav D Singh, Christopher K Tuggle, Katie L Summers, Camilo Valdes, Noah Fahlgren, Jennifer Clarke

Motivation: The world of agriculture is rapidly changing with advances in artificial intelligence and demands for greater feed and food security considering environmental and sustainability challenges. The 30th Conference on Intelligent Systems for Molecular Biology (ISMB), held in July 2022, featured an invited session on the role of computational biology in Digital and Precision Agriculture. This session featured presentations by experts from various subdisciplines on novel research discoveries and a panel discussion on Digital Agriculture at Scale. Topics discussed during the session included genetics, epigenetics, and genomics of agriculturally relevant species; foodborne pathogen genomics and epidemiology; plant and animal phenomics; AI/machine learning; image analysis; remote sensing; educational innovations; discoveries resulting from public-private partnerships; data sharing and findable, accessible, interoperable, and reusable (FAIR) data standards; biotechnology; and soil microbial ecology and biogeochemistry.

Results: We present several of the current and future challenges and opportunities for computational biology in agriculture including why these challenges are important to address, what barriers exist, and what skills and competencies are required to be successful as a computational biologist in agriculture. We intend this summary to engage the computational biology community and attract them to the opportunities available for interesting and impactful work toward ensuring sustainable food security.

DeepCE: a deep learning framework for correlation-enhanced gene regulatory network inference in single-cell RNA sequencing data.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2026-01-30 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbag033
Qianqian Wu, Xingmiao Dai, Shiyi Lou, Siyuan Wu, Tianhai Tian

Motivation: Single-cell RNA sequencing has substantially advanced our understanding of gene expression dynamics and cellular heterogeneity. In recent years, deep learning (DL) has emerged as a promising approach to infer genetic regulation. However, these methods still face challenges in representing complex regulatory mechanisms. Thus, it remains imperative to develop new algorithms to enhance both effectiveness and reliability.

Results: We propose DeepCE, a DL framework for correlation-enhanced gene regulatory network (GRN) inference. DeepCE strengthens the extraction of dynamic regulation by integrating bidirectional gated recurrent units with convolutional neural networks (CNNs). Specifically, bidirectional gated recurrent units capture dynamic temporal dependencies, while CNNs focus on local spatial patterns within single-cell data, enabling the model to uncover complex gene-gene interactions and generate high-quality GRNs. This framework improves the accuracy and robustness of GRN inference by smoothing noisy gene expression data, extracting time-lagged regulatory signals, and filtering out spurious correlations. Experiments conducted on mouse and human datasets demonstrate the strong performance of DeepCE. Performance evaluations show that DeepCE outperforms existing methods, achieving the highest AUROC and AUPR scores.

Availability and implementation: Code for DeepCE is freely available on GitHub at https://github.com/sxiaodai/DeepCE.
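The notions of smoothing noisy expression and extracting time-lagged signals mentioned in the Results can be illustrated with plain statistics. The Python sketch below is not DeepCE and involves no neural network: it assumes expression values ordered along time (or pseudotime), uses a simple moving average and a lagged Pearson correlation, and all function names and the window and lag parameters are hypothetical.

```python
from math import sqrt
from typing import List, Tuple


def moving_average(series: List[float], window: int = 3) -> List[float]:
    """Centred moving average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lag(regulator: List[float], target: List[float],
             max_lag: int) -> Tuple[int, float]:
    """Return (lag, r) with the largest |r| between regulator[t] and target[t + lag]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = regulator[:len(regulator) - lag]
        y = target[lag:]
        if len(x) < 3:
            break
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best
```

A regulator whose expression change precedes its target's by a few time points shows its strongest correlation at a positive lag, which is the kind of delayed dependency that recurrent layers are designed to pick up.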

GenomeDepot: data management system for microbial comparative genomics.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2026-01-29 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbag027
Alexey Kazakov, Adam M Deutschbauer

Summary: GenomeDepot is an open-source web-based platform for annotation, management, and comparative analysis of microbial genomic sequences and associated data including ortholog families, protein domains, operons, regulatory interactions, strain taxonomy, and sample metadata. GenomeDepot supports rapid creation of websites for user-defined genome collections that include bioinformatic tools for interactive genome browsing, Basic Local Alignment Search Tool (BLAST) search, annotation search, comparative genomic neighborhood visualization, and sequence download. Gene function annotations are generated by a customizable annotation pipeline. The pipeline runs annotation tools in Conda environments and can be easily extended with additional user-specified tools.

Availability and implementation: GenomeDepot is open source and distributed under the GNU General Public License via GitHub (https://github.com/aekazakov/genome-depot). GenomeDepot is implemented in Python and was tested on Ubuntu Linux. Full installation instructions and documentation are available at https://aekazakov.github.io/genome-depot/. A GenomeDepot demo server is freely accessible at https://iseq.lbl.gov/demogd/.

Age effect explorer: a Shiny application to browse and visualize tissue-specific age-related gene expression changes.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2026-01-29 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbag026
Menghui Chen, Mingrui Li, Ronnie Y Li, Jie Jiang, Zhaohui S Qin

Motivation: Understanding age-related transcriptional changes in human tissues is crucial for elucidating molecular mechanisms of aging and disease. Current genomic analysis tools often require programming expertise, limiting accessibility for comprehensive aging studies. Here, we present Age Effect Explorer, an interactive R Shiny application for systematically analyzing age- and sex-related gene expression pattern changes across 54 human tissues using Genotype-Tissue Expression (GTEx) v10 data.

Results: We obtained gene-level expression profiles from 981 individuals and fitted ordinary least squares linear models that include age, sex, and technical covariates, with FDR correction for multiple testing. Pre-calculated results are stored in a cloud database enabling rapid, code-free exploration through an intuitive web interface. Age Effect Explorer validated known aging markers including age-correlated EDA2R. This resource democratizes access to aging transcriptomics, facilitating the discovery of tissue-specific aging mechanisms.

Availability and implementation: The Age Effect Explorer can be accessed using a web browser at https://menghui.shinyapps.io/ageeffectexplorer/. The code used to create the Shiny application, along with a tutorial, can be found on GitHub at https://github.com/ML198/GTEx-Explorer.
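As a small worked illustration of the statistics named in the Results, the Python sketch below computes a per-gene age slope and applies a false discovery rate adjustment. It rests on assumptions: the abstract does not say which FDR procedure was used, so Benjamini-Hochberg is assumed here; the slope function is a single-predictor simplification that ignores sex and technical covariates; and both function names are hypothetical, not part of the application's code.

```python
from typing import List


def ols_slope(ages: List[float], expression: List[float]) -> float:
    """Least-squares slope of expression on age (one predictor, with intercept)."""
    n = len(ages)
    mean_a = sum(ages) / n
    mean_e = sum(expression) / n
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(ages, expression))
    var = sum((a - mean_a) ** 2 for a in ages)
    return cov / var


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With one model per gene in each of many tissues, thousands of tests are run at once, so an adjustment of this kind is what keeps the share of false positives among the reported age-associated genes under control.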

动机:了解人体组织中与年龄相关的转录变化对于阐明衰老和疾病的分子机制至关重要。目前的基因组分析工具通常需要编程专业知识,限制了全面衰老研究的可及性。在这里,我们展示了年龄效应探索者,一个交互式的R Shiny应用程序,用于系统地分析54个人体组织中年龄和性别相关的基因表达模式变化,使用基因型-组织表达(GTEx) v10数据。结果:我们获得了981个个体的基因水平表达谱,并拟合了包括年龄、性别和技术协变量在内的普通最小二乘线性模型,并进行了FDR校正。预先计算的结果存储在云数据库中,通过直观的web界面实现快速,无代码的探索。年龄效应探索者验证了已知的衰老标记,包括与年龄相关的EDA2R。这一资源使衰老转录组学大众化,促进了组织特异性衰老机制的发现。可用性和实现:年龄效应浏览器可以使用web浏览器访问https://menghui.shinyapps.io/ageeffectexplorer/。用于创建Shiny应用程序的代码以及教程可以在GitHub上找到https://github.com/ML198/GTEx-Explorer。
{"title":"Age effect explorer: a Shiny application to browse and visualize tissue-specific age-related gene expression changes.","authors":"Menghui Chen, Mingrui Li, Ronnie Y Li, Jie Jiang, Zhaohui S Qin","doi":"10.1093/bioadv/vbag026","DOIUrl":"10.1093/bioadv/vbag026","url":null,"abstract":"<p><strong>Motivation: </strong>Understanding age-related transcriptional changes in human tissues is crucial for elucidating molecular mechanisms of aging and disease. Current genomic analysis tools often require programming expertise, limiting accessibility for comprehensive aging studies. Here, we present Age Effect Explorer, an interactive R Shiny application for systematically analyzing age- and sex-related gene expression pattern changes across 54 human tissues using Genotype-Tissue Expression (GTEx) v10 data.</p><p><strong>Results: </strong>We obtained gene-level expression profiles from 981 individuals, and fitted ordinary least squares linear models including age, sex, and technical covariates with FDR correction. Pre-calculated results are stored in a cloud database enabling rapid, code-free exploration through an intuitive web interface. Age Effect Explorer validated known aging markers including age-correlated EDA2R. This resource democratizes access to aging transcriptomics, facilitating the discovery of tissue-specific aging mechanisms.</p><p><strong>Availability and implementation: </strong>The Age Effect Explorer can be accessed using a web browser at https://menghui.shinyapps.io/ageeffectexplorer/. 
The code used to create the Shiny application, along with a tutorial, can be found on GitHub at https://github.com/ML198/GTEx-Explorer.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"6 1","pages":"vbag026"},"PeriodicalIF":2.8,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12889165/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146168090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
InspectorORF: a tool for visualizing Ribo-Seq and additional genomic or transcriptomic data.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2026-01-27 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbag031
Eilidh L Ward, Isabel Birds, Mary J O'Connell, David R Westhead, Julie L Aspden

Motivation: The advent of ribosome profiling (an adaptation of RNA sequencing) to determine the translatome has greatly improved our understanding of which parts of the transcriptome are translated. Many alternative open reading frames (ORFs) are now regularly detected, such as out-of-frame, overlapping, upstream or downstream reading frames, and alternative reading frames using non-canonical start codons. Various tools have been developed for the detection of such novel ORFs, but they lack the capacity to visually inspect reads, an important aspect of validation and prediction of translation.

Results: The integrated visualisation of ribosome profiling and RNA sequencing reads enables discrimination between transcriptional and translational signals, facilitating validation of predicted novel open reading frames. Furthermore, the inclusion of complementary evidence such as proteomic and long-read sequencing data enables further validation of predicted novel open reading frames.

Availability and implementation: Here, we present InspectorORF (https://www.github.com/aylz83/inspectorORF), an R package that readily plots ribosome profiling reads alongside RNA sequencing reads across transcripts and/or ORFs. Additionally, custom information can be plotted, including data from additional conditions and samples, proteomic analyses, and reads from long-read sequencing.

Citations: 0
Benchmarking methods for genome annotation using nanopore direct RNA in a non-model crop plant.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2026-01-27 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbag030
Jade M Davis, Kristina K Gagalova, Lilian M V P Sanglard, Sabrina Cuellar, Mark R Gibberd, Fatima Naim

Motivation: High-quality genome annotations are essential for transcriptomic analyses investigating plant responses to environmental stress. While nanopore long-read direct RNA sequencing offers a powerful approach for improving genome annotations, studies benchmarking optimal tools for this process have primarily focused on animal models. In this study, we benchmarked five annotation tools: StringTie3, IsoQuant, Bambu, FLAIR, and FLAMES, using direct RNA data from barley infected with Net Form Net Blotch disease.

Results: We observed substantial variation across tools in isoform detection, structural completeness, splicing classification, and handling of 5' read truncation. Several tools successfully identified novel transcripts, with the two top-performing reference-guided approaches both detecting over 700 previously unannotated transcripts, including candidates with predicted roles in disease response. Our results highlight the importance of plant-specific benchmarking of bioinformatic tools and demonstrate the utility of direct RNA sequencing for improving genome annotations, supporting ongoing efforts to enhance reference resources for non-model plant species.

Availability and implementation: Benchmarking code is available at https://github.com/jadedavis5/benchmarking_paper. Datasets are described in the 'Data availability' section.

Citations: 0