Briefings in bioinformatics: latest articles (page 10)

Corrections to the following abstracts.

IF 7.7 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS

Briefings in bioinformatics

Pub Date : 2026-01-07 DOI: 10.1093/bib/bbag080

Citations: 0

Signal-based spatial domain identification of spatially resolved transcriptomics with multigraph fusion.

Pub Date : 2026-01-07 DOI: 10.1093/bib/bbag052

Yaxiong Ma, Yu Wang, Xiaoke Ma

Spatially resolved transcriptomics (SRT) measures transcriptomes of cells within intact biological tissues, providing unprecedented opportunities to investigate tissue micro-environments, where spatial domains are modeled as clusters of spatially neighboring cells. Current methods for identifying spatial domains from SRT data rely mainly on the expression profiles and spatial coordinates of cells and ignore intercellular interactions, resulting in high sensitivity and low accuracy. To bridge these gaps, we introduce a novel framework, called SiDMGF (Signal-based Domain identification with Multi-Graph Fusion), that integrates gene set-derived signaling and spatial graphs to jointly model biological context, spatial information, and gene expression in the cell embedding, thereby improving the accuracy and robustness of spatial domain identification. Experimental results demonstrate that SiDMGF consistently outperforms state-of-the-art methods across multiple benchmark datasets and achieves superior domain identification performance on diverse spatial sequencing platforms. Furthermore, we demonstrate that SiDMGF can also be effectively applied to cancer-related tissue samples, accurately delineating micro-environment heterogeneity within tumor slices.
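
To make the notion of a spatial graph concrete, the sketch below builds a k-nearest-neighbour graph from spot coordinates. This is only an illustration under assumptions: the abstract does not say how SiDMGF constructs its graphs, and the function name `knn_spatial_graph`, the choice of Euclidean distance, and the toy coordinates are all hypothetical.

```python
import math

def knn_spatial_graph(coords, k):
    """Map each cell index to the indices of its k nearest cells (Euclidean distance)."""
    graph = {}
    for i, p in enumerate(coords):
        # Distance from cell i to every other cell, skipping itself.
        dists = [(math.dist(p, q), j) for j, q in enumerate(coords) if j != i]
        dists.sort()  # nearest first; ties are broken by the smaller index
        graph[i] = [j for _, j in dists[:k]]
    return graph

# Made-up coordinates: two tight groups of spots that lie far apart.
coords = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
graph = knn_spatial_graph(coords, k=2)
```

With these toy coordinates every cell's neighbours fall inside its own group, which is the kind of local structure a spatial graph is meant to encode before it is combined with other graphs.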

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12893220/pdf/

Citations: 0

Global and local integrated gradient-based diffusion model for de novo drug design.

Pub Date : 2026-01-07 DOI: 10.1093/bib/bbag033

Sejin Park, Minjae Chung, Hyunju Lee

In de novo drug design, deep learning-based approaches have become essential to efficiently navigate the vast chemical space of drug-like molecules. Recently, diffusion-based models have attracted significant attention in the generation of target-binding molecules. However, these models have difficulty simultaneously optimizing binding affinity and drug-like properties, and they incur high computational costs because of the long, sequential denoising process. To address these limitations, we propose the Global and local integrated gradient-based Diffusion Model (GlintDM). GlintDM introduces a significantly faster denoising process, namely skip transition, by leveraging global gradients and local gradients. Owing to the fast denoising process, GlintDM can perform the following three phases during molecule generation: position refinement, candidate evaluation, and ligand resampling. These phases allow GlintDM to identify optimal binding positions to the target protein and generate molecules satisfying multi-objective molecular properties. As a result, GlintDM outperforms other methods on both the CrossDocked and Binding MOAD datasets for Vina-related scores. Further validation through the PoseBusters test and assessment of molecular properties, such as steric clash and geometric properties, confirms that GlintDM can generate stable and high-quality molecules.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12874906/pdf/

Citations: 0

scGACL: a generative adversarial network with multi-scale contrastive learning for accurate single-cell RNA sequencing imputation.

Pub Date : 2026-01-07 DOI: 10.1093/bib/bbag018

Yanlin Jiang, Mengyuan Zhao, Jiahui Yan, Jijun Tang, Fei Guo

Single-cell RNA sequencing is a powerful technology for investigating cell-to-cell heterogeneity, yet its application is often hindered by dropout events, making accurate imputation essential for downstream analyses. Existing imputation methods, however, frequently suffer from over-smoothing, which erases cell-to-cell heterogeneity in the imputed data and degrades downstream analyses. To overcome this limitation, we propose scGACL, a generative adversarial network (GAN) integrated with multi-scale contrastive learning. The GAN architecture encourages the distribution of the imputed data to approximate that of the real data. To fundamentally address over-smoothing, the model incorporates a multi-scale contrastive learning mechanism: cell-level contrastive learning preserves fine-grained cell-to-cell heterogeneity, while cell-type-level contrastive learning maintains macroscopic biological variation across different cellular groups. These mechanisms function synergistically to ensure accurate imputation and effectively address the over-smoothing challenge. Comprehensive evaluations across diverse simulated and real-world datasets confirm that scGACL consistently outperforms existing methods in accurately recovering gene expression and improving downstream analyses such as cell clustering, differential gene expression analysis, and cell trajectory inference.
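
As a rough illustration of what a contrastive objective rewards, the sketch below computes an InfoNCE-style loss for a single anchor embedding. It is an assumption-laden example: the abstract does not state which contrastive loss scGACL uses, and the function names, the temperature value, and the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE loss for one anchor: -log(exp(s_pos/t) / sum of exp(s/t) over all candidates)."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Made-up 2-D embeddings. In the first call the positive is close to the anchor;
# in the second call the positive is orthogonal and a near-duplicate sits among the negatives.
anchor = [1.0, 0.0]
loss_aligned = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
loss_misaligned = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

The loss is lower when the positive pair is similar and the negatives are dissimilar, so minimizing it keeps distinct cells (or cell types) apart in embedding space, which is the property the abstract relies on to counter over-smoothing.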

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12866930/pdf/

Citations: 0

Integrating multi-structure covalent docking with machine-learning consensus scoring enhances potency ranking of human acetylcholinesterase inhibitors.

Pub Date : 2026-01-07 DOI: 10.1093/bib/bbag028

Chaitanya K Jaladanki, Achal Ajeet Rayakar, Yap Xiu Huan, Hao Fan

Acetylcholinesterase (AChE) inhibition is a key mechanism in the treatment of neurodegenerative diseases and in counteracting toxic exposures to pesticides and nerve agents. However, accurately ranking the potency of covalently binding AChE inhibitors remains challenging due to the enzyme's structural flexibility and the chemical diversity of the inhibitors' covalent warheads. In this study, we developed an in silico protocol that integrates multi-structure covalent docking and machine-learning (ML) consensus scoring to improve docking-based potency ranking among covalent AChE inhibitors. We analyzed 65 ligand-bound (holo) human AChE crystal structures using hierarchical clustering to identify four representative conformations, along with one high-resolution apo structure, for multi-structure docking. A curated library of 412 organophosphate and carbamate inhibitors was then docked covalently and non-covalently into each receptor conformation. The resulting docking scores were evaluated against the inhibitors' experimental logIC50 values using Spearman's rank correlation coefficient (rs). Covalent docking outperformed non-covalent docking (rs values up to 0.54 versus 0.18), and our ML consensus model trained on the five structures' covalent docking scores achieved the highest predictive accuracy (rs = 0.70), surpassing all single-structure and heuristic consensus baselines. Chemical cluster analysis revealed structure-activity trends based on ligand flexibility, polarity, and aromaticity. SHapley Additive exPlanations analysis highlighted the ML consensus model's ability to flexibly distribute the influence that each structure's scores had on its predictions. It identified and exploited relationships based on its training dataset that would be difficult to anticipate through a manual analysis of individual structures' docking performance metrics. This framework is broadly applicable to other covalently targeted proteins, offering a generalizable and interpretable strategy for docking-based potency ranking.
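
For readers unfamiliar with the evaluation metric, the sketch below computes Spearman's rank correlation coefficient as the Pearson correlation of average ranks. It uses only the standard library; the helper names and the docking scores and logIC50 values are made up for illustration and are not data from the study.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rs: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up values: more negative docking scores paired with lower logIC50.
docking_scores = [-9.1, -7.4, -8.2, -6.0, -7.9]
log_ic50 = [-1.2, 0.3, 0.1, 1.1, -0.5]
rs = spearman(docking_scores, log_ic50)  # approximately 0.9 for this toy data
```

Because rs depends only on rank order, it suits potency ranking: a scoring function can be on an arbitrary scale and still score well as long as it orders the inhibitors correctly.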

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12866926/pdf/

Citations: 0

Comprehensive review and assessment of machine learning approaches for host-pathogen protein-protein interaction prediction.

Pub Date : 2026-01-07 DOI: 10.1093/bib/bbag051

Fatima Noor, Muhammad Tahir Ul Qamar

Predicting host-pathogen protein-protein interactions (PPIs) is a cornerstone of modern infectious disease research, offering unparalleled insights into the molecular mechanisms underlying infection and immune evasion. Despite its transformative potential, the field faces persistent challenges, including limited experimental data, class imbalance, and the dynamic evolution of pathogens. The current study explores cutting-edge computational approaches that have redefined host-pathogen protein-protein interaction (HP-PPI) prediction. Notably, transfer learning has emerged as a game changer, enabling models to leverage knowledge from well-characterized systems to predict interactions in previously underexplored pathogens. Hybrid and ensemble models have proven highly effective, combining the strengths of diverse algorithms to capture the complexity of biological interactions. Explainable AI tools are now bridging the gap between computational predictions and biological interpretability, offering actionable insights into key interaction drivers. Additionally, the review discusses advanced data integration techniques, such as multi-omics fusion and graph-based learning, which explore new dimensions in HP-PPI research. This synthesis of challenges, solutions, and future perspectives highlights a paradigm shift in computational biology, in which scalable, interpretable, and biologically informed models pave the way for breakthroughs in therapeutic discovery, vaccine development, and precision medicine. Our review sets the stage for future advancements, emphasizing the potential of next-generation technologies to unravel the intricate dance between hosts and pathogens.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12888821/pdf/

Citations: 0

UBD: incorporating uncertainty in cell type proportion estimates from bulk samples to infer cell-type-specific profiles.

Pub Date : 2026-01-07 DOI: 10.1093/bib/bbaf711

Youshu Cheng, Chen Lin, Hongyu Li, Ke Xu, Hongyu Zhao

Statistical deconvolution methods offer a powerful solution for estimating cell-type-specific (CTS) profiles from readily available bulk tissue data. However, a critical limitation of existing methods is that they require knowledge of the cell type proportions of each individual in the bulk data. Because the true cell type proportions in bulk samples are unknown, these methods substitute estimated proportions for the truth, which can introduce additional uncertainty into the inferred CTS profiles. To address this challenge, we propose Uncertainty-aware Bayesian Deconvolution (UBD) to incorporate uncertainty in cell type proportion estimates. By explicitly modeling the uncertainty in the initial estimates, UBD refines cell type proportions and estimates sample-level CTS data simultaneously. We show through extensive simulations that UBD can improve the estimates of CTS profiles. We further demonstrate the utility of UBD in revealing more CTS signals in applications to two real datasets.
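
The problem can be written down with a generic linear mixture model. The notation below is our own assumption for illustration and is not taken from the UBD paper, whose actual model may differ: $X_{ig}$ is the bulk expression of gene $g$ in sample $i$, $\pi_{ik}$ is the true proportion of cell type $k$, $Z_{ikg}$ is the sample-level CTS expression, and $\hat{\pi}_{ik}$ is an estimated proportion with error $\delta_{ik}$.

```latex
\begin{align}
X_{ig} &= \sum_{k=1}^{K} \pi_{ik}\, Z_{ikg} + \varepsilon_{ig},
  \qquad \sum_{k=1}^{K} \pi_{ik} = 1, \\
\hat{\pi}_{ik} &= \pi_{ik} + \delta_{ik}
  \;\Longrightarrow\;
X_{ig} = \sum_{k=1}^{K} \hat{\pi}_{ik}\, Z_{ikg}
  \;-\; \sum_{k=1}^{K} \delta_{ik}\, Z_{ikg} \;+\; \varepsilon_{ig}.
\end{align}
```

Under this sketch, plugging in $\hat{\pi}_{ik}$ as if it were exact leaves the term $-\sum_k \delta_{ik} Z_{ikg}$ unmodeled, so it is absorbed into the noise or biases the inferred $Z_{ikg}$. Treating $\delta_{ik}$ as a quantity with its own distribution is one way to read "incorporating uncertainty" in the abstract.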

Citations: 0

Dynamic-GLEP: a dynamics-informed deep learning framework for ligand efficacy prediction in representative Class A GPCRs.

Pub Date : 2026-01-07 DOI: 10.1093/bib/bbag049

Zhiyi Chen, Yongxin Hao, Yuhong Su, Hans Ågren, Mingan Chen, Zhehuan Fan, Duanhua Cao, Jiacheng Xiong, Wei Zhang, Jin Liu, Xutong Li, Mingyue Zheng, Xi Cheng, Dingyan Wang, Dan Teng

G protein-coupled receptors (GPCRs) represent the largest membrane protein family and remain central targets in drug discovery. Ligand efficacy reflects the ability to modulate receptor conformational states and extends beyond binding affinity to underpin functional selectivity. However, most computational approaches still emphasize affinity prediction, with limited capacity to capture the conformational dynamics driving efficacy. Here, we introduce Dynamic-GLEP, a structure- and mechanism-aware framework that integrates molecular dynamics (MD)-derived conformational ensembles with transfer learning on equivariant graph neural networks. By constructing multi-conformation receptor-ligand complexes and fine-tuning the EquiScore model, Dynamic-GLEP identifies conformation-dependent interaction features to distinguish agonists from nonagonists. Applied to the 5-HT1A receptor, the framework achieved an area under the curve (AUC) of 0.74 in cross-validation and 0.71 on an external Food and Drug Administration (FDA)-related dataset. Comparative analyses showed that Holo-based models are advantageous for scaffold optimization, whereas Apo-derived ensembles provided greater adaptability to chemically diverse ligands. Furthermore, extension to the adenosine A2A receptor yielded high performance (AUC > 0.85), underscoring the method's robustness and transferability under data-scarce conditions. Collectively, these results highlight Dynamic-GLEP as a reliable and interpretable platform for ligand efficacy prediction in Class A GPCRs, with broad potential to support virtual screening, candidate prioritization, and mechanism-driven drug design.
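
The reported AUC values can be interpreted with the sketch below, which computes the area under the ROC curve as the probability that a randomly chosen positive (here, an agonist) receives a higher score than a randomly chosen negative. The function name, labels, and scores are made up for illustration and are unrelated to the paper's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up predictions: 1 = agonist, 0 = non-agonist.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of the 9 positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect separation, which gives a scale for reading the values of 0.74, 0.71, and above 0.85 quoted in the abstract.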

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12900074/pdf/

Citations: 0

MCPmed: a call for Model Context Protocol-enabled bioinformatics web services for LLM-driven discovery.

Pub Date : 2026-01-07 DOI: 10.1093/bib/bbag076

Matthias Flotho, Ian Ferenc Diks, Philipp Flotho, Leidy-Alejandra G Molano, Pascal Hirsch, Andreas Keller

Bioinformatics web servers are critical resources in modern biomedical research, facilitating interactive exploration of datasets through custom-built interfaces with rich visualization capabilities. However, this mostly human-centric design limits machine readability for large language models (LLMs) and deep research agents. We address this gap by adapting the Model Context Protocol (MCP), a standardized, machine-actionable layer that explicitly associates web service endpoints with scientific concepts and detailed metadata, to bioinformatics web server backends. Our implementations across widely used databases (GEO, STRING, and UCSC Cell Browser) demonstrate enhanced exploration capabilities through MCP-enabled LLMs. To accelerate adoption, we propose MCPmed, a community effort supplemented by lightweight breadcrumbs for services not yet fully MCP-enabled and templates for setting up new servers. This structured transition aims to significantly enhance automation, reproducibility, and interoperability, preparing bioinformatics web services for next-generation research agents.
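
To show what "machine-actionable" means in practice, the sketch below builds the JSON-RPC 2.0 message that an MCP client sends to invoke a server tool via the `tools/call` method. The tool name `search_datasets` and its arguments are hypothetical placeholders; they are not taken from the MCPmed servers described in the abstract.

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build an MCP tool invocation as a JSON-RPC 2.0 request object."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical tool and arguments, for illustration only.
request = make_tool_call(1, "search_datasets", {"query": "glioblastoma", "limit": 5})
payload = json.dumps(request)
```

Because the request names the tool and passes typed arguments instead of scraping a web page, an LLM agent can discover and call a service from its declared schema, which is the gap in machine readability that the abstract describes.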



Citations: 0

GPCRact: a hierarchical framework for predicting ligand-induced GPCR activity via allosteric communication modeling.

IF 7.7 · Zone 2 (Biology) · Q1 BIOCHEMICAL RESEARCH METHODS

Briefings in bioinformatics

Pub Date : 2026-01-07 DOI: 10.1093/bib/bbaf719

Hyojin Son, Gwan-Su Yi

Accurate prediction of ligand-induced activity for G-protein-coupled receptors (GPCRs) is a cornerstone of drug discovery, yet it is challenged by the need to model allosteric communication, the long-range signaling that links ligand binding to distal conformational changes. Prevailing sequence-based models often fail to capture these three-dimensional dynamics, a limitation frequently masked by averaged performance on simpler Class A targets. To address this, we introduce GPCRact, a novel framework that models the biophysical principles of allosteric modulation in GPCR activation. It first constructs a high-resolution, three-dimensional structure-aware graph from the heavy-atom coordinates of functionally critical residues at binding and allosteric sites. A dual attention architecture then captures the activation process: cross-attention encodes the initial ligand-protein interaction at the binding site, whereas self-attention learns the subsequent intra-protein signal propagation. This hierarchical architecture is built upon an E(n)-Equivariant Graph Neural Network (EGNN) to explicitly model conformational consequences of ligand binding, and is further refined with a tailored loss function and inference logic to mitigate error propagation. Underpinned by GPCRactDB, a comprehensive database we constructed for this study, GPCRact not only achieves state-of-the-art performance but also demonstrates robustly superior accuracy on a curated benchmark of allosterically complex receptors where existing models systematically underperform. Crucially, analysis of the learned attention weights confirms that the model identifies biologically validated allosteric pathways, offering a significant step toward resolving the black-box nature of previous methods. Thus, GPCRact provides a more accurate, interpretable, and mechanistically grounded solution to a long-standing challenge, paving the way for effective structure-guided drug discovery.
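The two-stage attention pattern described above can be illustrated with a toy sketch: residues first attend to ligand atoms (cross-attention), then attend to each other (self-attention). This is not the GPCRact model. It assumes plain scaled dot-product attention with no learned projections, omits the EGNN backbone, loss function, and inference logic, and uses made-up two-dimensional feature vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention(queries, keys, values):
    """Scaled dot-product attention; returns (outputs, weights)."""
    scale = math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        w = softmax([dot(q, k) / scale for k in keys])
        dims = range(len(values[0]))
        outputs.append([sum(wi * v[d] for wi, v in zip(w, values)) for d in dims])
        weights.append(w)
    return outputs, weights


# Toy features (made-up numbers): 3 binding-site residues and 2 ligand atoms.
residues = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ligand = [[0.5, 0.5], [1.0, 0.0]]

# Stage 1: cross-attention, where residues (queries) attend to ligand atoms.
cross_out, cross_w = attention(residues, ligand, ligand)

# Residual update so each residue keeps its own features plus ligand context.
updated = [[r + c for r, c in zip(res, ctx)] for res, ctx in zip(residues, cross_out)]

# Stage 2: self-attention, where residues attend to one another.
self_out, self_w = attention(updated, updated, updated)
```

Each row of `cross_w` and `self_w` sums to 1. Weights of this kind are what the abstract refers to when it says attention analysis was used for interpretation, although the real model learns them from data.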



Citations: 0