PLoS Computational Biology: Latest Articles

FKSUDDAPre: A drug-disease association prediction framework based on F-TEST feature selection and AMDKSU resampling with interpretability analysis.
IF 3.6 Zone 2 Biology Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-02-05 Vol. 22(2): e1013947 DOI: 10.1371/journal.pcbi.1013947
Yun Zuo, Chenyi Zhang, Ge Hua, Qiao Ning, Xiangrong Liu, Xiangxiang Zeng, Zhaohong Deng
In drug discovery and therapeutic research, the prediction of drug-disease associations (DDAs) holds significant scientific and clinical value. Drug molecules exert their effects by precisely identifying disease-related biological targets, systematically modulating the entire pharmacological process from absorption, distribution, and metabolism to final efficacy. Accurate prediction of drug-disease associations not only facilitates an in-depth understanding of the molecular mechanisms of drug action but also provides critical theoretical foundations for drug repositioning and personalized medicine. While traditional prediction methods based on in vitro experiments and clinical statistics yield reliable results, they suffer from inherent drawbacks such as long development cycles, substantial resource consumption, and low throughput. In contrast, emerging machine learning techniques offer a promising solution to these bottlenecks, enabling the efficient discovery of potential drug-disease association networks and significantly improving drug development efficiency. However, existing machine learning methods still face significant challenges in practical applications: the complexity of feature construction raises the threshold for data processing; data sparsity constrains the depth of information mining; and the pervasive issue of sample imbalance poses a severe challenge to the model's predictive accuracy and generalization performance. In this study, we developed an efficient and accurate framework for drug-disease association prediction named FKSUDDAPre. The model employs a multi-modal feature fusion strategy: on one hand, it leverages an ensemble of Mol2vec and K-BERT to capture the semantic features of drug molecular fingerprints; on the other hand, it integrates Medical Subject Headings (MeSH) with DeepWalk to reduce the dimensionality of disease features while preserving their relational structure. To address the class imbalance problem, FKSUDDAPre uses an optimization algorithm called AMDKSU, which combines clustering with an improved distance metric strategy, significantly enhancing the discriminative power of the sample set. For data processing, an F-test is employed for feature importance ranking, reducing data dimensionality and improving model generalization. For the predictive architecture, FKSUDDAPre proposes a novel ensemble framework composed of XGBoost, Decision Tree, Random Forest, and HyperFast. By employing a dynamic weight allocation strategy, this ensemble harnesses the complementary strengths of these models to achieve significantly enhanced predictive performance. Rigorous validation demonstrated the system's outstanding performance across multiple evaluation metrics, with an average AUC of 0.9725, improving the AUC by approximately 3.88% compared to the best-performing baseline model. In the prediction of Alzheimer's disease and Parkinson's disease, 80% and 60%, respectively, of the top 10 candidate drugs recommended by FKSUDDAPre were confirmed by the literature, indicating that the model has good potential for practical application. In addition, we performed a LIME-based feature importance analysis of the model's predictions, visualizing the correlations between features and the target variable to demonstrate the model's interpretability. A cross-platform, user-friendly visualization tool was also developed using the PyQt5 framework.
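The F-test ranking step mentioned in the abstract can be illustrated with a small sketch. This is not the FKSUDDAPre code: it assumes the F-test is the usual one-way ANOVA F-statistic computed per feature against the class label, and the names `f_statistic` and `rank_features` and the toy data are hypothetical.

```python
from statistics import mean

def f_statistic(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Between-class variability: distance of each class mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Within-class variability: spread of samples around their own class mean.
    ss_within = 0.0
    for g in groups.values():
        centre = mean(g)
        ss_within += sum((v - centre) ** 2 for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def rank_features(X, y, top_k):
    """Return the indices of the top_k columns of X by descending F-statistic."""
    n_features = len(X[0])
    scores = [f_statistic([row[j] for row in X], y) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:top_k], scores

# Toy data (hypothetical): column 0 separates the classes, column 1 does not.
X = [[0.1, 5.0], [0.2, 3.0], [0.0, 4.0], [1.1, 4.5], [0.9, 3.5], [1.0, 4.0]]
y = [0, 0, 0, 1, 1, 1]
selected, scores = rank_features(X, y, top_k=1)
print(selected)
```

Features with a large ratio of between-class to within-class variance are kept; the rest are dropped before the classifier is trained. How many features the framework actually retains is not stated in the abstract.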
Citations: 0
Modeling human visuomotor adaptation with a disturbance observer framework.
IF 3.6 Zone 2 Biology Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-02-04 Vol. 22(2): e1013937 DOI: 10.1371/journal.pcbi.1013937
Gaurav Sharma, Bernard Marius 't Hart, Jean-Jacques Orban de Xivry, Denise Y P Henriques, Mireille E Broucke

A fundamental problem in visuomotor adaptation research is to understand how the brain can asymptotically remove a predictable exogenous disturbance from a visual error signal, using limited sensor information, by recalibrating hand movement. From a control theory perspective, the most striking aspect of this problem is that it falls squarely within the realm of the internal model principle. Despite this, the relationship between the internal model principle and models of visuomotor adaptation is currently not well developed. This paper aims to close this gap by proposing an abstract discrete-time state-space model of visuomotor adaptation based on the internal model principle. The proposed DO Model, named for its most important component, a disturbance observer, addresses key modeling requirements: modular architecture, physically relevant signals, parameters tied to atomic behaviors, and capacity for abstraction. Its two main computational modules are a disturbance observer, a recently developed class of internal models, and a feedforward system that learns from the disturbance observer to improve feedforward motor commands.
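A scalar toy model can make the idea of asymptotic disturbance rejection concrete. The sketch below is not the authors' DO Model, which has separate observer and feedforward modules; it collapses everything into a single estimate, assumes a constant disturbance (for which the internal model is an integrator), and uses a hypothetical function name and parameter values.

```python
def simulate_adaptation(disturbance, gain, n_trials):
    """Scalar discrete-time sketch of error-driven disturbance cancellation.

    An observer estimates a constant visual disturbance from the error signal
    alone, and the motor command opposes that estimate on the next trial.
    """
    d_hat = 0.0                      # observer's current estimate of the disturbance
    errors = []
    for _ in range(n_trials):
        u = -d_hat                   # motor command opposes the estimated disturbance
        e = disturbance + u          # visual error observed on this trial
        errors.append(e)
        d_hat = d_hat + gain * e     # integrator-style observer update
    return errors

# Hypothetical values: a 30-degree rotation and an observer gain of 0.2.
errors = simulate_adaptation(disturbance=30.0, gain=0.2, n_trials=60)
print(errors[0], abs(errors[-1]) < 1e-3)
```

With this update the error obeys e[k+1] = (1 - gain) * e[k], so it decays geometrically to zero for any gain between 0 and 2, which is the asymptotic removal of the disturbance described in the abstract.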

Citations: 0
Phase resetting in human stem cell derived cardiomyocytes explains complex cardiac arrhythmias.
IF 3.6 Zone 2 Biology Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-02-04 Vol. 22(2): e1013935 DOI: 10.1371/journal.pcbi.1013935
Khady Diagne, Thomas M Bury, Morgan E Pettebone, Marc W Deyell, Zachary Laksman, Alvin Shrier, Leon Glass, Gil Bub, Emilia Entcheva

Phase resetting of cardiac oscillators underlies some complex arrhythmias. Here we use optogenetic stimulation to construct phase response curves (PRCs) for spheroids of human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CM) and a computational cardiomyocyte model to identify ionic mechanisms shaping the PRC. The clinical utility of the human PRCs is demonstrated by adding a patient-based conduction delay to the same equations to explain complex multi-day Holter ECG dynamics and cardiac arrhythmias. Periodic stimulation of these patient-based models and the computational model of human iPSC-CM reveals similar bifurcation patterns and entrainment zones. Cell therapy by injecting iPSC-CM into diseased hearts can induce ectopic foci-based engraftment arrhythmias. The PRC analysis offers a potential strategy to entrain these foci in a parameter space that avoids such arrhythmias.
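For readers unfamiliar with phase response curves, the following sketch shows how one could be tabulated from perturbed cycle lengths. It is an illustration under stated assumptions, not the study's analysis: phase is normalized to [0, 1), a positive shift means the next beat arrived early, and the function names and measurements are hypothetical.

```python
def phase_response(t0, measurements):
    """Turn (stimulus_time, perturbed_period) pairs into (phase, shift) points.

    t0 is the unperturbed beating period; stimulus_time is measured from the
    preceding beat. A positive shift means the next beat was advanced.
    """
    curve = []
    for stim_time, perturbed_period in measurements:
        phase = (stim_time / t0) % 1.0           # where in the cycle the pulse landed
        shift = (t0 - perturbed_period) / t0     # normalized change in cycle length
        curve.append((phase, shift))
    return sorted(curve)

def new_phase(phase, shift):
    """Phase immediately after the pulse, wrapped to [0, 1)."""
    return (phase + shift) % 1.0

# Hypothetical measurements in seconds, not data from the study.
prc = phase_response(1.0, [(0.75, 0.875), (0.25, 1.125)])
print(prc)
```

Iterating `new_phase` under periodic stimulation gives a one-dimensional map whose fixed points and bifurcations correspond to the entrainment zones the abstract refers to.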

Citations: 0
TARPON-A Telomere Analysis and Research Pipeline Optimized for Nanopore.
IF 3.6 Zone 2 Biology Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-02-04 eCollection Date: 2026-02-01 Vol. 22(2): e1013915 DOI: 10.1371/journal.pcbi.1013915 Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12871981/pdf/
Nathaniel Deimler, David V Ho, Norbert Paul, Zoë Gill, Peter Baumann

Long-read sequencing has transformed many areas of biology and holds significant promise for telomere research by enabling analysis of nucleotide-level resolution chromosome arm-specific telomere length in both model organisms and humans. However, the adoption of new technologies, particularly in clinical or diagnostic contexts, requires careful validation to recognize potential technical and computational limitations. We present TARPON (Telomere Analysis and Research Pipeline Optimized for Nanopore), a best-practices Nextflow pipeline designed for the analysis of telomeres sequenced on the Oxford Nanopore Technologies (ONT) platform. TARPON can be executed via the command line or integrated into ONT's EPI2ME agent, providing a user-friendly graphical interface for those without computational training. Nextflow's container-based architecture eliminates dependency conflicts, thereby streamlining deployment across platforms. TARPON isolates telomeric repeat-containing reads, assigns strand specificity, and identifies enrichment probes that can be used both for demultiplexing and for confirming capture-based library preparation. To ensure that the analysis is restricted to full-length telomeres, reads lacking a capture probe or non-telomeric sequence on the opposite end are excluded. A sliding-window approach defines the subtelomere-to-telomere boundary, followed by quality filtering to remove low-quality or subtelomeric reads that passed earlier steps. The pipeline generates customizable statistics, text-based summaries, and publication-ready visualizations (HTML, PNG, PDF). While default settings are optimized for diagnostic workflows, all parameters are easily adjustable via the GUI or command line to support diverse applications. These include telomere analyses in variant-rich samples (e.g., ALT-positive tumors) and organisms with non-canonical telomeric repeats such as some insects (GTTAG) and certain plants (GGTTTAG). 
TARPON is the first complete and experimentally validated pipeline for Nanopore-based telomere analysis requiring no data pre-processing or prior bioinformatics expertise, while offering flexibility for advanced users.
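The sliding-window boundary step can be sketched in a few lines. This is not TARPON's implementation, whose details are not given in the abstract: it assumes exact matches to a single repeat unit on one strand, and the function name, window size, step, density threshold, and test sequence are all hypothetical.

```python
def telomere_boundary(read, repeat="TTAGGG", window=30, step=6, min_density=0.8):
    """Estimate where the subtelomere ends and the telomeric repeat array begins.

    Returns the index of the first repeat-covered base inside the first window
    whose repeat density reaches min_density, or None if no window qualifies.
    """
    # Mark every base covered by an exact copy of the repeat unit.
    covered = [False] * len(read)
    start = read.find(repeat)
    while start != -1:
        for i in range(start, start + len(repeat)):
            covered[i] = True
        start = read.find(repeat, start + 1)
    # Slide a fixed-size window from the subtelomeric side toward the telomere.
    for pos in range(0, len(read) - window + 1, step):
        in_window = covered[pos:pos + window]
        if sum(in_window) / window >= min_density:
            return pos + in_window.index(True)
    return None

subtelomere = "ACGTACGTCAGTCATGCATGACTGACTAGCTAGCATCGA"  # hypothetical sequence
read = subtelomere + "TTAGGG" * 20
print(telomere_boundary(read) == len(subtelomere))
```

Changing `repeat` is how a sketch like this would accommodate the non-canonical repeats mentioned above, such as GTTAG or GGTTTAG. A production pipeline would also need to tolerate sequencing errors and handle the complementary strand, which this example does not attempt.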

Citations: 0
SHADE: A multilevel Bayesian framework for modeling directional spatial interactions in tissue microenvironments.
IF 3.6 Zone 2 Biology Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-02-04 Vol. 22(2): e1013930 DOI: 10.1371/journal.pcbi.1013930
Joel Eliason, Michele Peruzzi, Arvind Rao

Motivation: Understanding how different cell types interact spatially within tissue microenvironments is critical for deciphering immune dynamics, tumor progression, and tissue organization. Many current spatial analysis methods assume symmetric associations or compute image-level summaries separately without sharing information across patients and cohorts, limiting biological interpretability and statistical power.

Results: We present SHADE (Spatial Hierarchical Asymmetry via Directional Estimation), a multilevel Bayesian framework for modeling asymmetric spatial interactions across scales. SHADE quantifies direction-specific cell-cell associations using smooth spatial interaction curves (SICs) and integrates data across tissue sections, patients, and cohorts. Through simulation studies, SHADE demonstrates improved accuracy, robustness, and interpretability over existing methods. Application to colorectal cancer multiplexed imaging data demonstrates SHADE's ability to quantify directional spatial patterns while controlling for tissue architecture confounders and capturing substantial patient-level heterogeneity. The framework successfully identifies biologically interpretable spatial organization patterns, revealing that local microenvironmental structure varies considerably across patients within molecular subtypes.

Citations: 0
BiCLUM: Bilateral contrastive learning for unpaired single-cell multi-omics integration.
IF 3.6 Zone 2 Biology Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-02-03 Vol. 22(2): e1013932 DOI: 10.1371/journal.pcbi.1013932
Yin Guo, Izaskun Mallona, Mark D Robinson, Limin Li

The integration of single-cell multi-omics data provides a powerful approach for understanding the complex interplay between different molecular modalities, such as RNA expression, chromatin accessibility and protein abundance, measured through assays like scRNA-seq, scATAC-seq and CITE-seq, at single-cell resolution. However, most existing single-cell technologies focus on individual modalities, limiting a comprehensive understanding of their interconnections. Integrating such diverse and often unpaired datasets remains a challenging task due to unknown cell correspondences across distinct feature spaces and limited insights into cell-type-specific activities in non-scRNA-seq modalities. In this work, we propose BiCLUM, a Bilateral Contrastive Learning approach for Unpaired single-cell Multi-omics integration, which simultaneously enforces cell-level and feature-level alignment across modalities. BiCLUM first transforms one modality, such as scATAC-seq, into the data space of another modality, such as scRNA-seq, using prior genomic knowledge. It then learns cell and gene embeddings simultaneously through a bilateral contrastive learning framework, incorporating both cell-level and feature-level contrastive losses. Across multiple RNA+ATAC and RNA+protein datasets, BiCLUM consistently outperforms or matches existing integration methods in both visualization and quantitative benchmarks. Importantly, BiCLUM embeddings preserve biologically meaningful regulatory relationships between chromatin accessibility and gene expression, as evidenced by significantly higher gene-peak correlations than random controls. Downstream analyses further demonstrate that BiCLUM-derived embeddings facilitate transcription factor activity inference, identification of cell-type-specific marker genes, functional enrichment, and cell-cell interaction mapping. 
Comprehensive hyperparameter sensitivity and ablation analyses further establish BiCLUM as a robust and interpretable framework that not only achieves effective cross-modal alignment but also retains the underlying regulatory and functional landscape across single-cell modalities.
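To show what a two-sided contrastive objective looks like, here is a minimal sketch. The abstract does not say which loss BiCLUM uses, so this example substitutes InfoNCE, a common contrastive loss, and assumes that candidate positive pairs between modalities are already given (row i matches row i). The function names, the temperature, the weighting, and the toy embeddings are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: row i of anchors should match row i of positives,
    with every other row of positives acting as a negative."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [dot(ai, pj) / temperature for pj in p]
        m = max(logits)  # subtract the max to stabilise the log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)

def bilateral_loss(cells_a, cells_b, feats_a, feats_b, weight=1.0):
    """Cell-level term plus a weighted feature-level term (assumed combination)."""
    return info_nce(cells_a, cells_b) + weight * info_nce(feats_a, feats_b)

# Toy embeddings (hypothetical): matched rows point in similar directions.
rna = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
atac = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]
aligned = info_nce(rna, atac)
shuffled = info_nce(rna, atac[::-1])
print(aligned < shuffled)
```

The loss is small when matched rows are close and mismatched rows are far apart, so minimizing the cell-level and feature-level terms together pulls both cells and genes from the two modalities into a shared space.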

Citations: 0
Modelling chemotaxis of branched cells in complex environments provides insights into immune cell navigation.
IF 3.6 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-02-03 eCollection Date: 2026-02-01 DOI: 10.1371/journal.pcbi.1013934
Jiayi Liu, Jonathan E Ron, Giulia Rinaldi, Ivanna Williantarra, Antonios Georgantzoglou, Ingrid de Vries, Michael Sixt, Milka Sarris, Nir S Gov

Cell migration in vivo is often guided by chemical signaling, i.e., chemotaxis. For immune cells performing chemotaxis in the organism, this process is influenced by the complex geometry of the tissue environment. In this study, we use a theoretical model of branched cell migration on a network to explore the cellular response to chemical gradients. The model predicts the response of a branched cell to a chemical gradient: how the cell reorients its internal polarity and how it navigates through a complex environment up a chemical gradient. We then compare the model's predictions with experimental observations of neutrophils migrating to the site of a laser-inflicted wound in a zebrafish larva fin, and neutrophils migrating in vitro inside a regular lattice of pillars. We find that the model captures the details of the subcellular response to the chemokine gradient, as well as qualitative characteristics of the large-scale migration, suggesting that the neutrophils behave as fast cells, which explains the functionality of these immune cells.

Citations: 0
Information uncertainty influences learning strategy from sequentially delayed rewards.
IF 3.6 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-02-02 DOI: 10.1371/journal.pcbi.1013879
Sean R Maulhardt, Alec Solway, Caroline J Charpentier

When receiving a reward after a sequence of multiple events, how do we determine which event caused the reward? This problem, known as temporal credit assignment, can be difficult for humans to solve given the temporal uncertainty in the environment. Research to date has attempted to isolate dimensions of delay and reward during decision-making, but algorithmic solutions to temporal learning problems and the effect of uncertainty on learning remain underexplored. To further our understanding, we adapted a reward learning task that creates a temporal credit assignment problem by combining sequentially delayed rewards, intervening events, and varying uncertainty via the amount of information presented during feedback. Using computational modeling, two learning strategies were developed: an eligibility trace, whereby previously selected actions are updated as a function of the temporal sequence, and a tabular update, whereby only systematically related past actions (rather than unrelated intervening events) are updated. We hypothesized that reduced information uncertainty would correlate with increased use of the tabular strategy, given the model's capacity to incorporate additional feedback information. Both models effectively learned the task, and predicted choices made by participants (N = 142) as well as specific behavioral signatures of credit assignment. Consistent with our hypothesis, the tabular model outperformed the eligibility model under low information uncertainty, as evidenced by more accurate predictions of participants' behavior and an increase in tabular weight. These findings provide new insights into the mechanisms implemented by humans to solve temporal credit assignment and adapt their strategy in varying environments.
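The two strategies named in the abstract can be contrasted with a minimal sketch. The abstract gives no update equations, so everything here is an assumption made for illustration: the function names, the learning rate `alpha`, the trace `decay`, and the `lag` that marks which past action is systematically related to the reward are all hypothetical, and this is not the authors' model.

```python
def eligibility_trace_update(values, history, reward, alpha=0.1, decay=0.5):
    """Credit every past action, weighted by how recently it was chosen.

    Hypothetical rule: the action taken k steps back has trace decay**k.
    """
    updated = dict(values)
    for steps_back, action in enumerate(reversed(history)):
        trace = decay ** steps_back  # most recent action has trace 1.0
        updated[action] += alpha * trace * (reward - updated[action])
    return updated


def tabular_update(values, history, reward, lag, alpha=0.1):
    """Credit only the action taken `lag` steps back (assumed to be the cause).

    Intervening actions are left untouched.
    """
    updated = dict(values)
    if lag < len(history):
        action = history[-1 - lag]
        updated[action] += alpha * (reward - updated[action])
    return updated


# Three actions were taken in order; a reward of 1.0 then arrives.
values = {"A": 0.0, "B": 0.0, "C": 0.0}
history = ["A", "B", "C"]
trace_values = eligibility_trace_update(values, history, reward=1.0)
table_values = tabular_update(values, history, reward=1.0, lag=2)
```

In this sketch the trace rule spreads credit over all three actions, with the largest share going to the most recent one, while the tabular rule assigns it only to the action two steps back. That difference is what a known delay structure would let a learner exploit.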

Citations: 0
Multiscale segmentation using hierarchical phase-contrast tomography and deep learning.
IF 3.6 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-02-02 eCollection Date: 2026-02-01 DOI: 10.1371/journal.pcbi.1013923
Yang Zhou, Shahab Aslani, Yousef Javanmardi, Joseph Brunet, David Stansby, Saskia Carroll, Alexandre Bellier, Maximilian Ackermann, Paul Tafforeau, Peter D Lee, Claire L Walsh

Biomedical systems span multiple spatial scales, encompassing tiny functional units to entire organs. Interpreting these systems through image segmentation requires the effective propagation and integration of information across different scales. However, most existing segmentation methods are optimised for single-scale imaging modalities, limiting their ability to capture and analyse small functional units throughout complete human organs. To facilitate multiscale biomedical image segmentation, we utilised Hierarchical Phase-Contrast Tomography (HiP-CT), an advanced imaging modality that can generate 3D multiscale datasets from high-resolution volumes of interest (VOIs) at ca. 1 μm/voxel to whole-organ scans at ca. 20 μm/voxel. Building on these hierarchical multiscale datasets, we developed a deep learning-based segmentation pipeline that is initially trained on manually annotated high-resolution HiP-CT data and then extended to lower-resolution whole-organ scans using pseudo-labels generated from high-resolution predictions and multiscale image registration. As a case study, we focused on glomeruli in human kidneys, benchmarking four 3D deep learning models for biomedical image segmentation on a manually annotated high-resolution dataset extracted from VOIs, at 2.58 to ca. 5 μm/voxel, of four human kidneys. Among them, nnUNet demonstrated the best performance, achieving an average test Dice score of 0.906, and was subsequently used as the baseline model for multiscale segmentation in the pipeline. Applying this pipeline to two low-resolution full-organ datasets at ca. 25 μm/voxel, the model identified 1,019,890 and 231,179 glomeruli in a 62-year-old donor without kidney diseases and a 94-year-old hypertensive donor, respectively, enabling comprehensive morphological analyses, including cortical spatial statistics and glomerular distributions, which aligned well with previous anatomical studies. Our results highlight the effectiveness of the proposed pipeline for segmenting small functional units in multiscale bioimaging datasets and suggest its broader applicability to other organ systems.
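The Dice score reported in the abstract measures the overlap between a predicted mask and a reference mask. The following is a minimal sketch of the standard definition on flattened binary masks; the function name `dice_score` and the toy masks are illustrative assumptions, not part of the authors' pipeline.

```python
def dice_score(pred, truth):
    """Dice = 2|P ∩ T| / (|P| + |T|) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels labelled foreground in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy flattened masks: the prediction marks two voxels, the reference one.
predicted = [1, 1, 0, 0]
reference = [1, 0, 0, 0]
score = dice_score(predicted, reference)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no foreground voxel, so the reported 0.906 indicates substantial but not complete overlap with the manual annotation.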

Citations: 0
Learning genetic perturbation effects with variational causal inference.
IF 3.6 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-02-02 DOI: 10.1371/journal.pcbi.1013194
Emily Liu, Jiaqi Zhang, Caroline Uhler

Advances in sequencing technologies have enhanced the understanding of gene regulation in cells. In particular, Perturb-seq has enabled high-resolution profiling of the transcriptomic response to genetic perturbations at the single-cell level. This understanding has implications in functional genomics and potentially for identifying therapeutic targets. Various computational models have been developed to predict perturbational effects. While deep learning models excel at interpolating observed perturbational data, they tend to overfit when data are scarce and may not generalize well to unseen perturbations. In contrast, mechanistic models, such as linear causal models based on gene regulatory networks, hold greater potential for extrapolation, as they encapsulate regulatory information that can predict responses to unseen perturbations. However, their application has been limited to small studies due to overly simplistic assumptions, making them less effective in handling noisy, large-scale single-cell data. We propose a hybrid approach that combines a mechanistic causal model with variational deep learning, termed Single Cell Causal Variational Autoencoder (SCCVAE). The mechanistic model employs a learned regulatory network to represent perturbational changes as shift interventions that propagate through the learned network. SCCVAE integrates this mechanistic causal model into a variational autoencoder, generating rich, comprehensive transcriptomic responses. Our results indicate that SCCVAE exhibits superior performance over current state-of-the-art baselines for extrapolating to predict unseen perturbational responses. Additionally, for the observed perturbations, the latent space learned by SCCVAE allows for the identification of functional perturbation modules and simulation of single-gene knockdown experiments of varying penetrance, presenting a robust tool for interpreting and interpolating perturbational responses at the single-cell level.
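The idea of a shift intervention propagating through a regulatory network can be illustrated with a plain linear causal model on a directed acyclic graph. This is only a sketch under stated assumptions: the three-gene chain, the edge weights, and the function name `propagate_shift` are hypothetical, and SCCVAE itself learns its network and embeds it in a variational autoencoder, which is not reproduced here.

```python
def propagate_shift(edges, shift, order):
    """Propagate a shift intervention through a linear causal model.

    edges: {(parent, child): weight}, hypothetical linear regulatory effects
    shift: {gene: additive shift applied directly to that gene}
    order: genes listed so that every parent precedes its children
    Returns the resulting change in expression for every gene.
    """
    delta = dict.fromkeys(order, 0.0)
    for gene in order:  # parents are visited before children
        inherited = sum(
            weight * delta[parent]
            for (parent, child), weight in edges.items()
            if child == gene
        )
        delta[gene] = shift.get(gene, 0.0) + inherited
    return delta


# Hypothetical chain: g0 activates g1 (0.5), and g1 represses g2 (-2.0).
edges = {("g0", "g1"): 0.5, ("g1", "g2"): -2.0}
order = ["g0", "g1", "g2"]
full_knockdown = propagate_shift(edges, {"g0": -1.0}, order)
partial_knockdown = propagate_shift(edges, {"g0": -0.5}, order)
```

Scaling the shift, as in `partial_knockdown`, is one simple way to picture a knockdown of reduced strength in this toy model: because the model is linear, every downstream change scales by the same factor.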

Citations: 0