Cell Systems: latest articles

Evaluation of machine learning-assisted directed evolution across diverse combinatorial landscapes.
IF 7.7 Pub Date : 2025-09-17 Epub Date: 2025-09-10 DOI: 10.1016/j.cels.2025.101387
Francesca-Zhoufan Li, Jason Yang, Kadina E Johnston, Emre Gürsoy, Yisong Yue, Frances H Arnold

Various machine learning-assisted directed evolution (MLDE) strategies have been shown to identify high-fitness protein variants more efficiently than typical directed evolution approaches. However, limited understanding of the factors influencing MLDE performance across diverse proteins has hindered optimal strategy selection for wet-lab campaigns. To address this, we systematically analyzed multiple MLDE strategies, including active learning and focused training using six distinct zero-shot predictors, across 16 diverse protein fitness landscapes. By quantifying landscape navigability with six attributes, we found that MLDE offers a greater advantage on landscapes that are more challenging for directed evolution, especially when focused training is combined with active learning. Despite varying levels of advantage across landscapes, focused training with zero-shot predictors leveraging distinct evolutionary, structural, and stability knowledge sources consistently outperforms random sampling for both binding interactions and enzyme activities. Our findings provide practical guidelines for selecting MLDE strategies for protein engineering. A record of this paper's transparent peer review process is included in the supplemental information.
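
The focused-training idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "focused training" means drawing the training variants from the subset ranked highest by a zero-shot predictor rather than uniformly from the whole library. The function name, the `top_fraction` parameter, and its default value are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def focused_training_set(zero_shot_scores, n_train, top_fraction=0.125, seed=0):
    """Sample training variants at random from the top-ranked zero-shot subset.

    Assumption: a higher zero-shot score means a higher predicted fitness.
    `top_fraction` is a hypothetical knob, not a value from the paper.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(zero_shot_scores, dtype=float)
    # Size of the focused pool: at least n_train, at most the whole library.
    n_top = max(n_train, int(round(top_fraction * scores.size)))
    n_top = min(n_top, scores.size)
    # Indices of the n_top highest-scoring variants.
    top_idx = np.argsort(scores)[::-1][:n_top]
    # Draw the training set without replacement from the focused pool.
    return rng.choice(top_idx, size=n_train, replace=False)


# Toy library of 1000 variants whose zero-shot score equals their index.
toy_scores = np.arange(1000)
train_idx = focused_training_set(toy_scores, n_train=24)
```

Random sampling, the baseline named in the abstract, would correspond to setting `top_fraction=1.0` in this sketch.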

scTrace+: Enhancing cell fate inference by integrating the lineage-tracing and multi-faceted transcriptomic similarity information.
IF 7.7 Pub Date : 2025-09-17 Epub Date: 2025-09-10 DOI: 10.1016/j.cels.2025.101398
Wenbo Guo, Zeyu Chen, Xinqi Li, Jingmin Huang, Qifan Hu, Jin Gu

Deciphering the cell state dynamics is crucial for understanding biological processes. Single-cell lineage-tracing technologies provide an effective way to track single-cell lineages by heritable DNA barcodes, but the high missing rates of lineage barcodes and the intra-clonal heterogeneity bring great challenges to dissecting the mechanisms of cell fate decision. Here, we systematically evaluate the features of single-cell lineage-tracing data and then develop an algorithm, scTrace+, to enhance the cell dynamic traces by incorporating multi-faceted transcriptomic similarities into lineage relationships via a kernelized probabilistic matrix factorization model. We assess its feasibility and performance by conducting ablation and benchmarking experiments on multiple real datasets and show that scTrace+ can accurately predict the fates of cells. Further, scTrace+ effectively identifies some important driver genes implicated in cellular fate decisions of diverse biological processes, such as cell differentiation or tumor drug responses. A record of this paper's transparent peer review process is included in the supplemental information.

ProtRNA: A protein-derived RNA language model by cross-modality transfer learning.
IF 7.7 Pub Date : 2025-09-17 Epub Date: 2025-08-22 DOI: 10.1016/j.cels.2025.101371
Ruoxi Zhang, Ben Ma, Gang Xu, Jianpeng Ma

Protein language models (PLMs), such as the highly successful ESM-2, have proven particularly effective. However, language models designed for RNA continue to face challenges. A key question is as follows: can the information derived from PLMs be harnessed and transferred to RNA? To investigate this, a model termed ProtRNA has been developed by a cross-modality transfer learning strategy for addressing the challenges posed by RNA's limited and less conserved sequences. By leveraging the evolutionary and physicochemical information encoded in protein sequences, the ESM-2 model is adapted to processing "low-resource" RNA sequence data. The results show comparable or superior performance in various RNA downstream tasks, with only 1/8 the trainable parameters and 1/6 the training data employed by the primary reference baseline RNA language model. This approach highlights the potential of cross-modality transfer learning in biological language models.

Learning antibody sequence constraints from allelic inclusion.
IF 7.7 Pub Date : 2025-09-17 Epub Date: 2025-08-13 DOI: 10.1016/j.cels.2025.101368
Milind Jagota, Chloe Hsu, Thomas Mazumder, Kevin Sung, William S DeWitt, Jennifer Listgarten, Frederick A Matsen IV, Chun Jimmie Ye, Yun S Song

Although antibody sequences are highly diverse, they are constrained by requirements for expression and limited off-target reactivity. Describing which sequences violate such constraints has proven to be difficult. Here, we introduce a machine-learning framework to leverage a previously underutilized source of data for this problem. We use human single-cell sequencing data to find instances of allelic inclusion, a rare event where B cells express two different antibody light chains as mRNA. Previous studies suggest that one of these chains is either autoreactive or non-expressing as protein. We train machine-learning models to identify abnormal sequences associated with allelic inclusion. The resulting models generalize to predict antibody properties including polyreactivity, surface expression, and mutation usage, outperforming methods that do not use allelic inclusion data. We also investigate similar selection forces on the heavy chain in mice and observe that surrogate light-chain pairing has a large impact on heavy-chain diversity.

Chemotherapy modulation by a cancer-associated microbiota metabolite.
IF 7.7 Pub Date : 2025-09-17 Epub Date: 2025-09-10 DOI: 10.1016/j.cels.2025.101397
Daniel Martinez-Martinez, Tanara V Peres, Kristin Gehling, Leonor Quintaneiro, Cecilia Cabrera, Maksym Cherevatenko, Stephen J Cutty, Lena Best, Georgios Marinos, Johannes Zimmerman, Ayesha Safoor, Despoina Chrysostomou, Joao B Mokochinski, Alex Montoya, Susanne Brodesser, Michalina Zatorska, Timothy Scott, Ivan Andrew, Holger Kramer, Masuma Begum, Bian Zhang, Bernard T Golding, Julian R Marchesi, Susumu Hirabayashi, Christoph Kaleta, Alexis R Barr, Christian Frezza, Helena M Cochemé, Filipe Cabreiro

Understanding how the microbiota produces regulatory metabolites is of significance for cancer and cancer therapy. Using a host-microbe-drug-nutrient 4-way screening approach, we evaluated the role of nutrition at the molecular level in the context of 5-fluorouracil toxicity. Notably, our screens identified the metabolite 2-methylisocitrate, which was found to be produced and enriched in human tumor-associated microbiota. 2-methylisocitrate exhibits anti-proliferative properties across genetically and tissue-diverse cancer cell lines, three-dimensional (3D) spheroids, and an in vivo Drosophila gut tumor model, where it reduced tumor dissemination and increased survival. Chemical landscape interaction screens identified drug-metabolite signatures and highlighted the synergy between 5-fluorouracil and 2-methylisocitrate. Multi-omic analyses revealed that 2-methylisocitrate acts via multiple cellular pathways linking metabolism and DNA damage to regulate chemotherapy. Finally, we converted 2-methylisocitrate into its trimethyl ester, thereby enhancing its potency. This work highlights the great impact of microbiome-derived metabolites on tumor proliferation and their potential as promising co-adjuvants for cancer treatment.

TissueMosaic: Self-supervised learning of tissue representations enables differential spatial transcriptomics across samples.
IF 7.7 Pub Date : 2025-09-17 Epub Date: 2025-09-08 DOI: 10.1016/j.cels.2025.101394
Sandeep Kambhampati, Luca D'Alessio, Fedor Grab, Stephen Fleming, Sophia Liu, Ruth Raichur, Fei Chen, Mehrtash Babadi

Spatial transcriptomics allows for the measurement of gene expression within the native tissue context. However, despite technological advancements, computational methods to link cell states with their microenvironment and compare these relationships across samples and conditions remain limited. To address this, we introduce Tissue Motif-Based Spatial Inference across Conditions (TissueMosaic), a self-supervised convolutional neural network designed to discover and represent tissue architectural motifs from multi-sample spatial transcriptomic datasets. TissueMosaic further links these motifs to gene expression, enabling the study of how changes in tissue structure impact cell-intrinsic function. TissueMosaic increases the signal-to-noise ratio of spatial differential expression analysis through a motif enrichment strategy, resulting in more reliable detection of genes that covary with tissue structure changes. Here, we demonstrate that TissueMosaic learns representations that outperform neighborhood cell-type composition baselines and existing methods on downstream tasks. These findings underscore the potential of self-supervised learning to advance spatial transcriptomics discovery.

Flexible and robust cell-type annotation for highly multiplexed tissue images.
IF 7.7 Pub Date : 2025-09-17 Epub Date: 2025-09-08 DOI: 10.1016/j.cels.2025.101374
Huangqingbo Sun, Shiqiu Yu, Anna Martinez Casals, Anna Bäckström, Yuxin Lu, Cecilia Lindskog, Matthew Ruffalo, Emma Lundberg, Robert F Murphy

Identifying cell types in highly multiplexed images is essential for understanding tissue spatial organization. Current cell-type annotation methods often rely on extensive reference images and manual adjustments. In this work, we present a tool, the Robust Image-Based Cell Annotator (RIBCA), that enables accurate, automated, unbiased, and fine-grained cell-type annotation for images with a wide range of antibody panels without requiring additional model training or human intervention. Our tool has successfully annotated over 3 million cells, revealing the spatial organization of various cell types across more than 40 different human tissues. It is open source and features a modular design, allowing for easy extension to additional cell types.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12728825/pdf/
Microbial bellwether: Community-scale metabolic modeling to predict infection.
IF 7.7 Pub Date : 2025-08-20 DOI: 10.1016/j.cels.2025.101370
Connor Tiffany, Joseph P Zackular

Microbial colonization is shaped by a complex network of interactions that influence both commensals and pathogens, including the public-health threat Clostridioides difficile. In this issue of Cell Systems, Carr et al. present a community-scale modeling framework for predicting colonization and metabolism, offering new insights into C. difficile infection.

Volume 16, issue 8, article 101370.
Assessing generative model coverage of protein structures with SHAPES.
IF 7.7 Pub Date : 2025-08-20 Epub Date: 2025-07-29 DOI: 10.1016/j.cels.2025.101347
Tianyu Lu, Melissa Liu, Yilin Chen, Jinho Kim, Po-Ssu Huang

Recent advances in generative modeling enable efficient sampling of protein structures, but their tendency to optimize for designability imposes a bias toward idealized structures at the expense of loops and other complex structural motifs that are critical for function. We introduce SHAPES (structural and hierarchical assessment of proteins with embedding similarity) to evaluate five state-of-the-art generative models of protein structures. Using structural embeddings across multiple structural hierarchies, ranging from local geometries to global protein architectures, we reveal substantial undersampling of the observed protein structure space by these models. We use Fréchet protein distance (FPD) to quantify distributional coverage. Different models are distinct in their coverage behavior across different sampling noise scales and temperatures. The frequency of tertiary motifs (TERMs) further supports the observations. More robust sequence design and structure prediction methods are likely crucial in guiding the development of models with improved coverage of the designable protein space. A record of this paper's transparent peer review process is included in the supplemental information.
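
As a rough illustration of how a Fréchet-style distance between two sets of embeddings can be computed, the sketch below uses the standard Gaussian form, d² = ||μ_a − μ_b||² + Tr(Σ_a + Σ_b − 2(Σ_a Σ_b)^{1/2}). Whether the paper's Fréchet protein distance uses exactly this form is an assumption; the function name and the toy data are hypothetical.

```python
import numpy as np


def frechet_distance(emb_a, emb_b):
    """Squared Fréchet distance between Gaussians fitted to two embedding sets.

    Each input is an (n_samples, n_dims) array. This is the generic Gaussian
    formula, assumed here for illustration only.
    """
    a = np.asarray(emb_a, dtype=float)
    b = np.asarray(emb_b, dtype=float)
    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(b, rowvar=False))
    # Tr((cov_a cov_b)^{1/2}) is the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite inputs (clipping guards against round-off).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Toy embeddings: a reference set and a copy shifted by 3 along the first axis.
rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 4))
shifted = reference + np.array([3.0, 0.0, 0.0, 0.0])
d_same = frechet_distance(reference, reference)
d_shift = frechet_distance(reference, shifted)
```

In this toy case the two sets share a covariance, so the distance reduces to the squared shift of the means; a generated set that undersamples part of the reference distribution would instead differ in the covariance term as well.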

{"title":"Assessing generative model coverage of protein structures with SHAPES.","authors":"Tianyu Lu, Melissa Liu, Yilin Chen, Jinho Kim, Po-Ssu Huang","doi":"10.1016/j.cels.2025.101347","DOIUrl":"10.1016/j.cels.2025.101347","url":null,"abstract":"<p><p>Recent advances in generative modeling enable efficient sampling of protein structures, but their tendency to optimize for designability imposes a bias toward idealized structures at the expense of loops and other complex structural motifs that are critical for function. We introduce SHAPES (structural and hierarchical assessment of proteins with embedding similarity) to evaluate five state-of-the-art generative models of protein structures. Using structural embeddings across multiple structural hierarchies, ranging from local geometries to global protein architectures, we reveal substantial undersampling of the observed protein structure space by these models. We use Fréchet protein distance (FPD) to quantify distributional coverage. Different models are distinct in their coverage behavior across different sampling noise scales and temperatures. The frequency of tertiary motifs (TERMs) further supports the observations. More robust sequence design and structure prediction methods are likely crucial in guiding the development of models with improved coverage of the designable protein space. 
A record of this paper's transparent peer review process is included in the supplemental information.</p>","PeriodicalId":93929,"journal":{"name":"Cell systems","volume":" ","pages":"101347"},"PeriodicalIF":7.7,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12321228/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144755440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
T cell receptor cross-reactivity prediction improved by a comprehensive mutational scan database.
IF 7.7 Pub Date: 2025-08-20 Epub Date: 2025-07-25 DOI: 10.1016/j.cels.2025.101345
Amitava Banerjee, David J Pattinson, Cornelia L Wincek, Paul Bunk, Armend Axhemi, Sarah R Chapin, Saket Navlakha, Hannah V Meyer

Comprehensively mapping all targets of a T cell receptor (TCR) is important for predicting pathogenic escape and off-target effects of TCR therapies. However, this mapping has been challenging due to lack of unbiased benchmarking datasets and computational methods sensitive to small-peptide mutations. To address this, we curated the benchmark for activation of T cells with cross-reactive avidity for epitopes (BATCAVE) database, encompassing near-complete single-amino-acid mutational assays, centered around 25 immunogenic epitopes, across both major histocompatibility complex classes, against 151 human and mouse TCRs, containing 22,000+ TCR-peptide pairs in total. We then introduce Bayesian inference of activation of TCR by mutant antigens (BATMAN), an interpretable Bayesian model, trained on BATCAVE, for predicting the peptides that activate a TCR, and an active learning extension, which efficiently maps targets of a novel TCR by selecting a few peptides to assay. We show that BATMAN outperforms existing methods, reveals structural and biochemical predictors of TCR-peptide interactions, and can predict polyclonal T cell responses and TCR targets with high sequence dissimilarity. A record of this paper's transparent peer review process is included in the supplemental information.

Citations: 0