Pub Date: 2025-09-17 | DOI: 10.1016/j.cels.2025.101401
Lindsay G Cowell, Scott Christley, Felix Breden, Kevin A Burns, Brian D Corrie, William D Lees, James A Overton, Bjoern Peters, Eve Richardson, Krishna M Roskin, Lonneke Scheffer, Randi Vita, Corey T Watson, Gur Yaari
Sequencing data elucidating adaptive immune receptor (AIR) repertoires, genomic loci encoding AIR genes, and AIR antigens are hosted in siloed repositories, limiting integrative analyses. We are in the process of creating the AIR Repertoire Knowledge Commons (AKC) by merging data from community-backed repositories and applying knowledge-generation algorithms to integrate and enrich data and metadata.
Title: The Adaptive Immune Receptor Repertoire Knowledge Commons: An invitation to the community.
Journal: Cell Systems 16(9):101401
Pub Date: 2025-09-17 | Epub Date: 2025-09-10 | DOI: 10.1016/j.cels.2025.101396
Frederik Post, Annika Hausmann, Sonja Kabatnik, Sophia Steigerwald, Alexandra Brand, Ditte L Clement, Jonathan Skov, Kadi Lõhmussaar, Hjalte L Larsen, Martti Maimets, Theresa L Boye, Gabriela Jez, Toshiro Sato, Casper Steenholdt, Florian Rosenberger, Andreas Mund, Ole H Nielsen, Kim B Jensen, Matthias Mann
Intestinal epithelial damage predisposes to disorders like inflammatory bowel disease (IBD), with organoid transplantation emerging as a potential treatment. However, it is not known how well organoids recapitulate in vivo intestinal epithelial cells (IECs). We employed deep visual proteomics (DVP), integrating AI-guided cell classification, laser microdissection, and ultra-high-sensitivity proteomics at the single-cell level to generate an in-depth proteome resource of IECs directly isolated from the human colon and organoids. While in vitro organoids display high proliferation and low functional signatures, xenotransplantation induces a remarkable shift toward an in vivo-like phenotype. We recapitulated this transition by modifying culture conditions. Our data provide a comprehensive spatial proteomics resource and validate xenotransplanted organoids as suitable models for studying human IEC behavior with unprecedented molecular detail and demonstrate their clinical potential for patients with IBD and other intestinal disorders. A record of this paper's transparent peer review process is included in the supplemental information.
Title: Deep visual proteomics reveals an in vivo-like phenotype of orthotopically transplanted human colon organoids.
Journal: Cell Systems, article 101396
Pub Date: 2025-09-17 | Epub Date: 2025-09-05 | DOI: 10.1016/j.cels.2025.101373
Grant Goldman, Prathamesh Chati, Vasilis Ntranos
Deep mutational scanning (DMS) experiments have been successfully leveraged to understand genotype to phenotype mapping. However, the overwhelming majority of DMS have focused on amino acid substitutions. Thus, it remains unclear how indels differentially shape the fitness landscape relative to substitutions. To further our understanding of the relationship between substitutions and deletions, we leveraged a protein language model to analyze every single amino acid deletion in the human proteome. We discovered hundreds of thousands of sites that display opposing behavior for deletions versus substitutions: sites that can tolerate being substituted but not deleted or vice versa. We identified secondary structural elements and sequence context to be important mediators of differential tolerance. Our results underscore the value of deletion-substitution comparisons at the genome-wide scale, provide novel insights into how substitutions could systematically differ from deletions, and showcase the power of protein language models to generate biological hypotheses in silico.
Title: Uncovering differential tolerance to deletions versus substitutions with a protein language model.
Journal: Cell Systems, article 101373
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823221/pdf/
Pub Date: 2025-09-17 | Epub Date: 2025-09-10 | DOI: 10.1016/j.cels.2025.101387
Francesca-Zhoufan Li, Jason Yang, Kadina E Johnston, Emre Gürsoy, Yisong Yue, Frances H Arnold
Various machine learning-assisted directed evolution (MLDE) strategies have been shown to identify high-fitness protein variants more efficiently than typical directed evolution approaches. However, limited understanding of the factors influencing MLDE performance across diverse proteins has hindered optimal strategy selection for wet-lab campaigns. To address this, we systematically analyzed multiple MLDE strategies, including active learning and focused training using six distinct zero-shot predictors, across 16 diverse protein fitness landscapes. By quantifying landscape navigability with six attributes, we found that MLDE offers a greater advantage on landscapes that are more challenging for directed evolution, especially when focused training is combined with active learning. Despite varying levels of advantage across landscapes, focused training with zero-shot predictors leveraging distinct evolutionary, structural, and stability knowledge sources consistently outperforms random sampling for both binding interactions and enzyme activities. Our findings provide practical guidelines for selecting MLDE strategies for protein engineering. A record of this paper's transparent peer review process is included in the supplemental information.
Title: Evaluation of machine learning-assisted directed evolution across diverse combinatorial landscapes.
Journal: Cell Systems, article 101387
Deciphering cell state dynamics is crucial for understanding biological processes. Single-cell lineage-tracing technologies provide an effective way to track single-cell lineages by heritable DNA barcodes, but the high missing rates of lineage barcodes and intra-clonal heterogeneity pose major challenges for dissecting the mechanisms of cell fate decisions. Here, we systematically evaluate the features of single-cell lineage-tracing data and then develop an algorithm, scTrace+, to enhance the cell dynamic traces by incorporating multi-faceted transcriptomic similarities into lineage relationships via a kernelized probabilistic matrix factorization model. We assess its feasibility and performance by conducting ablation and benchmarking experiments on multiple real datasets and show that scTrace+ can accurately predict the fates of cells. Further, scTrace+ effectively identifies important driver genes implicated in cellular fate decisions of diverse biological processes, such as cell differentiation or tumor drug responses. A record of this paper's transparent peer review process is included in the supplemental information.
Title: scTrace+: Enhancing cell fate inference by integrating the lineage-tracing and multi-faceted transcriptomic similarity information.
Authors: Wenbo Guo, Zeyu Chen, Xinqi Li, Jingmin Huang, Qifan Hu, Jin Gu
Pub Date: 2025-09-17 | DOI: 10.1016/j.cels.2025.101398
Journal: Cell Systems, article 101398
Pub Date: 2025-09-17 | Epub Date: 2025-08-22 | DOI: 10.1016/j.cels.2025.101371
Ruoxi Zhang, Ben Ma, Gang Xu, Jianpeng Ma
Protein language models (PLMs), such as the highly successful ESM-2, have proven particularly effective. However, language models designed for RNA continue to face challenges. A key question is as follows: can the information derived from PLMs be harnessed and transferred to RNA? To investigate this, a model termed ProtRNA has been developed by a cross-modality transfer learning strategy for addressing the challenges posed by RNA's limited and less conserved sequences. By leveraging the evolutionary and physicochemical information encoded in protein sequences, the ESM-2 model is adapted to processing "low-resource" RNA sequence data. The results show comparable or superior performance in various RNA downstream tasks, with only 1/8 the trainable parameters and 1/6 the training data employed by the primary reference baseline RNA language model. This approach highlights the potential of cross-modality transfer learning in biological language models.
Title: ProtRNA: A protein-derived RNA language model by cross-modality transfer learning.
Journal: Cell Systems, article 101371
Pub Date: 2025-09-17 | Epub Date: 2025-08-13 | DOI: 10.1016/j.cels.2025.101368
Milind Jagota, Chloe Hsu, Thomas Mazumder, Kevin Sung, William S DeWitt, Jennifer Listgarten, Frederick A Matsen IV, Chun Jimmie Ye, Yun S Song
Although antibody sequences are highly diverse, they are constrained by requirements for expression and limited off-target reactivity. Describing which sequences violate such constraints has proven to be difficult. Here, we introduce a machine-learning framework to leverage a previously underutilized source of data for this problem. We use human single-cell sequencing data to find instances of allelic inclusion, a rare event where B cells express two different antibody light chains as mRNA. Previous studies suggest that one of these chains is either autoreactive or non-expressing as protein. We train machine-learning models to identify abnormal sequences associated with allelic inclusion. The resulting models generalize to predict antibody properties including polyreactivity, surface expression, and mutation usage, outperforming methods that do not use allelic inclusion data. We also investigate similar selection forces on the heavy chain in mice and observe that surrogate light-chain pairing has a large impact on heavy-chain diversity.
Title: Learning antibody sequence constraints from allelic inclusion.
Journal: Cell Systems, article 101368
Pub Date: 2025-09-17 | Epub Date: 2025-09-10 | DOI: 10.1016/j.cels.2025.101397
Daniel Martinez-Martinez, Tanara V Peres, Kristin Gehling, Leonor Quintaneiro, Cecilia Cabrera, Maksym Cherevatenko, Stephen J Cutty, Lena Best, Georgios Marinos, Johannes Zimmerman, Ayesha Safoor, Despoina Chrysostomou, Joao B Mokochinski, Alex Montoya, Susanne Brodesser, Michalina Zatorska, Timothy Scott, Ivan Andrew, Holger Kramer, Masuma Begum, Bian Zhang, Bernard T Golding, Julian R Marchesi, Susumu Hirabayashi, Christoph Kaleta, Alexis R Barr, Christian Frezza, Helena M Cochemé, Filipe Cabreiro
Understanding how the microbiota produces regulatory metabolites is of significance for cancer and cancer therapy. Using a host-microbe-drug-nutrient 4-way screening approach, we evaluated the role of nutrition at the molecular level in the context of 5-fluorouracil toxicity. Notably, our screens identified the metabolite 2-methylisocitrate, which was found to be produced and enriched in human tumor-associated microbiota. 2-methylisocitrate exhibits anti-proliferative properties across genetically and tissue-diverse cancer cell lines, three-dimensional (3D) spheroids, and an in vivo Drosophila gut tumor model, where it reduced tumor dissemination and increased survival. Chemical landscape interaction screens identified drug-metabolite signatures and highlighted the synergy between 5-fluorouracil and 2-methylisocitrate. Multi-omic analyses revealed that 2-methylisocitrate acts via multiple cellular pathways linking metabolism and DNA damage to regulate chemotherapy. Finally, we converted 2-methylisocitrate into its trimethyl ester, thereby enhancing its potency. This work highlights the great impact of microbiome-derived metabolites on tumor proliferation and their potential as promising co-adjuvants for cancer treatment.
Title: Chemotherapy modulation by a cancer-associated microbiota metabolite.
Journal: Cell Systems, article 101397
Spatial transcriptomics allows for the measurement of gene expression within the native tissue context. However, despite technological advancements, computational methods to link cell states with their microenvironment and compare these relationships across samples and conditions remain limited. To address this, we introduce Tissue Motif-Based Spatial Inference across Conditions (TissueMosaic), a self-supervised convolutional neural network designed to discover and represent tissue architectural motifs from multi-sample spatial transcriptomic datasets. TissueMosaic further links these motifs to gene expression, enabling the study of how changes in tissue structure impact cell-intrinsic function. TissueMosaic increases the signal-to-noise ratio of spatial differential expression analysis through a motif enrichment strategy, resulting in more reliable detection of genes that covary with tissue structure changes. Here, we demonstrate that TissueMosaic learns representations that outperform neighborhood cell-type composition baselines and existing methods on downstream tasks. These findings underscore the potential of self-supervised learning to advance spatial transcriptomics discovery.
Title: TissueMosaic: Self-supervised learning of tissue representations enables differential spatial transcriptomics across samples.
Authors: Sandeep Kambhampati, Luca D'Alessio, Fedor Grab, Stephen Fleming, Sophia Liu, Ruth Raichur, Fei Chen, Mehrtash Babadi
Pub Date: 2025-09-17 | DOI: 10.1016/j.cels.2025.101394
Journal: Cell Systems, article 101394
Pub Date: 2025-09-17 | Epub Date: 2025-09-08 | DOI: 10.1016/j.cels.2025.101374
Huangqingbo Sun, Shiqiu Yu, Anna Martinez Casals, Anna Bäckström, Yuxin Lu, Cecilia Lindskog, Matthew Ruffalo, Emma Lundberg, Robert F Murphy
Identifying cell types in highly multiplexed images is essential for understanding tissue spatial organization. Current cell-type annotation methods often rely on extensive reference images and manual adjustments. In this work, we present a tool, the Robust Image-Based Cell Annotator (RIBCA), that enables accurate, automated, unbiased, and fine-grained cell-type annotation for images with a wide range of antibody panels without requiring additional model training or human intervention. Our tool has successfully annotated over 3 million cells, revealing the spatial organization of various cell types across more than 40 different human tissues. It is open source and features a modular design, allowing for easy extension to additional cell types.
Title: Flexible and robust cell-type annotation for highly multiplexed tissue images.
Journal: Cell Systems, article 101374
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12728825/pdf/