Pub Date: 2024-10-16; Epub Date: 2024-10-04; DOI: 10.1016/j.cels.2024.09.004
Yiqi Huang, Christian Urban, Philipp Hubel, Alexey Stukalov, Andreas Pichlmair
The abundance of a protein is defined by its continuous synthesis and degradation, a process known as protein turnover. Here, we systematically profiled the turnover of proteins in influenza A virus (IAV)-infected cells using a pulse-chase stable isotope labeling by amino acids in cell culture (SILAC)-based approach combined with downstream statistical modeling. We identified 1,798 virus-affected proteins with turnover changes (tVAPs) out of 7,739 detected proteins (data available at pulsechase.innatelab.org). In particular, the affected proteins were involved in RNA transcription, splicing and nuclear transport, protein translation and stability, and energy metabolism. Many tVAPs appeared to be known IAV-interacting proteins that regulate virus propagation, such as KPNA6, PPP6C, and POLR2A. Notably, our analysis identified additional IAV host and restriction factors, such as the splicing factor GPKOW, that exhibit significant turnover rate changes while their total abundance is minimally affected. Overall, we show that protein turnover is a critical factor both for virus replication and antiviral defense.
{"title":"Protein turnover regulation is critical for influenza A virus infection.","authors":"Yiqi Huang, Christian Urban, Philipp Hubel, Alexey Stukalov, Andreas Pichlmair","doi":"10.1016/j.cels.2024.09.004","DOIUrl":"10.1016/j.cels.2024.09.004","url":null,"PeriodicalId":93929,"journal":{"name":"Cell systems","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142378731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
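The statement in the protein turnover abstract above, that turnover rates can change while total abundance stays nearly constant, follows from the standard first-order picture dP/dt = k_syn - k_deg * P, whose steady state is k_syn / k_deg. The sketch below is only an illustration of that picture, not the authors' statistical model: the rate values, the function names, and the simplification that the pre-labeled pool decays as exp(-k_deg * t) during the chase (ignoring dilution by cell growth) are all assumptions.

```python
import math


def steady_state_abundance(k_syn, k_deg):
    """Steady state of dP/dt = k_syn - k_deg * P."""
    return k_syn / k_deg


def labeled_fraction(k_deg, t):
    """Fraction of the pre-labeled pool left after chase time t (assumed first-order decay)."""
    return math.exp(-k_deg * t)


def fit_degradation_rate(times, fractions):
    """Least-squares slope through the origin of -ln(fraction) versus time."""
    numerator = sum(t * -math.log(f) for t, f in zip(times, fractions))
    denominator = sum(t * t for t in times)
    return numerator / denominator


# Hypothetical rates in arbitrary units: both rates double upon infection.
conditions = {
    "mock": {"k_syn": 10.0, "k_deg": 0.1},
    "infected": {"k_syn": 20.0, "k_deg": 0.2},
}
chase_times = [2.0, 4.0, 8.0, 16.0]

for name, rates in conditions.items():
    fractions = [labeled_fraction(rates["k_deg"], t) for t in chase_times]
    k_fit = fit_degradation_rate(chase_times, fractions)
    half_life = math.log(2) / k_fit
    abundance = steady_state_abundance(rates["k_syn"], rates["k_deg"])
    print(f"{name}: abundance={abundance:.1f}, fitted k_deg={k_fit:.3f}, half-life={half_life:.2f}")
```

In this toy setting both conditions reach the same steady-state abundance, yet the fitted half-life differs twofold, which is the kind of change an abundance-only measurement would miss.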
Pub Date: 2024-10-16; Epub Date: 2024-10-04; DOI: 10.1016/j.cels.2024.09.001
Alba Jiménez, Alessandra Lucchetti, Mathias S Heltberg, Liv Moretto, Carlos Sanchez, Ashwini Jambhekar, Mogens H Jensen, Galit Lahav
The tumor suppressor p53 responds to cellular stress and activates transcription programs critical for regulating cell fate. DNA damage triggers oscillations in p53 levels with a robust period. Guided by the theory of synchronization and entrainment, we developed a mathematical model and experimental system to test the ability of the p53 oscillator to entrain to external drug pulses of various periods and strengths. We found that the p53 oscillator can be locked and entrained to a wide range of entrainment modes. External periods far from p53's natural oscillations increased the heterogeneity between individual cells whereas stronger inputs reduced it. Single-cell measurements allowed deriving the phase response curves (PRCs) and multiple Arnold tongues of p53. In addition, multi-stability and non-linear behaviors were mathematically predicted and experimentally detected, including mode hopping, period doubling, and chaos. Our work revealed critical dynamical properties of the p53 oscillator and provided insights into understanding and controlling it. A record of this paper's transparent peer review process is included in the supplemental information.
{"title":"Entrainment and multi-stability of the p53 oscillator in human cells.","authors":"Alba Jiménez, Alessandra Lucchetti, Mathias S Heltberg, Liv Moretto, Carlos Sanchez, Ashwini Jambhekar, Mogens H Jensen, Galit Lahav","doi":"10.1016/j.cels.2024.09.001","DOIUrl":"10.1016/j.cels.2024.09.001","url":null,"PeriodicalId":93929,"journal":{"name":"Cell systems","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142378730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
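Entrainment and Arnold tongues, as mentioned in the p53 abstract above, can be illustrated with the sine circle map, a generic textbook model of a periodically forced oscillator. This is not the authors' p53 model: the map, the parameter names `omega` (ratio of external to natural period, up to convention) and `coupling` (input strength), and the chosen values are assumptions for illustration. The rotation number locks to a rational value inside each tongue.

```python
import math


def rotation_number(omega, coupling, n_transient=500, n_iter=2000):
    """Estimate the rotation number of the sine circle map
    theta -> theta + omega - coupling / (2*pi) * sin(2*pi*theta),
    iterated on the lift (no modulo), after discarding a transient."""

    def step(theta):
        return theta + omega - coupling / (2.0 * math.pi) * math.sin(2.0 * math.pi * theta)

    theta = 0.0
    for _ in range(n_transient):
        theta = step(theta)
    start = theta
    for _ in range(n_iter):
        theta = step(theta)
    return (theta - start) / n_iter


# Scan the frequency ratio at a fixed, hypothetical input strength.
# Plateaus in the output (for example near 0 and 1/2) are cuts through Arnold tongues.
for omega in [0.02, 0.05, 0.10, 0.30, 0.48, 0.50, 0.52]:
    print(f"omega={omega:.2f}  rotation number={rotation_number(omega, coupling=0.9):.3f}")
```

Without coupling the rotation number simply equals `omega`; with coupling, neighboring `omega` values collapse onto the same locked ratio, and the width of each plateau grows with input strength.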
Pub Date: 2024-10-16; Epub Date: 2024-10-08; DOI: 10.1016/j.cels.2024.09.006
Zander Harteveld, Alexandra Van Hall-Beauvais, Irina Morozova, Joshua Southern, Casper Goverde, Sandrine Georgeon, Stéphane Rosset, Michëal Defferrard, Andreas Loukas, Pierre Vandergheynst, Michael M Bronstein, Bruno E Correia
De novo protein design explores uncharted sequence and structure space to generate novel proteins not sampled by evolution. A main challenge in de novo design involves crafting "designable" structural templates to guide the sequence searches toward adopting target structures. We present a convolutional variational autoencoder that learns patterns of protein structure, dubbed Genesis. We coupled Genesis with trRosetta to design sequences for a set of protein folds and found that Genesis can reconstruct native-like distance and angle distributions for five native folds and, as a demonstration of generalizability, three novel, so-called "dark-matter" folds. We used a high-throughput assay to characterize the stability of the designs through protease resistance, obtaining encouraging success rates for folded proteins. Genesis enables exploration of the protein fold space within minutes, unrestricted by protein topologies. Our approach addresses the backbone designability problem, showing that small neural networks can efficiently learn structural patterns in proteins. A record of this paper's transparent peer review process is included in the supplemental information.
{"title":"Exploring \"dark-matter\" protein folds using deep learning.","authors":"Zander Harteveld, Alexandra Van Hall-Beauvais, Irina Morozova, Joshua Southern, Casper Goverde, Sandrine Georgeon, Stéphane Rosset, Michëal Defferrard, Andreas Loukas, Pierre Vandergheynst, Michael M Bronstein, Bruno E Correia","doi":"10.1016/j.cels.2024.09.006","DOIUrl":"10.1016/j.cels.2024.09.006","url":null,"PeriodicalId":93929,"journal":{"name":"Cell systems","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142395960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-18; Epub Date: 2024-09-06; DOI: 10.1016/j.cels.2024.08.005
Milton Pividori, Marylyn D Ritchie, Diego H Milone, Casey S Greene
Identifying meaningful patterns in data is crucial for understanding complex biological processes, particularly in transcriptomics, where genes with correlated expression often share functions or contribute to disease mechanisms. Traditional correlation coefficients, which primarily capture linear relationships, may overlook important nonlinear patterns. We introduce the clustermatch correlation coefficient (CCC), a not-only-linear coefficient that utilizes clustering to efficiently detect both linear and nonlinear associations. CCC outperforms standard methods by revealing biologically meaningful patterns that linear-only coefficients miss and is faster than state-of-the-art coefficients such as the maximal information coefficient. When applied to human gene expression data from genotype-tissue expression (GTEx), CCC identified robust linear relationships and nonlinear patterns, such as sex-specific differences, that are undetectable by standard methods. Highly ranked gene pairs were enriched for interactions in integrated networks built from protein-protein interactions, transcription factor regulation, and chemical and genetic perturbations, suggesting that CCC can detect functional relationships missed by linear-only approaches. CCC is a highly efficient, next-generation, not-only-linear correlation coefficient for genome-scale data. A record of this paper's transparent peer review process is included in the supplemental information.
{"title":"An efficient, not-only-linear correlation coefficient based on clustering.","authors":"Milton Pividori, Marylyn D Ritchie, Diego H Milone, Casey S Greene","doi":"10.1016/j.cels.2024.08.005","DOIUrl":"10.1016/j.cels.2024.08.005","url":null,"PeriodicalId":93929,"journal":{"name":"Cell systems","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142147119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
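The idea in the CCC abstract above, detecting both linear and nonlinear associations through clustering, can be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes that each variable is partitioned by quantiles for several cluster counts, that partitions are compared with the adjusted Rand index, and that the coefficient is the maximum over all pairs (floored at zero). The function names and the range of cluster counts are hypothetical choices.

```python
from collections import Counter
from math import comb, sqrt


def quantile_partition(values, k):
    """Assign each value to one of k roughly equal-sized rank-based clusters."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * k // n
    return labels


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two partitions given as label lists."""
    n = len(a)
    sum_ab = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:  # degenerate partitions carry no information
        return 0.0
    return (sum_ab - expected) / (max_index - expected)


def ccc_sketch(x, y, ks=range(2, 11)):
    """Maximum adjusted Rand index over all pairs of quantile partitions of x and y."""
    parts_x = [quantile_partition(x, k) for k in ks]
    parts_y = [quantile_partition(y, k) for k in ks]
    best = max(adjusted_rand_index(px, py) for px in parts_x for py in parts_y)
    return max(best, 0.0)


def pearson(x, y):
    """Plain Pearson correlation, for comparison."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var_x = sum((xi - mx) ** 2 for xi in x)
    var_y = sum((yi - my) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)


# A symmetric quadratic relationship: strongly dependent, but not linearly.
x = [i / 100 - 1 for i in range(201)]
y = [v * v for v in x]
print(f"Pearson: {pearson(x, y):.3f}  clustering-based sketch: {ccc_sketch(x, y):.3f}")
```

On the quadratic example the linear coefficient is close to zero while the clustering-based value is clearly positive, because a four-way split of x lines up with a two-way split of y.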
Pub Date: 2024-09-18; Epub Date: 2024-09-04; DOI: 10.1016/j.cels.2024.08.004
Andre Neil Forbes, Duo Xu, Sandra Cohen, Priya Pancholi, Ekta Khurana
Most cancer types lack targeted therapeutic options, and when first-line targeted therapies are available, treatment resistance is a huge challenge. Recent technological advances enable the use of assay for transposase-accessible chromatin with sequencing (ATAC-seq) and RNA sequencing (RNA-seq) on patient tissue in a high-throughput manner. Here, we present a computational approach that leverages these datasets to identify drug targets based on tumor lineage. We constructed gene regulatory networks for 371 patients of 22 cancer types using machine learning approaches trained with three-dimensional genomic data for enhancer-to-promoter contacts. Next, we identified the key transcription factors (TFs) in these networks, which are used to find therapeutic vulnerabilities, by direct targeting of either TFs or the proteins that they interact with. We validated four candidates identified for neuroendocrine, liver, and renal cancers, which have a dismal prognosis with current therapeutic options.
{"title":"Discovery of therapeutic targets in cancer using chromatin accessibility and transcriptomic data.","authors":"Andre Neil Forbes, Duo Xu, Sandra Cohen, Priya Pancholi, Ekta Khurana","doi":"10.1016/j.cels.2024.08.004","DOIUrl":"10.1016/j.cels.2024.08.004","url":null,"PeriodicalId":93929,"journal":{"name":"Cell systems","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11415227/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142142057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-18; DOI: 10.1016/j.cels.2024.08.007
Shenqi Fan, Liang Ma, Chengzhi Song, Xu Han, Bijunyao Zhong, Yihan Lin
The regulation of genes can be mathematically described by input-output functions that are typically assumed to be time invariant. This fundamental assumption underpins the design of synthetic gene circuits and the quantitative understanding of natural gene regulatory networks. Here, we found that this assumption is challenged in mammalian cells. We observed that a synthetic reporter gene can exhibit unexpected transcriptional memory, leading to a shift in the dose-response curve upon a second induction. Mechanistically, we investigated the cis-dependency of transcriptional memory, revealing the necessity of promoter DNA methylation in establishing memory. Furthermore, we showed that the synthetic transcription factor's effective DNA binding affinity underlies trans-dependency, which is associated with its capacity to undergo biomolecular condensation. These principles enabled modulating memory by perturbing either cis- or trans-regulation of genes. Together, our findings suggest the potential pervasiveness of transcriptional memory and indicate the need to model mammalian gene regulation with time-varying input-output functions. A record of this paper's transparent peer review process is included in the supplemental information.
{"title":"Promoter DNA methylation and transcription factor condensation are linked to transcriptional memory in mammalian cells.","authors":"Shenqi Fan, Liang Ma, Chengzhi Song, Xu Han, Bijunyao Zhong, Yihan Lin","doi":"10.1016/j.cels.2024.08.007","DOIUrl":"10.1016/j.cels.2024.08.007","url":null,"PeriodicalId":93929,"journal":{"name":"Cell systems","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142147121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
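The contrast in the transcriptional memory abstract above, between a time-invariant input-output function and one that depends on induction history, can be made concrete with a Hill-type dose-response curve. This is a hypothetical sketch, not the authors' model: the Hill form, the parameter names, the value of `memory_factor`, and the direction of the shift (a lower EC50 after a prior induction) are assumptions chosen only to show what a history-dependent function looks like.

```python
def hill_response(dose, ec50, hill_n=2.0):
    """Time-invariant input-output function: fraction of maximal expression at a given dose."""
    return dose ** hill_n / (ec50 ** hill_n + dose ** hill_n)


def response_with_memory(dose, ec50_naive, previously_induced, memory_factor=0.5, hill_n=2.0):
    """History-dependent variant: a prior induction rescales the EC50.

    memory_factor is a made-up illustrative value; a factor below 1 makes
    a previously induced cell respond at lower doses.
    """
    ec50 = ec50_naive * memory_factor if previously_induced else ec50_naive
    return hill_response(dose, ec50, hill_n)


# Same dose, different history, different output.
for dose in [0.25, 0.5, 1.0, 2.0]:
    first = response_with_memory(dose, ec50_naive=1.0, previously_induced=False)
    second = response_with_memory(dose, ec50_naive=1.0, previously_induced=True)
    print(f"dose={dose:.2f}  first induction={first:.3f}  second induction={second:.3f}")
```

A time-invariant model would return the same value in both columns; any systematic gap between them is what a shift in the dose-response curve means operationally.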
Pub Date: 2024-09-18; Epub Date: 2024-09-04; DOI: 10.1016/j.cels.2024.08.001
Elisa Gallo, Stefano De Renzis, James Sharpe, Roberto Mayor, Jonas Hartmann
The discovery of general principles underlying the complexity and diversity of cellular and developmental systems is a central and long-standing aim of biology. While new technologies collect data at an ever-accelerating rate, there is growing concern that conceptual progress is not keeping pace. We contend that this is due to a paucity of conceptual frameworks that support meaningful generalizations. This led us to develop the core and periphery (C&P) hypothesis, which posits that many biological systems can be decomposed into a highly versatile core with a large behavioral repertoire and a specific periphery that configures said core to perform one particular function. Versatile cores tend to be widely reused across biology, which confers generality to theories describing them. Here, we introduce this concept and describe examples at multiple scales, including Turing patterning, actomyosin dynamics, multi-cellular morphogenesis, and vertebrate gastrulation. We also sketch its evolutionary basis and discuss key implications and open questions. We propose that the C&P hypothesis could unlock new avenues of conceptual progress in mesoscale biology.
{"title":"Versatile system cores as a conceptual basis for generality in cell and developmental biology.","authors":"Elisa Gallo, Stefano De Renzis, James Sharpe, Roberto Mayor, Jonas Hartmann","doi":"10.1016/j.cels.2024.08.001","DOIUrl":"10.1016/j.cels.2024.08.001","url":null,"PeriodicalId":93929,"journal":{"name":"Cell systems","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142142059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-18; Epub Date: 2024-09-06; DOI: 10.1016/j.cels.2024.08.006
Ruoqiao Chen, Jiayu Zhou, Bin Chen
Cell surface proteins serve as primary drug targets and cell identity markers. Techniques such as CITE-seq (cellular indexing of transcriptomes and epitopes by sequencing) have enabled the simultaneous quantification of surface protein abundance and transcript expression within individual cells. The published data have been utilized to train machine learning models for predicting surface protein abundance solely from transcript expression. However, the small scale of proteins predicted and the poor generalization ability of these computational approaches across diverse contexts (e.g., different tissues/disease states) impede their widespread adoption. Here, we propose SPIDER (surface protein prediction using deep ensembles from single-cell RNA sequencing), a context-agnostic zero-shot deep ensemble model, which enables large-scale protein abundance prediction and generalizes better to various contexts. Comprehensive benchmarking shows that SPIDER outperforms other state-of-the-art methods. Using the predicted surface abundance of >2,500 proteins from single-cell transcriptomes, we demonstrate the broad applications of SPIDER, including cell type annotation, biomarker/target identification, and cell-cell interaction analysis in hepatocellular carcinoma and colorectal cancer. A record of this paper's transparent peer review process is included in the supplemental information.
{"title":"Imputing abundance of over 2,500 surface proteins from single-cell transcriptomes with context-agnostic zero-shot deep ensembles.","authors":"Ruoqiao Chen, Jiayu Zhou, Bin Chen","doi":"10.1016/j.cels.2024.08.006","DOIUrl":"10.1016/j.cels.2024.08.006","url":null,"PeriodicalId":93929,"journal":{"name":"Cell systems","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11423933/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142147120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-09-18; Epub Date: 2024-09-04; DOI: 10.1016/j.cels.2024.08.002
Chandana Gopalakrishnappa, Zeqian Li, Seppe Kuehn
Interactions between photosynthetic and heterotrophic microbes play a key role in global primary production. Understanding phototroph-heterotroph interactions remains challenging because these microbes reside in chemically complex environments. Here, we leverage a massively parallel droplet microfluidic platform that enables us to interrogate interactions between photosynthetic algae and heterotrophic bacteria in >100,000 communities across ∼525 environmental conditions with varying pH, carbon availability, and phosphorus availability. By developing a statistical framework to dissect interactions in this complex dataset, we reveal that the dependence of algae-bacteria interactions on nutrient availability is strongly modulated by pH and buffering capacity. Furthermore, we show that the chemical identity of the available organic carbon source controls how pH, buffering capacity, and nutrient availability modulate algae-bacteria interactions. Our study reveals the previously underappreciated role of pH in modulating phototroph-heterotroph interactions and provides a framework for thinking about interactions between phototrophs and heterotrophs in more natural contexts.
{"title":"Environmental modulators of algae-bacteria interactions at scale.","authors":"Chandana Gopalakrishnappa, Zeqian Li, Seppe Kuehn","doi":"10.1016/j.cels.2024.08.002","DOIUrl":"10.1016/j.cels.2024.08.002","url":null,"PeriodicalId":93929,"journal":{"name":"Cell systems","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11412779/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142142058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-08-21; DOI: 10.1016/j.cels.2024.07.004
Zhixin Cyrillus Tan, Aaron S Meyer
Recent biological studies have been revolutionized in scale and granularity by multiplex and high-throughput assays. Profiling cell responses across several experimental parameters, such as perturbations, time, and genetic contexts, leads to richer and more generalizable findings. However, these multidimensional datasets necessitate a reevaluation of the conventional methods for their representation and analysis. Traditionally, experimental parameters are merged to flatten the data into a two-dimensional matrix, sacrificing crucial experiment context reflected by the structure. As Marshall McLuhan famously stated, "the medium is the message." In this work, we propose that the experiment structure is the medium in which subsequent analysis is performed, and the optimal choice of data representation must reflect the experiment structure. We review how tensor-structured analyses and decompositions can preserve this information. We contend that tensor methods are poised to become integral to the biomedical data sciences toolkit.
{"title":"The structure is the message: Preserving experimental context through tensor decomposition.","authors":"Zhixin Cyrillus Tan, Aaron S Meyer","doi":"10.1016/j.cels.2024.07.004","DOIUrl":"10.1016/j.cels.2024.07.004","url":null,"PeriodicalId":93929,"journal":{"name":"Cell systems","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11366223/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142038014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
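The difference between flattening and a tensor-structured analysis, as described in the tensor decomposition abstract above, can be shown on a tiny three-way array indexed by (perturbation, time, cell line). The sketch below is a minimal illustration, not a method from the review: the data are synthetic, the axis meanings and function names are assumptions, and the decomposition is only a rank-1 canonical polyadic fit computed by alternating updates in plain Python.

```python
import math


def outer3(u, v, w):
    """Rank-1 three-way tensor with entries u[i] * v[j] * w[k]."""
    return [[[ui * vj * wk for wk in w] for vj in v] for ui in u]


def flatten_to_matrix(tensor):
    """Merge the last two axes: each row mixes all (time, cell line) pairs."""
    return [[x for row in slab for x in row] for slab in tensor]


def _normalize(vec):
    # Assumes a nonzero vector, which holds for the synthetic data below.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def rank1_cp(tensor, n_iter=20):
    """Rank-1 CP fit by alternating updates; returns one factor per axis."""
    n_i, n_j, n_k = len(tensor), len(tensor[0]), len(tensor[0][0])
    b, c = [1.0] * n_j, [1.0] * n_k
    for _ in range(n_iter):
        a = _normalize([sum(tensor[i][j][k] * b[j] * c[k]
                            for j in range(n_j) for k in range(n_k)) for i in range(n_i)])
        b = _normalize([sum(tensor[i][j][k] * a[i] * c[k]
                            for i in range(n_i) for k in range(n_k)) for j in range(n_j)])
        # The last factor is left unnormalized so that it carries the overall scale.
        c = [sum(tensor[i][j][k] * a[i] * b[j]
                 for i in range(n_i) for j in range(n_j)) for k in range(n_k)]
    return a, b, c


# Synthetic data: 2 perturbations x 3 time points x 2 cell lines (made-up profiles).
data = outer3([1.0, 2.0], [1.0, 0.5, 0.25], [3.0, 1.0])

matrix = flatten_to_matrix(data)
print(f"flattened shape: {len(matrix)} x {len(matrix[0])}")

perturbation, time_course, cell_line = rank1_cp(data)
print("time factor:", [round(x, 3) for x in time_course])
```

The flattened matrix has six undifferentiated columns, whereas the tensor fit returns a separate factor for each experimental axis, so the decaying time profile can be read off directly.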