Pub Date: 2024-04-05; DOI: 10.1038/s41540-024-00364-2
Kevin O’Leary, Deyou Zheng
By profiling gene expression in individual cells, single-cell RNA-sequencing (scRNA-seq) can resolve cellular heterogeneity and cell-type gene expression dynamics. Its application to time-series samples can identify temporal gene programs active in different cell types, for example, immune cells’ responses to viral infection. However, current scRNA-seq analysis has limitations. One is the low number of genes detected per cell. The second is insufficient replicates (often 1-2) due to high experimental cost. The third lies in the data analysis—treating individual cells as independent measurements leads to inflated statistics. To address these, we explore a new computational framework, specifically whether “metacells” constructed to maintain cellular heterogeneity within individual cell types (or clusters) can be used as “replicates” for increasing statistical rigor. Toward this, we applied SEACells to a time-series scRNA-seq dataset from peripheral blood mononuclear cells (PBMCs) after SARS-CoV-2 infection to construct metacells, and used them in maSigPro for quadratic regression to find significantly differentially expressed genes (DEGs) over time, followed by clustering expression velocity trends. We showed that such metacells retained greater expression variances and produced more biologically meaningful DEGs compared to either metacells generated randomly or from simple pseudobulk methods. More specifically, this approach correctly identified the known ISG15 interferon response program in almost all PBMC cell types and many DEGs enriched in the previously defined SARS-CoV-2 infection response pathway. It also uncovered additional and more cell type-specific temporal gene expression programs. Overall, our results demonstrate that the metacell-pseudoreplicate strategy could potentially overcome the limitation of 1-2 replicates.
Title: Metacell-based differential expression analysis identifies cell type specific temporal gene response programs in COVID-19 patient PBMCs
Journal: NPJ Systems Biology and Applications
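The quadratic time-course regression mentioned in the abstract above can be illustrated with a small sketch. This is not maSigPro (an R package) and not the authors' pipeline; it is a plain least-squares stand-in, assuming each metacell contributes one expression value per gene at its time point. The function names `solve_linear` and `quadratic_time_fit` are hypothetical, and the F statistic simply compares the quadratic model against a flat (intercept-only) model.

```python
def solve_linear(a, b):
    """Solve a small dense system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def quadratic_time_fit(times, values):
    """Fit values ~ b0 + b1*t + b2*t^2 for one gene.

    Returns (coefficients, F statistic against an intercept-only model).
    Assumes more than three observations and at least three distinct time points.
    """
    n = len(times)
    design = [[1.0, t, t * t] for t in times]
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(design, values)) for i in range(3)]
    coef = solve_linear(xtx, xty)
    fitted = [sum(c * x for c, x in zip(coef, row)) for row in design]
    rss_full = sum((y - f) ** 2 for y, f in zip(values, fitted))
    mean = sum(values) / n
    rss_flat = sum((y - mean) ** 2 for y in values)
    if rss_full == 0.0:
        return coef, float("inf")  # perfect fit: the F statistic is unbounded
    f_stat = ((rss_flat - rss_full) / 2) / (rss_full / (n - 3))
    return coef, f_stat


# Toy data: three pseudoreplicates (metacells) at each of four time points.
times = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
expression = [0.9, 1.0, 1.1, 5.9, 6.0, 6.1, 16.9, 17.0, 17.1, 33.9, 34.0, 34.1]
coefficients, f_statistic = quadratic_time_fit(times, expression)
```

The point of the sketch is that the residual degrees of freedom (`n - 3`) come from the number of pseudoreplicates, which is why the choice of what counts as a replicate directly affects the resulting statistics.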
Pub Date: 2024-04-02; DOI: 10.1038/s41540-024-00361-5
Reza Mousavi, Daniel Lobo
Gene regulatory mechanisms (GRMs) control the formation of spatial and temporal expression patterns that can serve as regulatory signals for the development of complex shapes. Synthetic developmental biology aims to engineer such genetic circuits for understanding and producing desired multicellular spatial patterns. However, designing synthetic GRMs for complex, multi-dimensional spatial patterns is a current challenge due to the nonlinear interactions and feedback loops in genetic circuits. Here we present a methodology to automatically design GRMs that can produce any given two-dimensional spatial pattern. The proposed approach uses two orthogonal morphogen gradients acting as positional information signals in a multicellular tissue area or culture, which constitutes a continuous field of engineered cells implementing the same designed GRM. To efficiently design both the circuit network and the interaction mechanisms—including the number of genes necessary for the formation of the target spatial pattern—we developed an automated algorithm based on high-performance evolutionary computation. The tolerance of the algorithm can be configured to design GRMs that are either simple to produce approximate patterns or complex to produce precise patterns. We demonstrate the approach by automatically designing GRMs that can produce a diverse set of synthetic spatial expression patterns by interpreting just two orthogonal morphogen gradients. The proposed framework offers a versatile approach to systematically design and discover complex genetic circuits producing spatial patterns.
Title: Automatic design of gene regulatory mechanisms for spatial pattern formation
Journal: NPJ Systems Biology and Applications
Pub Date: 2024-04-02; DOI: 10.1038/s41540-024-00360-6
Maxime Mahout, Ross P. Carlson, Laurent Simon, Sabine Peres
Minimal Cut Sets (MCSs) identify sets of reactions which, when removed from a metabolic network, disable certain cellular functions. The traditional search for MCSs within genome-scale metabolic models (GSMMs) targets cellular growth, identifies reaction sets resulting in a lethal phenotype if disrupted, and retrieves a list of corresponding gene, mRNA, or enzyme targets. Using the dual link between MCSs and Elementary Flux Modes (EFMs), our logic programming-based tool aspefm was able to compute MCSs of any size from GSMMs in acceptable run times. The tool performed better than mixed-integer linear programming methods when computing large MCSs. We applied the new MCS methodology to a medically relevant consortium model of two cross-feeding bacteria, Staphylococcus aureus and Pseudomonas aeruginosa. Constraints in aspefm were used to bias the computation of MCSs toward exchanged metabolites that could complement lethal phenotypes in individual species. We found that interspecies metabolite exchanges could play an essential role in rescuing single-species growth; for instance, inosine could complement lethal reaction knock-outs in the purine synthesis, glycolysis, and pentose phosphate pathways of both bacteria. Finally, MCSs were used to derive a list of promising enzyme targets for consortium-level therapeutic applications that cannot be circumvented via interspecies metabolite exchange.
Title: Logic programming-based Minimal Cut Sets reveal consortium-level therapeutic targets for chronic wound infections
Journal: NPJ Systems Biology and Applications
Pub Date: 2024-03-29; DOI: 10.1038/s41540-024-00357-1
Bevelynn Williams, Jamie Paterson, Helena J Rawsthorne-Manning, Polly-Anne Jeffrey, Joseph J Gillard, Grant Lythe, Thomas R Laws, Martín López-García
Protective antigen (PA) is a protein produced by Bacillus anthracis. It forms part of the anthrax toxin and is a key immunogen in US and UK anthrax vaccines. In this study, we have conducted experiments to quantify PA in the supernatants of cultures of B. anthracis Sterne strain, which is the strain used in the manufacture of the UK anthrax vaccine. Then, for the first time, we quantify PA production and degradation via mathematical modelling and Bayesian statistical techniques, making use of these new experimental data as well as two other independent published data sets. We propose a single mathematical model, in terms of delay differential equations (DDEs), which can explain the in vitro dynamics of all three data sets. Since we did not heat-activate the B. anthracis spores prior to inoculation, germination occurred much more slowly in our experiments, allowing us to calibrate two additional parameters with respect to the other data sets. Our model is able to distinguish between natural PA decay and that triggered by bacteria via proteases. There is promising consistency between the different independent data sets for most of the parameter estimates. The quantitative characterisation of B. anthracis PA production and degradation obtained here will contribute towards the ambition to include a realistic description of toxin dynamics, the host immune response, and anti-toxin treatments in future mechanistic models of anthrax infection.
Title: Quantifying in vitro B. anthracis growth and PA production and decay: a mathematical modelling approach.
Journal: NPJ Systems Biology and Applications
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10980772/pdf/
Pub Date: 2024-03-25; DOI: 10.1038/s41540-024-00352-6
Lisa Uechi, Swetha Vasudevan, Daniela Vilenski, Sergio Branciamore, David Frankhouser, Denis O'Meally, Soheil Meshinchi, Guido Marcucci, Ya-Huei Kuo, Russell Rockne, Nataly Kravchenko-Balasha
Acute myeloid leukemia (AML) is prevalent in both adult and pediatric patients. Despite advances in patient categorization, the heterogeneity of AML remains a challenge. Recent studies have explored the use of gene expression data to enhance AML diagnosis and prognosis; however, alternative approaches rooted in physics and chemistry may provide another level of insight into AML transformation. Utilizing publicly available databases, we analyze 884 human and mouse blood and bone marrow samples. We employ a personalized medicine strategy, combining state-transition theory and surprisal analysis, to assess the RNA transcriptome of individual patients. The transcriptome is transformed into physical parameters that represent each sample's steady state and the free energy change (FEC) from that steady state, which is the state with the lowest free energy. We found that the transcriptome steady state was invariant across normal and AML samples. FEC, representing active molecular processes, varied significantly between samples and was used to create patient-specific barcodes to characterize the biology of the disease. We discovered that AML samples that were in a transition state had the highest FEC. This disease state may be characterized as the most unstable and hence the most therapeutically targetable since a change in free energy is a thermodynamic requirement for disease progression. We also found that distinct sets of ongoing processes may be at the root of otherwise similar clinical phenotypes, implying that our integrated analysis of transcriptome profiles may facilitate a personalized medicine approach to cure AML and restore a steady state in each patient.
Title: Transcriptome free energy can serve as a dynamic patient-specific biomarker in acute myeloid leukemia.
Journal: NPJ Systems Biology and Applications
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10963775/pdf/
Pub Date: 2024-03-18; DOI: 10.1038/s41540-024-00355-3
Daniel C. Kirouac, Cole Zmurchok, Denise Morris
Engineered T cells have emerged as highly effective treatments for hematological cancers. Hundreds of clinical programs are underway in efforts to expand the efficacy, safety, and applications of this immuno-therapeutic modality. A primary challenge in developing these “living drugs” is the complexity of their pharmacology, as the drug product proliferates, differentiates, traffics between tissues, and evolves through interactions with patient immune systems. Using publicly available clinical data from Chimeric Antigen Receptor (CAR) T cells, we demonstrate how mathematical models can be used to quantify the relationships between product characteristics, patient physiology, pharmacokinetics and clinical outcomes. As scientists work to develop next-generation cell therapy products, mathematical models will be integral for contextualizing data and facilitating the translation of product designs to clinical strategy.
Title: Making drugs from T cells: The quantitative pharmacology of engineered T cell therapeutics
Journal: NPJ Systems Biology and Applications
Pub Date: 2024-03-16; DOI: 10.1038/s41540-024-00356-2
Eui Min Jeong, Jae Kyoung Kim
Ultrasensitive transcriptional switches enable sharp transitions between transcriptional on and off states and are essential for cells to respond to environmental cues with high fidelity. However, conventional switches, which rely on direct repressor-DNA binding, are extremely noise-sensitive, leading to unintended changes in gene expression. Here, through model simulations and analysis, we discovered that an alternative design combining three indirect transcriptional repression mechanisms (sequestration, blocking, and displacement) can generate a noise-resilient ultrasensitive switch. Although sequestration alone can generate an ultrasensitive switch, it remains sensitive to noise because the unintended transcriptional state induced by noise persists for long periods. However, by jointly utilizing blocking and displacement, these noise-induced transitions can be rapidly restored to the original transcriptional state. Because this transcriptional switch is effective in noisy cellular contexts, it goes beyond previous synthetic transcriptional switches, making it particularly valuable for robust synthetic system design. Our findings also provide insights into the evolution of robust ultrasensitive switches in cells. Specifically, the concurrent use of seemingly redundant indirect repression mechanisms in diverse biological systems appears to be a strategy to achieve noise-resilience of ultrasensitive switches.
Title: A robust ultrasensitive transcriptional switch in noisy cellular environments.
Journal: NPJ Systems Biology and Applications
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10944533/pdf/
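The abstract above notes that sequestration alone can produce an ultrasensitive response. A minimal sketch of that general idea, assuming a generic 1:1 molecular-titration scheme (an activator bound and inactivated by a repressor with dissociation constant `kd`), is given below. It is not the authors' model, which also includes blocking and displacement and treats noise explicitly; the names `free_activator` and `log_sensitivity` are hypothetical. The free activator concentration follows from mass balance and the binding equilibrium, which reduce to a quadratic equation.

```python
import math


def free_activator(a_total, r_total, kd):
    """Free activator under tight 1:1 sequestration by a repressor.

    Solves A_free**2 + A_free*(r_total - a_total + kd) - kd*a_total = 0
    for the positive root.
    """
    x = a_total - r_total - kd
    return 0.5 * (x + math.sqrt(x * x + 4.0 * kd * a_total))


def log_sensitivity(a_total, r_total, kd, rel_step=1e-6):
    """Local log-log slope d ln(A_free) / d ln(a_total), by central difference."""
    h = a_total * rel_step
    up = free_activator(a_total + h, r_total, kd)
    down = free_activator(a_total - h, r_total, kd)
    slope = (up - down) / (2.0 * h)
    return slope * a_total / free_activator(a_total, r_total, kd)


# Illustrative (assumed) parameters: repressor total of 1, tight binding.
near_threshold = log_sensitivity(1.0, 1.0, 0.01)   # activator total equals repressor total
far_above = log_sensitivity(10.0, 1.0, 0.01)       # activator in large excess
```

A log-log slope well above 1 near the point where the activator total matches the repressor total is what "ultrasensitive" means here; far from that threshold the response is close to linear.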
Understanding the biological functions of proteins is of fundamental importance in modern biology. To represent protein function, Gene Ontology (GO), a controlled vocabulary, is frequently used because it is easy for computer programs to handle and avoids open-ended text interpretation. In particular, the majority of current protein function prediction methods rely on GO terms. However, the extensive list of GO terms that describe a protein function can pose challenges for biologists when it comes to interpretation. In response to this issue, we developed GO2Sum (Gene Ontology terms Summarizer), a model that takes a set of GO terms as input and generates a human-readable summary using the T5 large language model. GO2Sum was developed by fine-tuning T5 on GO term assignments and free-text function descriptions for UniProt entries, enabling it to recreate function descriptions by concatenating GO term descriptions. Our results demonstrated that GO2Sum significantly outperforms the original T5 model that was trained on the entire web corpus in generating Function, Subunit Structure, and Pathway paragraphs for UniProt entries.
Title: GO2Sum: generating human-readable functional summary of proteins from GO terms.
Authors: Swagarika Jaharlal Giri, Nabil Ibtehaz, Daisuke Kihara
Pub Date: 2024-03-15; DOI: 10.1038/s41540-024-00358-0
Journal: NPJ Systems Biology and Applications
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10943200/pdf/
Pub Date : 2024-03-09 DOI: 10.1038/s41540-024-00350-8
Amy J Osborne, Agnieszka Bierzynska, Elizabeth Colby, Uwe Andag, Philip A Kalra, Olivier Radresa, Philipp Skroblin, Maarten W Taal, Gavin I Welsh, Moin A Saleem, Colin Campbell
Chronic kidney diseases (CKD) have genetic associations with kidney function. Univariate genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with estimated glomerular filtration rate (eGFR) and blood urea nitrogen (BUN), two complementary kidney function markers. However, it is unknown whether additional SNPs for kidney function can be identified by multivariate statistical analysis. To address this, we applied canonical correlation analysis (CCA), a multivariate method, to two individual-level CKD genotype datasets, and metaCCA to two published GWAS summary statistics datasets. With high replication rates, we recovered SNPs previously associated with kidney function by published univariate GWASs, validating the metaCCA method. We then extended discovery and identified previously unreported lead SNPs jointly associated with both kidney function markers. These showed expression quantitative trait loci (eQTL) colocalisation with genes having significant differential expression between CKD and healthy individuals. Several of these identified lead missense SNPs were predicted to have a functional impact, including in SLC14A2. We also identified previously unreported lead SNPs that showed significant joint correlation with both kidney function markers in the European ancestry CKDGen, National Unified Renal Translational Research Enterprise (NURTuRE)-CKD and Salford Kidney Study (SKS) datasets. Of these, rs3094060 colocalised with FLOT1 gene expression and was significantly more common in CKD cases in both NURTuRE-CKD and SKS than in the general population. Overall, using multivariate analysis by CCA, we identified additional SNPs and genes for both kidney function and CKD that can be prioritised for further CKD analyses.
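To make the multivariate idea concrete, the following is a minimal sketch rather than the authors' pipeline or the metaCCA software. It assumes the simplest case of one SNP tested against two traits, where the first canonical correlation reduces to the multiple correlation computed from three pairwise Pearson correlations. The function names and the toy genotype and trait values are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def canonical_corr_one_snp(genotype, trait1, trait2):
    """Canonical correlation between one SNP and a pair of traits.

    With a single genotype variable, CCA reduces to the multiple
    correlation: r^2 = (r1^2 - 2*rho*r1*r2 + r2^2) / (1 - rho^2),
    where r1, r2 are the SNP-trait correlations and rho is the
    trait-trait correlation (assumed not to be exactly +1 or -1).
    """
    r1 = pearson(genotype, trait1)
    r2 = pearson(genotype, trait2)
    rho = pearson(trait1, trait2)
    r_sq = (r1 * r1 - 2 * rho * r1 * r2 + r2 * r2) / (1 - rho * rho)
    return math.sqrt(max(r_sq, 0.0))  # clamp tiny negative rounding error

if __name__ == "__main__":
    # Hypothetical toy data: allele counts and two standardised markers.
    snp = [0, 1, 2, 0, 1, 2, 0, 1]
    egfr = [1.2, 0.4, -0.9, 0.8, 0.1, -1.1, 1.0, -0.2]
    bun = [-0.5, 0.3, 0.2, -0.9, 0.6, 1.0, -0.1, -0.4]
    print("joint canonical correlation:", canonical_corr_one_snp(snp, egfr, bun))
    print("univariate |r| (eGFR, BUN):", abs(pearson(snp, egfr)), abs(pearson(snp, bun)))
```

The joint correlation is never smaller than either univariate correlation, which illustrates why a multivariate test can flag SNPs that neither single-trait analysis detects on its own.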
{"title":"Multivariate canonical correlation analysis identifies additional genetic variants for chronic kidney disease.","authors":"Amy J Osborne, Agnieszka Bierzynska, Elizabeth Colby, Uwe Andag, Philip A Kalra, Olivier Radresa, Philipp Skroblin, Maarten W Taal, Gavin I Welsh, Moin A Saleem, Colin Campbell","doi":"10.1038/s41540-024-00350-8","DOIUrl":"10.1038/s41540-024-00350-8","url":null,"abstract":"<p><p>Chronic kidney diseases (CKD) have genetic associations with kidney function. Univariate genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with estimated glomerular filtration rate (eGFR) and blood urea nitrogen (BUN), two complementary kidney function markers. However, it is unknown whether additional SNPs for kidney function can be identified by multivariate statistical analysis. To address this, we applied canonical correlation analysis (CCA), a multivariate method, to two individual-level CKD genotype datasets, and metaCCA to two published GWAS summary statistics datasets. We identified SNPs previously associated with kidney function by published univariate GWASs with high replication rates, validating the metaCCA method. We then extended discovery and identified previously unreported lead SNPs for both kidney function markers, jointly. These showed expression quantitative trait loci (eQTL) colocalisation with genes having significant differential expression between CKD and healthy individuals. Several of these identified lead missense SNPs were predicted to have a functional impact, including in SLC14A2. We also identified previously unreported lead SNPs that showed significant correlation with both kidney function markers, jointly, in the European ancestry CKDGen, National Unified Renal Translational Research Enterprise (NURTuRE)-CKD and Salford Kidney Study (SKS) datasets. 
Of these, rs3094060 colocalised with FLOT1 gene expression and was significantly more common in CKD cases in both NURTURE-CKD and SKS, than in the general population. Overall, by using multivariate analysis by CCA, we identified additional SNPs and genes for both kidney function and CKD, that can be prioritised for further CKD analyses.</p>","PeriodicalId":19345,"journal":{"name":"NPJ Systems Biology and Applications","volume":null,"pages":null},"PeriodicalIF":4.0,"publicationDate":"2024-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10924093/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140065645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-03-08 DOI: 10.1038/s41540-024-00354-4
Juntan Liu, Chunhe Li
The evolution of cancer is a complex process characterized by stable states and transitions among them. Revealing the mechanisms of cancer progression from experimental data is therefore an important problem. In this study, we employ a data-driven energy landscape approach to analyze the dynamic evolution of cancer, taking kidney renal clear cell carcinoma (KIRC) as an example. From the energy landscape, we introduce two quantitative indicators (transition probability and barrier height) to study critical shifts in KIRC evolution, including cancer onset and progression, and to identify critical genes involved in these transitions. Our results identify crucial genes that either promote or inhibit these transitions in KIRC. We also conduct a comprehensive biological function analysis of these genes, validating the accuracy and reliability of our predictions. This work has implications for discovering new biomarkers, drug targets, and cancer treatment strategies in KIRC.
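The abstract does not define its indicators, so the following is only a sketch of how a barrier height can be read off a landscape. It assumes the common convention that the potential is U(x) = -ln P(x) for a steady-state probability density P, and it substitutes a one-dimensional two-component Gaussian mixture for the data-driven landscape. The state labels, weights and function names are hypothetical, and transition probability is not modelled here.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def potential(xs, weights, mus, sigmas):
    """Assumed landscape U(x) = -ln P(x), with P a Gaussian mixture."""
    return [
        -math.log(sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)))
        for x in xs
    ]

def local_minima(us):
    """Indices of interior grid points that sit at the bottom of a basin."""
    return [i for i in range(1, len(us) - 1) if us[i] < us[i - 1] and us[i] <= us[i + 1]]

def barrier_heights(us, i_a, i_b):
    """Barrier seen from basin a and from basin b.

    In one dimension the saddle is the highest point of U between the
    two basin minima; each barrier is U(saddle) minus U(basin minimum).
    """
    lo, hi = sorted((i_a, i_b))
    u_saddle = max(us[lo:hi + 1])
    return u_saddle - us[i_a], u_saddle - us[i_b]

if __name__ == "__main__":
    # Hypothetical two-state system: a shallower state near x = -2
    # and a deeper (more probable) state near x = +2.
    xs = [-4 + 0.01 * i for i in range(801)]
    us = potential(xs, [0.3, 0.7], [-2.0, 2.0], [0.5, 0.5])
    basin_a, basin_b = local_minima(us)
    leave_a, leave_b = barrier_heights(us, basin_a, basin_b)
    print("barrier to leave the shallow state:", leave_a)
    print("barrier to leave the deep state:", leave_b)
```

Under this convention a more probable state has a lower minimum and therefore a higher barrier to leave. A perturbation that changes a barrier can then be read as either promoting or inhibiting the corresponding transition.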
{"title":"Data-driven energy landscape reveals critical genes in cancer progression.","authors":"Juntan Liu, Chunhe Li","doi":"10.1038/s41540-024-00354-4","DOIUrl":"10.1038/s41540-024-00354-4","url":null,"abstract":"<p><p>The evolution of cancer is a complex process characterized by stable states and transitions among them. Studying the dynamic evolution of cancer and revealing the mechanisms of cancer progression based on experimental data is an important topic. In this study, we aim to employ a data-driven energy landscape approach to analyze the dynamic evolution of cancer. We take Kidney renal clear cell carcinoma (KIRC) as an example. From the energy landscape, we introduce two quantitative indicators (transition probability and barrier height) to study critical shifts in KIRC cancer evolution, including cancer onset and progression, and identify critical genes involved in these transitions. Our results successfully identify crucial genes that either promote or inhibit these transition processes in KIRC. We also conduct a comprehensive biological function analysis on these genes, validating the accuracy and reliability of our predictions. This work has implications for discovering new biomarkers, drug targets, and cancer treatment strategies in KIRC.</p>","PeriodicalId":19345,"journal":{"name":"NPJ Systems Biology and Applications","volume":null,"pages":null},"PeriodicalIF":4.0,"publicationDate":"2024-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10923824/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140065621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}