Pub Date: 2024-03-01; Epub Date: 2024-01-30; DOI: 10.1038/s44320-024-00016-x
Jürgen Jänes, Pedro Beltrao
Proteins are the key molecular machines that orchestrate all biological processes of the cell. Most proteins fold into three-dimensional shapes that are critical for their function. Studying the 3D shape of proteins can inform us of the mechanisms that underlie biological processes in living cells and can have practical applications in the study of disease mutations or the discovery of novel drug treatments. Here, we review the progress made in sequence-based prediction of protein structures with a focus on applications that go beyond the prediction of single monomer structures. This includes the application of deep learning methods for the prediction of structures of protein complexes, different conformations, the evolution of protein structures and the application of these methods to protein design. These developments create new opportunities for research that will have impact across many areas of biomedical research.
Title: Deep learning for protein structure prediction and design-progress and applications. Journal: Molecular Systems Biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10912668/pdf/
Pub Date: 2024-02-01; Epub Date: 2023-12-19; DOI: 10.1038/s44320-023-00003-8
Mehdi Joodaki, Mina Shaigan, Victor Parra, Roman D Bülow, Christoph Kuppe, David L Hölscher, Mingbo Cheng, James S Nagai, Michaël Goedertier, Nassim Bouteldja, Vladimir Tesar, Jonathan Barratt, Ian Sd Roberts, Rosanna Coppo, Rafael Kramann, Peter Boor, Ivan G Costa
Although clinical applications represent the next challenge in single-cell genomics and digital pathology, we still lack computational methods to analyze single-cell or pathomics data to find sample-level trajectories or clusters associated with diseases. This remains challenging as single-cell/pathomics data are multi-scale, i.e., a sample is represented by clusters of cells/structures, and samples cannot be easily compared with each other. Here we propose PatIent Level analysis with Optimal Transport (PILOT). PILOT uses optimal transport to compute the Wasserstein distance between two individual single-cell samples. This allows us to perform unsupervised analysis at the sample level and uncover trajectories or cellular clusters associated with disease progression. We evaluate PILOT and competing approaches in single-cell genomics or pathomics studies involving various human diseases with up to 600 samples/patients and millions of cells or tissue structures. Our results demonstrate that PILOT detects disease-associated samples from large and complex single-cell or pathomics data. Moreover, PILOT provides a statistical approach to find changes in cell populations, gene expression, and tissue structures related to the trajectories or clusters supporting interpretation of predictions.
Title: Detection of PatIent-Level distances from single cell genomics and pathomics data with Optimal Transport (PILOT). Journal: Molecular Systems Biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10883279/pdf/
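To make the Wasserstein distance mentioned in the PILOT abstract concrete, here is a minimal sketch for the simplest possible case. It assumes each sample is summarised as proportions over the same ordered, unit-spaced bins, which is an illustrative simplification: the abstract does not say how PILOT defines the cost between cell clusters, and the function name `wasserstein_1d` and the example proportions are hypothetical.

```python
from itertools import accumulate


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two histograms on the same ordered bins.

    Assumption: bins lie on a line with unit spacing, so the distance is the
    sum of absolute differences between the two cumulative distributions.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - 1.0) > 1e-9 or abs(sum(q) - 1.0) > 1e-9:
        raise ValueError("each histogram must sum to 1")
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


# Hypothetical cluster proportions for two samples (not data from the paper).
sample_a = [0.5, 0.5, 0.0]
sample_b = [0.0, 0.5, 0.5]
distance = wasserstein_1d(sample_a, sample_b)
print(distance)
```

Computing this value for every pair of samples would give a sample-by-sample distance matrix, which is the kind of input that unsupervised clustering or trajectory methods can work from.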
Pub Date: 2024-02-01; Epub Date: 2024-01-05; DOI: 10.1038/s44320-023-00009-2
Nalini R Rao, Arun Upadhyay, Jeffrey N Savas
Efficient protein turnover is essential for cellular homeostasis and organ function. Loss of proteostasis is a hallmark of aging culminating in severe dysfunction of protein turnover. To investigate protein turnover dynamics as a function of age, we performed continuous in vivo metabolic stable isotope labeling in mice along the aging continuum. First, we discovered that the brain proteome uniquely undergoes dynamic turnover fluctuations during aging compared to heart and liver tissue. Second, trends in protein turnover in the brain proteome during aging showed sex-specific differences that were tightly tied to cellular compartments. Next, parallel analyses of the insoluble proteome revealed that several cellular compartments experience hampered turnover, in part due to misfolding. Finally, we found that age-associated fluctuations in proteasome activity were associated with the turnover of core proteolytic subunits, which was recapitulated by pharmacological suppression of proteasome activity. Taken together, our study provides a proteome-wide atlas of protein turnover across the aging continuum and reveals a link between the turnover of individual proteasome subunits and the age-associated decline in proteasome activity.
Title: Derailed protein turnover in the aging mammalian brain. Journal: Molecular Systems Biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10897147/pdf/
Pub Date: 2024-02-01; Epub Date: 2024-01-15; DOI: 10.1038/s44320-023-00006-5
Atefeh Lafzi, Costanza Borrelli, Simona Baghai Sain, Karsten Bach, Jonas A Kretz, Kristina Handler, Daniel Regan-Komito, Xenia Ficht, Andreas Frei, Andreas Moor
Sequencing-based spatial transcriptomics (ST) methods allow unbiased capturing of RNA molecules at barcoded spots, charting the distribution and localization of cell types and transcripts across a tissue. While the coarse resolution of these techniques is considered a disadvantage, we argue that the inherent proximity of transcriptomes captured on spots can be leveraged to reconstruct cellular networks. To this end, we developed ISCHIA (Identifying Spatial Co-occurrence in Healthy and InflAmed tissues), a computational framework to analyze the spatial co-occurrence of cell types and transcript species within spots. Co-occurrence analysis is complementary to differential gene expression, as it does not depend on the abundance of a given cell type or on the transcript expression levels, but rather on their spatial association in the tissue. We applied ISCHIA to analyze co-occurrence of cell types, ligands and receptors in a Visium dataset of human ulcerative colitis patients, and validated our findings at single-cell resolution on matched hybridization-based data. We uncover inflammation-induced cellular networks involving M cells and fibroblasts, as well as ligand-receptor interactions enriched in the inflamed human colon, and their associated gene signatures. Our results highlight the hypothesis-generating power and broad applicability of co-occurrence analysis on spatial transcriptomics data.
Title: Identifying Spatial Co-occurrence in Healthy and InflAmed tissues (ISCHIA). Journal: Molecular Systems Biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10897385/pdf/
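The idea of spot-level co-occurrence described in the ISCHIA abstract can be illustrated with a small sketch. It assumes each spot is reduced to the set of cell types (or transcripts) detected in it, and it scores a pair with a one-sided hypergeometric test against independence. That test is an assumption chosen for illustration, not necessarily the statistic ISCHIA uses, and the function name, spot contents and labels are hypothetical.

```python
from math import comb


def cooccurrence(spots, a, b):
    """Return (observed, expected, p_value) for labels a and b co-occurring in spots.

    expected is the count under independence; p_value is the hypergeometric
    probability of seeing at least the observed number of shared spots.
    """
    n = len(spots)
    n_a = sum(a in s for s in spots)
    n_b = sum(b in s for s in spots)
    observed = sum(a in s and b in s for s in spots)
    expected = n_a * n_b / n
    # Sum the upper tail: k shared spots out of the n_b spots containing b.
    tail = sum(
        comb(n_a, k) * comb(n - n_a, n_b - k)
        for k in range(observed, min(n_a, n_b) + 1)
    )
    return observed, expected, tail / comb(n, n_b)


# Hypothetical spots; each set lists the cell types detected in one spot.
spots = [{"M", "Fib"}, {"M", "Fib"}, {"M"}, {"Fib"}, {"T"}, {"T"}]
observed, expected, p_value = cooccurrence(spots, "M", "Fib")
print(observed, expected, p_value)
```

Because the score depends only on presence or absence within spots, it is unaffected by how abundant a cell type is overall, which matches the abstract's point that co-occurrence complements differential expression.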
Pub Date: 2024-02-01; Epub Date: 2024-01-15; DOI: 10.1038/s44320-023-00005-6
Chop Yan Lee, Dalmira Hubrich, Julia K Varga, Christian Schäfer, Mareen Welzel, Eric Schumbera, Milena Djokic, Joelle M Strom, Jonas Schönfeld, Johanna L Geist, Feyza Polat, Toby J Gibson, Claudia Isabelle Keller Valsecchi, Manjeet Kumar, Ora Schueler-Furman, Katja Luck
Structural resolution of protein interactions enables mechanistic and functional studies as well as interpretation of disease variants. However, structural data is still missing for most protein interactions because we lack computational and experimental tools at scale. This is particularly true for interactions mediated by short linear motifs occurring in disordered regions of proteins. We find that AlphaFold-Multimer predicts with high sensitivity but limited specificity structures of domain-motif interactions when using small protein fragments as input. Sensitivity decreased substantially when using long protein fragments or full-length proteins. We delineated a protein fragmentation strategy particularly suited for the prediction of domain-motif interfaces and applied it to interactions between human proteins associated with neurodevelopmental disorders. This enabled the prediction of highly confident and likely disease-related novel interfaces, which we further experimentally corroborated for FBXO23-STX1B, STX1B-VAMP2, ESRRG-PSMC5, PEX3-PEX19, PEX3-PEX16, and SNRPB-GIGYF1, providing novel molecular insights for diverse biological processes. Our work highlights exciting perspectives, but also reveals clear limitations and the need for future developments to maximize the power of AlphaFold-Multimer for interface predictions.
Title: Systematic discovery of protein interaction interfaces using AlphaFold and experimental validation. Journal: Molecular Systems Biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10883280/pdf/
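The abstract above reports that small fragments work better than full-length proteins as prediction input, but it does not describe the fragmentation strategy itself. The following is therefore only a generic sketch of one way to cut a sequence into overlapping fixed-size windows. The function name `fragment`, the window size, the step and the toy sequence are all hypothetical, and the authors' actual strategy may differ.

```python
def fragment(seq, size, step):
    """Split seq into overlapping windows, returned as (start, fragment) pairs.

    A final window is anchored at the end so the C-terminus is always covered.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if len(seq) <= size:
        return [(0, seq)]
    last_start = len(seq) - size
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s, seq[s:s + size]) for s in starts]


# Toy sequence, not a real protein.
fragments = fragment("ABCDEFGHIJK", size=4, step=3)
print(fragments)
```

Each fragment keeps its start offset so that a predicted interface can be mapped back to positions in the full-length sequence.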
Pub Date: 2024-01-01; Epub Date: 2023-12-18; DOI: 10.1038/s44320-023-00004-7
Chien-Yun Lee, Matthew The, Chen Meng, Florian P Bayer, Kerstin Putzker, Julian Müller, Johanna Streubel, Julia Woortman, Amirhossein Sakhteman, Moritz Resch, Annika Schneider, Stephanie Wilhelm, Bernhard Kuster
Kinase inhibitors (KIs) are important cancer drugs but often feature polypharmacology that is molecularly not understood. This disconnect is particularly apparent in cancer entities such as sarcomas for which the oncogenic drivers are often not clear. To investigate more systematically how the cellular proteotypes of sarcoma cells shape their response to molecularly targeted drugs, we profiled the proteomes and phosphoproteomes of 17 sarcoma cell lines and screened the same against 150 cancer drugs. The resulting 2550 phenotypic profiles revealed distinct drug responses and the cellular activity landscapes derived from deep (phospho)proteomes (9-10,000 proteins and 10-27,000 phosphorylation sites per cell line) enabled several lines of analysis. For instance, connecting the (phospho)proteomic data with drug responses revealed known and novel mechanisms of action (MoAs) of KIs and identified markers of drug sensitivity or resistance. All data is publicly accessible via an interactive web application that enables exploration of this rich molecular resource for a better understanding of active signalling pathways in sarcoma cells, identifying treatment response predictors and revealing novel MoA of clinical KIs.
Title: Illuminating phenotypic drug responses of sarcoma cells to kinase inhibitors by phosphoproteomics. Journal: Molecular Systems Biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10883282/pdf/
Pub Date: 2024-01-01; Epub Date: 2023-12-20; DOI: 10.1038/s44320-023-00001-w
Claudia Arnedo-Pac, Ferran Muiños, Abel Gonzalez-Perez, Nuria Lopez-Bigas
The sparsity of mutations observed across tumours hinders our ability to study mutation rate variability at nucleotide resolution. To circumvent this, here we investigated the propensity of mutational processes to form mutational hotspots as a readout of their mutation rate variability at single base resolution. Mutational signatures 1 and 17 have the highest hotspot propensity (5-78 times higher than other processes). After accounting for trinucleotide mutational probabilities, sequence composition and mutational heterogeneity at 10 Kbp, most (94-95%) signature 17 hotspots remain unexplained, suggesting a significant role of local genomic features. For signature 1, the inclusion of genome-wide distribution of methylated CpG sites into models can explain most (80-100%) of the hotspot propensity. There is an increased hotspot propensity of signature 1 in normal tissues and de novo germline mutations. We demonstrate that hotspot propensity is a useful readout to assess the accuracy of mutation rate models at nucleotide resolution. This new approach and the findings derived from it open up new avenues for a range of somatic and germline studies investigating and modelling mutagenesis.
Title: Hotspot propensity across mutational processes. Journal: Molecular Systems Biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10883281/pdf/
Pub Date: 2023-12-13; DOI: 10.1038/s44320-023-00002-9
Tatiana Woller, Christopher J Cawthorne, Romain Raymond Agnes Slootmaekers, Ingrid Barcena Roig, Alexander Botzki, Sebastian Munck
Title: What we can learn from deep space communication for reproducible bioimaging and data analysis. Journal: Molecular Systems Biology.
Pub Date: 2023-12-06; Epub Date: 2023-11-20; DOI: 10.15252/msb.202311801
Dokyun Na, Do-Hwan Lim, Jae-Sang Hong, Hyang-Mi Lee, Daeahn Cho, Myeong-Sang Yu, Bilal Shaker, Jun Ren, Bomi Lee, Jae Gwang Song, Yuna Oh, Kyungeun Lee, Kwang-Seok Oh, Mi Young Lee, Min-Seok Choi, Han Saem Choi, Yang-Hee Kim, Jennifer M Bui, Kangseok Lee, Hyung Wook Kim, Young Sik Lee, Jörg Gsponer
The accumulation of misfolded and aggregated proteins is a hallmark of neurodegenerative proteinopathies. Although multiple genetic loci have been associated with specific neurodegenerative diseases (NDs), molecular mechanisms that may have a broader relevance for most or all proteinopathies remain poorly resolved. In this study, we developed a multi-layered network expansion (MLnet) model to predict protein modifiers that are common to a group of diseases and, therefore, may have broader pathophysiological relevance for that group. When applied to the four NDs Alzheimer's disease (AD), Huntington's disease, and spinocerebellar ataxia types 1 and 3, we predicted multiple members of the insulin pathway, including PDK1, Akt1, InR, and sgg (GSK-3β), as common modifiers. We validated these modifiers with the help of four Drosophila ND models. Further evaluation of Akt1 in human cell-based ND models revealed that activation of Akt1 signaling by the small molecule SC79 increased cell viability in all models. Moreover, treatment of AD model mice with SC79 enhanced their long-term memory and ameliorated dysregulated anxiety levels, which are commonly affected in AD patients. These findings validate MLnet as a valuable tool to uncover molecular pathways and proteins involved in the pathophysiology of entire disease groups and identify potential therapeutic targets that have relevance across disease boundaries. MLnet can be used for any group of diseases and is available as a web tool at http://ssbio.cau.ac.kr/software/mlnet.
Title: A multi-layered network model identifies Akt1 as a common modulator of neurodegeneration. Journal: Molecular Systems Biology. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10698508/pdf/
Pub Date: 2023-12-06 | Epub Date: 2023-10-27 | DOI: 10.15252/msb.202311566
David B Bernstein, Batu Akkas, Morgan N Price, Adam P Arkin
The Escherichia coli genome-scale metabolic model (GEM) is an exemplar systems biology model for the simulation of cellular metabolism. Experimental validation of model predictions is essential to pinpoint uncertainty and ensure continued development of accurate models. Here, we quantified the accuracy of four subsequent E. coli GEMs using published mutant fitness data across thousands of genes and 25 different carbon sources. This evaluation demonstrated the utility of the area under a precision-recall curve relative to alternative accuracy metrics. An analysis of errors in the latest (iML1515) model identified several vitamins/cofactors that are likely available to mutants despite being absent from the experimental growth medium and highlighted isoenzyme gene-protein-reaction mapping as a key source of inaccurate predictions. A machine learning approach further identified metabolic fluxes through hydrogen ion exchange and specific central metabolism branch points as important determinants of model accuracy. This work outlines improved practices for the assessment of GEM accuracy with high-throughput mutant fitness data and highlights promising areas for future model refinement in E. coli and beyond.
{"title":"Evaluating E. coli genome-scale metabolic model accuracy with high-throughput mutant fitness data.","authors":"David B Bernstein, Batu Akkas, Morgan N Price, Adam P Arkin","doi":"10.15252/msb.202311566","journal":{"name":"Molecular Systems Biology"},"publicationDate":"2023-12-06","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10698504/pdf/"}
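The E. coli GEM abstract above relies on the area under a precision-recall curve as its accuracy metric. As a minimal sketch of how such a value can be computed, the Python below ranks mutants by a predicted score and sums precision over each gain in recall (the step-wise "average precision" estimate). It is not the authors' pipeline: the function names, the toy data, and the choice of "mutant shows a fitness defect" as the positive class are assumptions made for illustration.

```python
from typing import List, Tuple


def precision_recall_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Return (recall, precision) pairs, lowering the threshold one distinct score at a time.

    scores: hypothetical model output, higher = stronger predicted defect.
    labels: 1 if the experiment shows a fitness defect (assumed positive class), else 0.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Items with tied scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((tp / total_pos, tp / (tp + fp)))
        i = j
    return points


def average_precision(scores: List[float], labels: List[int]) -> float:
    """Step-wise area under the precision-recall curve."""
    area = 0.0
    prev_recall = 0.0
    for recall, precision in precision_recall_points(scores, labels):
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


if __name__ == "__main__":
    # Toy data (invented): five gene/carbon-source pairs.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
    toy_labels = [1, 0, 1, 1, 0]
    print(average_precision(toy_scores, toy_labels))
```

One reason a precision-recall summary can suit this kind of data is that precision and recall ignore true negatives, so the score is not inflated when one class greatly outnumbers the other; whether that was the authors' motivation is not stated in the abstract.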