Pub Date: 2024-11-19; DOI: 10.1038/s44320-024-00075-0
David W Morgens, Leah Gulyas, Xiaowen Mao, Alejandro Rivera-Madera, Annabelle S Souza, Britt A Glaunsinger
Complex transcriptional control is a conserved feature of both eukaryotes and the viruses that infect them. Despite viral genomes being smaller and more gene dense than their hosts, we generally lack a sense of scope for the features governing the transcriptional output of individual viral genes. Even having a seemingly simple expression pattern does not imply that a gene's underlying regulation is straightforward. Here, we illustrate this by combining high-density functional genomics, expression profiling, and viral-specific chromosome conformation capture to define with unprecedented detail the transcriptional regulation of a single gene from Kaposi's sarcoma-associated herpesvirus (KSHV). We used as our model KSHV ORF68, which has simple, early expression kinetics and is essential for viral genome packaging. We first identified seven cis-regulatory regions involved in ORF68 expression by densely tiling the ~154 kb KSHV genome with dCas9 fused to a transcriptional repressor domain (CRISPRi). A parallel Cas9 nuclease screen indicated that three of these regions act as promoters of genes that regulate ORF68. RNA expression profiling demonstrated that three more of these regions act by either repressing or enhancing other distal viral genes involved in ORF68 transcriptional regulation. Finally, we tracked how the 3D structure of the viral genome changes during its lifecycle, revealing that these enhancing regulatory elements are physically closer to their targets when active, and that disrupting some elements caused large-scale changes to the 3D genome. These data enable us to construct a complete model revealing that the mechanistic diversity of this essential regulatory circuit matches that of human genes.
{"title":"Enhancers and genome conformation provide complex transcriptional control of a herpesviral gene.","authors":"David W Morgens, Leah Gulyas, Xiaowen Mao, Alejandro Rivera-Madera, Annabelle S Souza, Britt A Glaunsinger","doi":"10.1038/s44320-024-00075-0","DOIUrl":"https://doi.org/10.1038/s44320-024-00075-0","journal":{"name":"Molecular Systems Biology"},"PeriodicalIF":8.5,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142676266","RegionNum":1,"RegionCategory":"Biology"}
Pub Date: 2024-11-19; DOI: 10.1038/s44320-024-00076-z
Deepak T Patel, Peter J Stogios, Lukasz Jaroszewski, Malene L Urbanus, Mayya Sedova, Cameron Semper, Cathy Le, Abraham Takkouche, Keita Ichii, Julie Innabi, Dhruvin H Patel, Alexander W Ensminger, Adam Godzik, Alexei Savchenko
Legionella pneumophila utilizes the Dot/Icm type IVB secretion system to deliver hundreds of effector proteins inside eukaryotic cells to ensure intracellular replication. Our understanding of the molecular functions of the largest pathogenic arsenal known to the bacterial world remains incomplete. By leveraging advancements in 3D protein structure prediction, we provide a comprehensive structural analysis of 368 L. pneumophila effectors, representing a global atlas of predicted functional domains summarized in a database ( https://pathogens3d.org/legionella-pneumophila ). Our analysis identified 157 types of diverse functional domains in 287 effectors, including 159 effectors with no prior functional annotations. Furthermore, we identified 35 cryptic domains in 30 effector models that have no similarity with experimentally structurally characterized proteins, thus hinting at novel functionalities. Using this analysis, we demonstrate the activity of thirteen functional domains, including three cryptic domains, predicted in L. pneumophila effectors to cause growth defects in the Saccharomyces cerevisiae model system. This illustrates an emerging strategy of exploring synergies between predictions and targeted experimental approaches in elucidating novel effector activities involved in infection.
{"title":"Global atlas of predicted functional domains in Legionella pneumophila Dot/Icm translocated effectors.","authors":"Deepak T Patel, Peter J Stogios, Lukasz Jaroszewski, Malene L Urbanus, Mayya Sedova, Cameron Semper, Cathy Le, Abraham Takkouche, Keita Ichii, Julie Innabi, Dhruvin H Patel, Alexander W Ensminger, Adam Godzik, Alexei Savchenko","doi":"10.1038/s44320-024-00076-z","DOIUrl":"https://doi.org/10.1038/s44320-024-00076-z","journal":{"name":"Molecular Systems Biology"},"PeriodicalIF":8.5,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142676352","RegionNum":1,"RegionCategory":"Biology"}
Pub Date: 2024-11-15; DOI: 10.1038/s44320-024-00073-2
David Steinbrecht, Igor Minia, Miha Milek, Johannes Meisig, Nils Blüthgen, Markus Landthaler
Eukaryotic mRNAs are transcribed, processed, translated, and degraded in different subcellular compartments. Here, we measured mRNA flow rates between subcellular compartments in mouse embryonic stem cells. By combining metabolic RNA labeling, biochemical fractionation, mRNA sequencing, and mathematical modeling, we determined the half-lives of nuclear pre-, nuclear mature, cytosolic, and membrane-associated mRNAs from over 9000 genes. In addition, we estimated transcript elongation rates. Many matured mRNAs have long nuclear half-lives, indicating nuclear retention as the rate-limiting step in the flow of mRNAs. In contrast, mRNA transcripts coding for transcription factors show fast kinetic rates, and in particular short nuclear half-lives. Differentially localized mRNAs have distinct rate constant combinations, implying modular regulation. Membrane stability is high for membrane-localized mRNA and cytosolic stability is high for cytosol-localized mRNA. mRNAs encoding target signals for membranes have low cytosolic and high membrane half-lives with minor differences between signals. Transcripts of nuclear-encoded mitochondrial proteins have long nuclear retention and cytoplasmic kinetics that do not reflect co-translational targeting. Our data and analyses provide a useful resource to study spatiotemporal gene expression regulation.
{"title":"Subcellular mRNA kinetic modeling reveals nuclear retention as rate-limiting.","authors":"David Steinbrecht, Igor Minia, Miha Milek, Johannes Meisig, Nils Blüthgen, Markus Landthaler","doi":"10.1038/s44320-024-00073-2","DOIUrl":"https://doi.org/10.1038/s44320-024-00073-2","journal":{"name":"Molecular Systems Biology"},"PeriodicalIF":8.5,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"142639366","RegionNum":1,"RegionCategory":"Biology"}
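The link between compartment half-lives and steady-state mRNA pools in the entry above can be illustrated with a small sketch. It assumes plain first-order kinetics and a two-step chain (nucleus to cytosol to decay) in which nuclear mRNA is lost only by export. This is a simplification for illustration, not the model fitted in the paper, and the function names are hypothetical.

```python
import math


def rate_from_half_life(t_half):
    """First-order rate constant (1/time) corresponding to a half-life."""
    return math.log(2) / t_half


def half_life_from_rate(k):
    """Half-life corresponding to a first-order rate constant."""
    return math.log(2) / k


def steady_state_pools(synthesis, nuclear_half_life, cytosolic_half_life):
    """Steady-state nuclear and cytosolic mRNA amounts.

    Assumption: every transcript leaves the nucleus by export (rate k_export)
    and is then degraded in the cytosol (rate k_decay), so at steady state the
    flux through each pool equals the synthesis rate.
    """
    k_export = rate_from_half_life(nuclear_half_life)
    k_decay = rate_from_half_life(cytosolic_half_life)
    nuclear = synthesis / k_export
    cytosolic = synthesis / k_decay
    return nuclear, cytosolic


# Example: nuclear half-life of 60 min, cytosolic half-life of 120 min.
nuclear, cytosolic = steady_state_pools(1.0, 60.0, 120.0)
nuclear_fraction = nuclear / (nuclear + cytosolic)
```

Under these assumptions the pool sizes are proportional to the half-lives, so a long nuclear half-life directly translates into a large nuclear share of the transcript.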
Pub Date: 2024-11-01; Epub Date: 2024-09-19; DOI: 10.1038/s44320-024-00063-4
Martin Giera, Aries Aisporna, Winnie Uritboonthai, Linh Hoang, Rico J E Derks, Kara M Joseph, Erin S Baker, Gary Siuzdak
{"title":"XCMS-METLIN: data-driven metabolite, lipid, and chemical analysis.","authors":"Martin Giera, Aries Aisporna, Winnie Uritboonthai, Linh Hoang, Rico J E Derks, Kara M Joseph, Erin S Baker, Gary Siuzdak","doi":"10.1038/s44320-024-00063-4","DOIUrl":"10.1038/s44320-024-00063-4","journal":{"name":"Molecular Systems Biology","pages":"1153-1155"},"PeriodicalIF":8.5,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11535300/pdf/","platform":"Semanticscholar","paperid":"142291590","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
Pub Date: 2024-11-01; Epub Date: 2024-10-07; DOI: 10.1038/s44320-024-00068-z
Bradley W Biggs, Morgan N Price, Dexter Lai, Jasmine Escobedo, Yuridia Fortanel, Yolanda Y Huang, Kyoungmin Kim, Valentine V Trotter, Jennifer V Kuehl, Lauren M Lui, Romy Chakraborty, Adam M Deutschbauer, Adam P Arkin
Our ability to predict, control, or design biological function is fundamentally limited by poorly annotated gene function. This can be particularly challenging in non-model systems. Accordingly, there is motivation for new high-throughput methods for accurate functional annotation. Here, we used complementation of auxotrophs and DNA barcode sequencing (Coaux-Seq) to enable high-throughput characterization of protein function. Fragment libraries from eleven genetically diverse bacteria were tested in twenty different auxotrophic strains of Escherichia coli to identify genes that complement missing biochemical activity. We recovered 41% of expected hits, with effectiveness ranging per source genome, and observed success even with distant E. coli relatives like Bacillus subtilis and Bacteroides thetaiotaomicron. Coaux-Seq provided the first experimental validation for 53 proteins, of which 11 are less than 40% identical to an experimentally characterized protein. Among the unexpected functions identified were a sulfate uptake transporter, an O-succinylhomoserine sulfhydrylase for methionine synthesis, and an aminotransferase. We also identified instances of cross-feeding wherein protein overexpression and nearby non-auxotrophic strains enabled growth. Altogether, Coaux-Seq's utility is demonstrated, with future applications in ecology, health, and engineering.
{"title":"High-throughput protein characterization by complementation using DNA barcoded fragment libraries.","authors":"Bradley W Biggs, Morgan N Price, Dexter Lai, Jasmine Escobedo, Yuridia Fortanel, Yolanda Y Huang, Kyoungmin Kim, Valentine V Trotter, Jennifer V Kuehl, Lauren M Lui, Romy Chakraborty, Adam M Deutschbauer, Adam P Arkin","doi":"10.1038/s44320-024-00068-z","DOIUrl":"10.1038/s44320-024-00068-z","journal":{"name":"Molecular Systems Biology","pages":"1207-1229"},"PeriodicalIF":8.5,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11535334/pdf/","platform":"Semanticscholar","paperid":"142391857","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
Pub Date: 2024-11-01; DOI: 10.1038/s44320-024-00062-5
Matteo Mori, Zhongge Zhang, Amir Banaei-Esfahani, Jean-Benoît Lalanne, Hiroyuki Okano, Ben C Collins, Alexander Schmidt, Olga T Schubert, Deok-Sun Lee, Gene-Wei Li, Ruedi Aebersold, Terence Hwa, Christina Ludwig
{"title":"Author Correction: From coarse to fine: the absolute Escherichia coli proteome under diverse growth conditions.","authors":"Matteo Mori, Zhongge Zhang, Amir Banaei-Esfahani, Jean-Benoît Lalanne, Hiroyuki Okano, Ben C Collins, Alexander Schmidt, Olga T Schubert, Deok-Sun Lee, Gene-Wei Li, Ruedi Aebersold, Terence Hwa, Christina Ludwig","doi":"10.1038/s44320-024-00062-5","DOIUrl":"10.1038/s44320-024-00062-5","journal":{"name":"Molecular Systems Biology","pages":"1257-1259"},"PeriodicalIF":8.5,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11535196/pdf/","platform":"Semanticscholar","paperid":"142365857","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
Pub Date: 2024-11-01; Epub Date: 2024-09-30; DOI: 10.1038/s44320-024-00069-y
Lili M Kim, Horia Todor, Carol A Gross
Chemical genomics is a powerful and increasingly accessible technique to probe gene function, gene-gene interactions, and antibiotic synergies and antagonisms. Indeed, multiple large-scale pooled datasets in diverse organisms have been published. Here, we identify an artifact arising from uncorrected differences in the number of cell doublings between experiments within such datasets. We demonstrate that this artifact is widespread, show how it causes spurious gene-gene and drug-drug correlations, and present a simple but effective post hoc method for removing its effects. Using several published datasets, we demonstrate that this correction removes spurious correlations between genes and conditions, improving data interpretability and revealing new biological insights. Finally, we determine experimental factors that predispose a dataset to this artifact and suggest a set of experimental and computational guidelines for performing pooled chemical genomics experiments that will maximize the potential of this powerful technique.
{"title":"Correction of a widespread bias in pooled chemical genomics screens improves their interpretability.","authors":"Lili M Kim, Horia Todor, Carol A Gross","doi":"10.1038/s44320-024-00069-y","DOIUrl":"10.1038/s44320-024-00069-y","journal":{"name":"Molecular Systems Biology","pages":"1173-1186"},"PeriodicalIF":8.5,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11535069/pdf/","platform":"Semanticscholar","paperid":"142350452","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
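Why the number of doublings matters in the entry above can be shown with a short sketch. It assumes simple exponential growth, under which a mutant with relative growth rate w changes in abundance by log2 fold change = (w - 1) * g after the reference strain has doubled g times. Dividing each experiment's log fold changes by g is one simple normalization consistent with that assumption. It is not necessarily the post hoc method the authors propose, and the function name is hypothetical.

```python
def per_doubling_fitness(log2_fold_changes, doublings):
    """Rescale log2 fold changes to a per-doubling fitness effect.

    Assumption: exponential growth, so the log2 fold change of a mutant
    scales linearly with the number of doublings of the pooled culture.
    """
    if doublings <= 0:
        raise ValueError("doublings must be positive")
    return [lfc / doublings for lfc in log2_fold_changes]


# Two hypothetical experiments with identical per-doubling fitness effects
# but different numbers of doublings produce different raw fold changes.
experiment_a = per_doubling_fitness([-1.0, -2.5, 0.0], doublings=5)
experiment_b = per_doubling_fitness([-2.0, -5.0, 0.0], doublings=10)
```

After rescaling, the two experiments agree, whereas the raw values differ by a factor of two and would inflate correlations among all strains with nonzero fitness effects.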
Pub Date: 2024-11-01; Epub Date: 2024-09-25; DOI: 10.1038/s44320-024-00065-2
Yeonghun Lee, Sung-Hye Park, Hyunju Lee
Predicting the 3D genome in cancer is crucial for uncovering the impact of structural variations (SVs) on tumorigenesis, especially when they are present in noncoding regions. We present InfoHiC, a systemic framework for predicting the 3D cancer genome directly from whole-genome sequencing (WGS). InfoHiC utilizes contig-specific copy number encoding on the SV contig assembly, and performs a contig-to-total Hi-C conversion for the cancer Hi-C prediction from multiple SV contigs. We showed that InfoHiC can predict 3D genome folding from all types of SVs using breast cancer cell line data. We applied it to WGS data of patients with breast cancer and pediatric patients with medulloblastoma, and identified neo-topologically associating domains. For breast cancer, we discovered super-enhancer hijacking events associated with oncogenic overexpression and poor survival outcomes. For medulloblastoma, we found SVs in noncoding regions that caused super-enhancer hijacking events of medulloblastoma driver genes (GFI1, GFI1B, and PRDM6). In addition, we provide trained models for cancer Hi-C prediction from WGS at https://github.com/dmcb-gist/InfoHiC , uncovering the impacts of SVs in cancer patients and revealing novel therapeutic targets.
{"title":"Prediction of the 3D cancer genome from whole-genome sequencing using InfoHiC.","authors":"Yeonghun Lee, Sung-Hye Park, Hyunju Lee","doi":"10.1038/s44320-024-00065-2","DOIUrl":"10.1038/s44320-024-00065-2","journal":{"name":"Molecular Systems Biology","pages":"1156-1172"},"PeriodicalIF":8.5,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11535030/pdf/","platform":"Semanticscholar","paperid":"142350453","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
Pub Date: 2024-11-01; Epub Date: 2024-09-27; DOI: 10.1038/s44320-024-00064-3
Andrew J Sweatt, Cameron D Griffiths, Sarah M Groves, B Bishal Paudel, Lixin Wang, David F Kashatus, Kevin A Janes
Protein copy numbers constrain systems-level properties of regulatory networks, but proportional proteomic data remain scarce compared to RNA-seq. We related mRNA to protein statistically using best-available data from quantitative proteomics and transcriptomics for 4366 genes in 369 cell lines. The approach starts with a protein's median copy number and hierarchically appends mRNA-protein and mRNA-mRNA dependencies to define an optimal gene-specific model linking mRNAs to protein. For dozens of cell lines and primary samples, these protein inferences from mRNA outmatch stringent null models, a count-based protein-abundance repository, empirical mRNA-to-protein ratios, and a proteogenomic DREAM challenge winner. The optimal mRNA-to-protein relationships capture biological processes along with hundreds of known protein-protein complexes, suggesting mechanistic relationships. We use the method to identify a viral-receptor abundance threshold for coxsackievirus B3 susceptibility from 1489 systems-biology infection models parameterized by protein inference. When applied to 796 RNA-seq profiles of breast cancer, inferred copy-number estimates collectively re-classify 26-29% of luminal tumors. By adopting a gene-centered perspective of mRNA-protein covariation across different biological contexts, we achieve accuracies comparable to the technical reproducibility of contemporary proteomics.
{"title":"Proteome-wide copy-number estimation from transcriptomics.","authors":"Andrew J Sweatt, Cameron D Griffiths, Sarah M Groves, B Bishal Paudel, Lixin Wang, David F Kashatus, Kevin A Janes","doi":"10.1038/s44320-024-00064-3","DOIUrl":"10.1038/s44320-024-00064-3","journal":{"name":"Molecular Systems Biology","pages":"1230-1256"},"PeriodicalIF":8.5,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11535397/pdf/","platform":"Semanticscholar","paperid":"142350454","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
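The idea in the entry above of starting from a protein's median copy number and adding an mRNA dependency only when it helps can be sketched as a nested model comparison. Everything below is an illustrative assumption: the function names, the least-squares fit on log-scale values, and the `min_gain` threshold are hypothetical and are not taken from the paper, whose actual model-selection procedure is not described in the abstract.

```python
from statistics import mean, median


def fit_median_model(protein_log):
    """Baseline: predict the median protein level regardless of mRNA."""
    m = median(protein_log)
    return lambda mrna: m


def fit_linear_model(mrna_log, protein_log):
    """Ordinary least squares of protein level on the gene's own mRNA level."""
    mx, my = mean(mrna_log), mean(protein_log)
    sxx = sum((x - mx) ** 2 for x in mrna_log)
    if sxx == 0:
        # No mRNA variation, so the slope is undefined; fall back to baseline.
        return fit_median_model(protein_log)
    slope = sum((x - mx) * (y - my) for x, y in zip(mrna_log, protein_log)) / sxx
    intercept = my - slope * mx
    return lambda mrna: intercept + slope * mrna


def sum_squared_error(model, mrna_log, protein_log):
    return sum((model(x) - y) ** 2 for x, y in zip(mrna_log, protein_log))


def choose_model(mrna_log, protein_log, min_gain=0.5):
    """Keep the median-only model unless the mRNA term removes at least
    `min_gain` of the baseline squared error (an arbitrary illustrative cutoff)."""
    base = fit_median_model(protein_log)
    linear = fit_linear_model(mrna_log, protein_log)
    base_err = sum_squared_error(base, mrna_log, protein_log)
    lin_err = sum_squared_error(linear, mrna_log, protein_log)
    if base_err > 0 and (base_err - lin_err) / base_err >= min_gain:
        return "mrna", linear
    return "median", base


# Hypothetical log-scale measurements across four cell lines.
label, model = choose_model([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

A gene whose protein level does not track its mRNA keeps the median-only model, which is the gene-specific behavior the abstract describes at a high level.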