Pub Date: 2023-09-20. Epub Date: 2023-08-23. DOI: 10.1016/j.cels.2023.07.007
Yuxing Liao, Sara R Savage, Yongchao Dou, Zhiao Shi, Xinpei Yi, Wen Jiang, Jonathan T Lei, Bing Zhang
By combining mass-spectrometry-based proteomics and phosphoproteomics with genomics, epigenomics, and transcriptomics, proteogenomics provides comprehensive molecular characterization of cancer. Using this approach, the Clinical Proteomic Tumor Analysis Consortium (CPTAC) has characterized over 1,000 primary tumors spanning 10 cancer types, many with matched normal tissues. Here, we present LinkedOmicsKB, a proteogenomics data-driven knowledge base that makes consistently processed and systematically precomputed CPTAC pan-cancer proteogenomics data available to the public through ∼40,000 gene-, protein-, mutation-, and phenotype-centric web pages. Visualization techniques facilitate efficient exploration and reasoning of complex, interconnected data. Using three case studies, we illustrate the practical utility of LinkedOmicsKB in providing new insights into genes, phosphorylation sites, somatic mutations, and cancer phenotypes. With precomputed results of 19,701 coding genes, 125,969 phosphosites, and 256 genotypes and phenotypes, LinkedOmicsKB provides a comprehensive resource to accelerate proteogenomics data-driven discoveries to improve our understanding and treatment of human cancer. A record of this paper's transparent peer review process is included in the supplemental information.
A proteogenomics data-driven knowledge base of human cancer. Cell Systems, pages 777-787.e5. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10530292/pdf/
Pub Date: 2023-09-20. Epub Date: 2023-08-25. DOI: 10.1016/j.cels.2023.07.008
Teresa E Knudsen, William B Hamilton, Martin Proks, Maria Lykkegaard, Madeleine Linneberg-Agerholm, Alexander V Nielsen, Marta Perera, Luna Lynge Malzard, Ala Trusina, Joshua M Brickman
Cooperative DNA binding of transcription factors (TFs) integrates the cellular context to support cell specification during development. Naive mouse embryonic stem cells are derived from early development and can sustain their pluripotent identity indefinitely. Here, we ask whether TFs associated with pluripotency evolved to directly support this state or if the state emerges from their combinatorial action. NANOG and ESRRB are key pluripotency factors that co-bind DNA. We find that when both factors are expressed, ESRRB supports pluripotency. However, when NANOG is absent, ESRRB supports a bistable culture of cells with an embryo-like primitive endoderm identity ancillary to pluripotency. The stoichiometry between NANOG and ESRRB allows quantitative titration of this differentiation, and in silico modeling of bipartite ESRRB activity suggests it safeguards plasticity in differentiation. Thus, the concerted activity of cooperative TFs can transform their effect to sustain intermediate cell identities and allow ex vivo expansion of immortal stem cells. A record of this paper's transparent peer review process is included in the supplemental information.
A bipartite function of ESRRB can integrate signaling over time to balance self-renewal and differentiation. Cell Systems, pages 788-805.e8.
Pub Date: 2023-09-20. Epub Date: 2023-08-04. DOI: 10.1016/j.cels.2023.07.001
Adi X Mukund, Josh Tycko, Sage J Allen, Stephanie A Robinson, Cecelia Andrews, Joydeb Sinha, Connor H Ludwig, Kaitlyn Spees, Michael C Bassik, Lacramioara Bintu
Despite growing knowledge of the functions of individual human transcriptional effector domains, much less is understood about how multiple effector domains within the same protein combine to regulate gene expression. Here, we measure transcriptional activity for 8,400 effector domain combinations by recruiting them to reporter genes in human cells. In our assay, weak and moderate activation domains synergize to drive strong gene expression, whereas combining strong activators often results in weaker activation. In contrast, repressors combine linearly and produce full gene silencing, and repressor domains often overpower activation domains. We use this information to build a synthetic transcription factor whose function can be tuned between repression and activation independent of recruitment to target genes by using a small-molecule drug. Altogether, we outline the basic principles of how effector domains combine to regulate gene expression and demonstrate their value in building precise and flexible synthetic biology tools. A record of this paper's transparent peer review process is included in the supplemental information.
High-throughput functional characterization of combinations of transcriptional activators and repressors. Cell Systems, pages 746-763.e5. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10642976/pdf/
Pub Date: 2023-08-16. DOI: 10.1016/j.cels.2023.04.008
Ashley N Hersey, Valerie E Kay, Sumin Lee, Matthew J Realff, Corey J Wilson
Allosteric transcription factors (aTFs) are used in a myriad of processes throughout biology and biotechnology. aTFs have served as the workhorses for developments in synthetic biology, fundamental research, and protein manufacturing. One of the most utilized TFs is the lactose repressor (LacI). In addition to being an exceptional tool for gene regulation, LacI has also served as an outstanding model system for understanding allosteric communication. In this perspective, we will use the LacI TF as the principal exemplar for engineering alternate functions related to allostery, i.e., alternate protein-DNA interactions, alternate protein-ligand interactions, and alternate phenotypic mechanisms. In addition, we will summarize the design rules and heuristics for each design goal and demonstrate how the resulting design rules and heuristics can be extrapolated to engineer other aTFs with a similar topology, i.e., from the broader LacI/GalR family of TFs.
Engineering allosteric transcription factors guided by the LacI topology. Cell Systems 14(8): 645-655.
Pub Date: 2023-08-16. DOI: 10.1016/j.cels.2023.07.006
Atul Deshpande, Melanie Loth, Dimitrios N Sidiropoulos, Shuming Zhang, Long Yuan, Alexander T F Bell, Qingfeng Zhu, Won Jin Ho, Cesar Santa-Maria, Daniele M Gilkes, Stephen R Williams, Cedric R Uytingco, Jennifer Chew, Andrej Hartnett, Zachary W Bent, Alexander V Favorov, Aleksander S Popel, Mark Yarchoan, Ashley Kiemen, Pei-Hsun Wu, Kohei Fujikura, Denis Wirtz, Laura D Wood, Lei Zheng, Elizabeth M Jaffee, Robert A Anders, Ludmila Danilova, Genevieve Stein-O'Brien, Luciane T Kagohara, Elana J Fertig
Uncovering the spatial landscape of molecular interactions within the tumor microenvironment through latent spaces. Cell Systems 14(8): 722. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10523348/pdf/nihms-1925932.pdf
Pub Date: 2023-08-16. DOI: 10.1016/j.cels.2023.04.009
Emily K Makowski, Hsin-Ting Chen, Peter M Tessier
Machine learning is transforming antibody engineering by enabling the generation of drug-like monoclonal antibodies with unprecedented efficiency. Unsupervised algorithms trained on massive and diverse protein sequence datasets facilitate the prediction of panels of antibody variants with native-like intrinsic properties (e.g., high stability), greatly reducing the amount of subsequent experimentation needed to identify specific candidates that also possess desired extrinsic properties (e.g., high affinity). Additionally, supervised algorithms, which are trained on deep sequencing datasets obtained after enrichment of in vitro antibody libraries for one or more specific extrinsic properties, enable the prediction of antibody variants with desired combinations of extrinsic properties without the need for additional screening. Here we review recent advances using both machine learning approaches and how they are impacting the field of antibody engineering as well as key outstanding challenges and opportunities for these paradigm-changing methods.
Simplifying complex antibody engineering using machine learning. Cell Systems 14(8): 667-675. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10733906/pdf/
One of the key points of machine learning-assisted directed evolution (MLDE) is the accurate learning of the fitness landscape, a conceptual mapping from sequence variants to the desired function. Here, we describe a multi-protein training scheme that leverages the existing deep mutational scanning data from diverse proteins to aid in understanding the fitness landscape of a new protein. Proof-of-concept trials are designed to validate this training scheme in three aspects: random and positional extrapolation for single-variant effects, zero-shot fitness predictions for new proteins, and extrapolation for higher-order variant effects from single-variant effects. Moreover, our study identified previously overlooked strong baselines, and their unexpectedly good performance brings our attention to the pitfalls of MLDE. Overall, these results may improve our understanding of the association between different protein fitness profiles and shed light on developing better machine learning-assisted approaches to the directed evolution of proteins. A record of this paper's transparent peer review process is included in the supplemental information.
Lin Chen, Zehong Zhang, Zhenghao Li, Rui Li, Ruifeng Huo, Lifan Chen, Dingyan Wang, Xiaomin Luo, Kaixian Chen, Cangsong Liao, Mingyue Zheng. Learning protein fitness landscapes with deep mutational scanning data from multiple sources. Cell Systems 14(8): 706-721.e5. Pub Date: 2023-08-16. DOI: 10.1016/j.cels.2023.07.003
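The fitness-landscape abstract above mentions extrapolating higher-order variant effects from single-variant effects. As a minimal sketch of the simplest form such an extrapolation can take, assuming effects are measured on an additive scale (for example, log-enrichment relative to wild type), the snippet below sums single-mutation scores to score a multi-mutant. The function name, mutation labels, and scores are hypothetical; this additive model is an illustration only, not the authors' training scheme, and the abstract does not say whether it is among the baselines they evaluated.

```python
from typing import Dict, Iterable


def additive_fitness(single_effects: Dict[str, float], variant: Iterable[str]) -> float:
    """Predict a multi-mutant score as the sum of its single-mutation effects.

    Assumes effects are on an additive scale (e.g., log-enrichment versus
    wild type) and ignores epistasis between positions.
    """
    return sum(single_effects[mutation] for mutation in variant)


# Hypothetical single-variant effects, as a deep mutational scan might report.
single_effects = {"A12G": -0.5, "K45R": 0.25, "L78P": -2.0}

double_mutant = additive_fitness(single_effects, ["A12G", "K45R"])
triple_mutant = additive_fitness(single_effects, ["A12G", "K45R", "L78P"])
print(double_mutant, triple_mutant)
```

Any gap between such additive predictions and measured multi-mutant fitness reflects interactions between mutations, which is what a learned model would need to capture to improve on this kind of baseline.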
Protein-ligand interactions are essential for cellular activities and drug discovery processes. Appropriately and effectively representing protein features is of vital importance for developing computational approaches, especially data-driven methods, for predicting protein-ligand interactions. However, existing approaches may not fully investigate the features of the ligand-occupying regions in the protein pockets. Here, we design a structure-based protein representation method, named PocketAnchor, for capturing the local environmental and spatial features of protein pockets to facilitate protein-ligand interaction-related learning tasks. We define "anchors" as probe points reaching into the cavities and those located near the surface of proteins, and we design a specific message passing strategy for gathering local information from the atoms and surface neighboring these anchors. Comprehensive evaluation of our method demonstrated its successful applications in pocket detection and binding affinity prediction, which indicated that our anchor-based approach can provide effective protein feature representations for improving the prediction of protein-ligand interactions.
Shuya Li, Tingzhong Tian, Ziting Zhang, Ziheng Zou, Dan Zhao, Jianyang Zeng. PocketAnchor: Learning structure-based pocket representations for protein-ligand interaction prediction. Cell Systems 14(8): 692-705.e6. Pub Date: 2023-08-16. DOI: 10.1016/j.cels.2023.05.005
Pub Date: 2023-08-16. Epub Date: 2023-07-25. DOI: 10.1016/j.cels.2023.06.009
Adam McConnell, Benjamin J Hackel
Discovery and evolution of new and improved proteins has empowered molecular therapeutics, diagnostics, and industrial biotechnology. Discovery and evolution both require efficient screens and effective libraries, although they differ in their challenges because of the absence or presence, respectively, of an initial protein variant with the desired function. A host of high-throughput technologies, experimental and computational, enable efficient screens to identify performant protein variants. In partnership, an informed search of sequence space is needed to overcome the immensity, sparsity, and complexity of the sequence-performance landscape. Early in the historical trajectory of protein engineering, these elements aligned with distinct approaches to identify the most performant sequence: selection from large, randomized combinatorial libraries versus rational computational design. Substantial advances have now emerged from the synergy of these perspectives. Rational design of combinatorial libraries aids the experimental search of sequence space, and high-throughput, high-integrity experimental data inform computational design. At the core of the collaborative interface, efficient protein characterization (rather than mere selection of optimal variants) maps sequence-performance landscapes. Such quantitative maps elucidate the complex relationships between protein sequence and performance (e.g., binding, catalytic efficiency, biological activity, and developability), thereby advancing fundamental protein science and facilitating protein discovery and evolution.
Protein engineering via sequence-performance mapping. Cell Systems 14(8): 656-666. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10527434/pdf/
Proteins are critical to cellular function and survival. They are complex molecules with precise structures and chemistries, which allow them to serve diverse functions for maintaining overall cell homeostasis. Since the discovery of the first enzyme in 1833, a gamut of advanced experimental and computational tools has been developed and deployed for understanding protein structure and function. Recent studies have demonstrated the ability to redesign/alter natural proteins for applications in industrial processes of interest and to make customized, novel synthetic proteins in the laboratory through protein engineering. We comprehensively review the successes in engineering pore-forming proteins and correlate the amino acid-level biochemistry of different pore modification strategies to the intended applications limited to nucleotide/peptide sequencing, single-molecule sensing, and precise molecular separations.
Laxmicharan Samineni, Bibek Acharya, Harekrushna Behera, Hyeonji Oh, Manish Kumar, Ratul Chowdhury. Protein engineering of pores for separation, sensing, and sequencing. Cell Systems 14(8): 676-691. Pub Date: 2023-08-16. DOI: 10.1016/j.cels.2023.07.004