Hui-Ling Huang, Chong-Heng Weng, Torbjörn E M Nordling, Yi-Fan Liou
ThermalProGAN: A sequence-based thermally stable protein generator trained using unpaired data.
Journal of Bioinformatics and Computational Biology, 21(1), 2350008. Published 2023-02-01. DOI: 10.1142/S0219720023500087
Abstract
Motivation: Synthesizing proteins with novel desired properties is challenging but sought after by both industry and academia. The dominant approach relies on introducing point mutations by trial and error, assisted by structural information or by predictive models built from paired data that are difficult to collect. This study proposes a sequence-based unpaired-sample of novel protein inventor (SUNI) and uses it to build ThermalProGAN, which generates thermally stable proteins from sequence information alone.
Results: ThermalProGAN can substantially mutate the input sequence, changing a median of 32 residues. A known normal protein, 1RG0, was used to generate a thermally stable form by mutating 51 residues. Superimposing the two structures shows high similarity, indicating that the basic function would be conserved. Eighty-four molecular dynamics simulations of 1RG0 and the COVID-19 vaccine candidates, with a total simulation time of 840 ns, indicate that thermal stability increased.
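The "median number of mutated residues" reported above can be illustrated with a minimal sketch. It assumes that each generated sequence has the same length as its input, so that mutations are simple position-wise substitutions; the abstract does not state how the count was actually computed, and the function names and toy sequences below are hypothetical.

```python
from statistics import median


def count_mutations(original: str, generated: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(original) != len(generated):
        # Assumption: no insertions or deletions; align the sequences first otherwise.
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(original, generated))


def median_mutations(pairs):
    """Median number of substituted residues over (original, generated) pairs."""
    return median(count_mutations(o, g) for o, g in pairs)


# Toy sequences invented for illustration only (not data from the paper).
toy_pairs = [
    ("MKTAYIAK", "MKTAYIAK"),  # 0 substitutions
    ("MKTAYIAK", "MRTAYLAK"),  # 2 substitutions
    ("MKTAYIAK", "MRSAWLAR"),  # 5 substitutions
]
print(median_mutations(toy_pairs))  # prints 2
```

If generated sequences could differ in length from their inputs, a pairwise alignment step would be needed before counting.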
Conclusion: This proof of concept demonstrates that transferring a desired protein property from one set of proteins is feasible. Availability and implementation: The source code of ThermalProGAN is freely available at https://github.com/markliou/ThermalProGAN/ under an MIT license. The website is https://thermalprogan.markliou.tw:433. Supplementary information: Supplementary data are available on GitHub.
About the journal:
The Journal of Bioinformatics and Computational Biology aims to publish high-quality, original research articles, expository tutorial papers and review papers, as well as short, critical comments on technical issues associated with the analysis of cellular information.
The research papers will be technical presentations of new assertions, discoveries and tools, intended for a narrower specialist community. The tutorials, reviews and critical commentary will be targeted at a broader readership: biologists who are interested in using computers but are not knowledgeable about scientific computing and, equally, computer scientists who have an interest in biology but are familiar with neither its current thrusts nor its language. Such carefully chosen tutorials and articles should greatly accelerate the entry of these new creative scientists into the field.