蛋白质稳定性模型无法捕捉双点突变的表观相互作用。

IF 4.5 3区 生物学 Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Protein Science Pub Date : 2025-01-01 DOI:10.1002/pro.70003
Henry Dieckhaus, Brian Kuhlman
{"title":"蛋白质稳定性模型无法捕捉双点突变的表观相互作用。","authors":"Henry Dieckhaus, Brian Kuhlman","doi":"10.1002/pro.70003","DOIUrl":null,"url":null,"abstract":"<p><p>There is strong interest in accurate methods for predicting changes in protein stability resulting from amino acid mutations to the protein sequence. Recombinant proteins must often be stabilized to be used as therapeutics or reagents, and destabilizing mutations are implicated in a variety of diseases. Due to increased data availability and improved modeling techniques, recent studies have shown advancements in predicting changes in protein stability when a single-point mutation is made. Less focus has been directed toward predicting changes in protein stability when there are two or more mutations. Here, we analyze the largest available dataset of double point mutation stability and benchmark several widely used protein stability models on this and other datasets. We find that additive models of protein stability perform surprisingly well on this task, achieving similar performance to comparable non-additive predictors according to most metrics. Accordingly, we find that neither artificial intelligence-based nor physics-based protein stability models consistently capture epistatic interactions between single mutations. We observe one notable deviation from this trend, which is that epistasis-aware models provide marginally better predictions than additive models on stabilizing double point mutations. We develop an extension of the ThermoMPNN framework for double mutant modeling, as well as a novel data augmentation scheme, which mitigates some of the limitations in currently available datasets. Collectively, our findings indicate that current protein stability models fail to capture the nuanced epistatic interactions between concurrent mutations due to several factors, including training dataset limitations and insufficient model sensitivity.</p>","PeriodicalId":20761,"journal":{"name":"Protein Science","volume":"34 1","pages":"e70003"},"PeriodicalIF":4.5000,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11659742/pdf/","citationCount":"0","resultStr":"{\"title\":\"Protein stability models fail to capture epistatic interactions of double point mutations.\",\"authors\":\"Henry Dieckhaus, Brian Kuhlman\",\"doi\":\"10.1002/pro.70003\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>There is strong interest in accurate methods for predicting changes in protein stability resulting from amino acid mutations to the protein sequence. Recombinant proteins must often be stabilized to be used as therapeutics or reagents, and destabilizing mutations are implicated in a variety of diseases. Due to increased data availability and improved modeling techniques, recent studies have shown advancements in predicting changes in protein stability when a single-point mutation is made. Less focus has been directed toward predicting changes in protein stability when there are two or more mutations. Here, we analyze the largest available dataset of double point mutation stability and benchmark several widely used protein stability models on this and other datasets. We find that additive models of protein stability perform surprisingly well on this task, achieving similar performance to comparable non-additive predictors according to most metrics. Accordingly, we find that neither artificial intelligence-based nor physics-based protein stability models consistently capture epistatic interactions between single mutations. We observe one notable deviation from this trend, which is that epistasis-aware models provide marginally better predictions than additive models on stabilizing double point mutations. We develop an extension of the ThermoMPNN framework for double mutant modeling, as well as a novel data augmentation scheme, which mitigates some of the limitations in currently available datasets. Collectively, our findings indicate that current protein stability models fail to capture the nuanced epistatic interactions between concurrent mutations due to several factors, including training dataset limitations and insufficient model sensitivity.</p>\",\"PeriodicalId\":20761,\"journal\":{\"name\":\"Protein Science\",\"volume\":\"34 1\",\"pages\":\"e70003\"},\"PeriodicalIF\":4.5000,\"publicationDate\":\"2025-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11659742/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Protein Science\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1002/pro.70003\",\"RegionNum\":3,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"BIOCHEMISTRY & MOLECULAR BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Protein Science","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1002/pro.70003","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}
引用次数: 0

摘要

有强烈的兴趣,准确的方法来预测变化的蛋白质稳定性导致氨基酸突变的蛋白质序列。重组蛋白通常必须稳定才能用作治疗药物或试剂,而不稳定突变与多种疾病有关。由于数据可用性的增加和建模技术的改进,最近的研究表明,当单点突变发生时,在预测蛋白质稳定性变化方面取得了进展。当存在两个或更多突变时,预测蛋白质稳定性变化的关注较少。在这里,我们分析了最大的双点突变稳定性数据集,并在此数据集和其他数据集上对几种广泛使用的蛋白质稳定性模型进行了基准测试。我们发现蛋白质稳定性的加性模型在这项任务上表现得非常好,根据大多数指标,与可比的非加性预测器实现了相似的性能。因此,我们发现无论是基于人工智能还是基于物理的蛋白质稳定性模型都不能一致地捕获单个突变之间的上位相互作用。我们观察到这一趋势的一个显著偏差,即上位感知模型在稳定双点突变方面提供了比加性模型稍好的预测。我们开发了用于双突变建模的ThermoMPNN框架的扩展,以及一种新的数据增强方案,该方案减轻了当前可用数据集的一些限制。总的来说,我们的研究结果表明,由于几个因素,包括训练数据集的限制和模型灵敏度不足,目前的蛋白质稳定性模型未能捕捉到并发突变之间细微的上位相互作用。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Protein stability models fail to capture epistatic interactions of double point mutations.

There is strong interest in accurate methods for predicting changes in protein stability resulting from amino acid mutations to the protein sequence. Recombinant proteins must often be stabilized to be used as therapeutics or reagents, and destabilizing mutations are implicated in a variety of diseases. Due to increased data availability and improved modeling techniques, recent studies have shown advancements in predicting changes in protein stability when a single-point mutation is made. Less focus has been directed toward predicting changes in protein stability when there are two or more mutations. Here, we analyze the largest available dataset of double point mutation stability and benchmark several widely used protein stability models on this and other datasets. We find that additive models of protein stability perform surprisingly well on this task, achieving similar performance to comparable non-additive predictors according to most metrics. Accordingly, we find that neither artificial intelligence-based nor physics-based protein stability models consistently capture epistatic interactions between single mutations. We observe one notable deviation from this trend, which is that epistasis-aware models provide marginally better predictions than additive models on stabilizing double point mutations. We develop an extension of the ThermoMPNN framework for double mutant modeling, as well as a novel data augmentation scheme, which mitigates some of the limitations in currently available datasets. Collectively, our findings indicate that current protein stability models fail to capture the nuanced epistatic interactions between concurrent mutations due to several factors, including training dataset limitations and insufficient model sensitivity.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Protein Science
Protein Science 生物-生化与分子生物学
CiteScore
12.40
自引率
1.20%
发文量
246
审稿时长
1 months
期刊介绍: Protein Science, the flagship journal of The Protein Society, is a publication that focuses on advancing fundamental knowledge in the field of protein molecules. The journal welcomes original reports and review articles that contribute to our understanding of protein function, structure, folding, design, and evolution. Additionally, Protein Science encourages papers that explore the applications of protein science in various areas such as therapeutics, protein-based biomaterials, bionanotechnology, synthetic biology, and bioelectronics. The journal accepts manuscript submissions in any suitable format for review, with the requirement of converting the manuscript to journal-style format only upon acceptance for publication. Protein Science is indexed and abstracted in numerous databases, including the Agricultural & Environmental Science Database (ProQuest), Biological Science Database (ProQuest), CAS: Chemical Abstracts Service (ACS), Embase (Elsevier), Health & Medical Collection (ProQuest), Health Research Premium Collection (ProQuest), Materials Science & Engineering Database (ProQuest), MEDLINE/PubMed (NLM), Natural Science Collection (ProQuest), and SciTech Premium Collection (ProQuest).
期刊最新文献
A functional helix shuffled variant of the B domain of Staphylococcus aureus. AFFIPred: AlphaFold2 structure-based Functional Impact Prediction of missense variations. AggNet: Advancing protein aggregation analysis through deep learning and protein language model. Allosteric modulation of NF1 GAP: Differential distributions of catalytically competent populations in loss-of-function and gain-of-function mutants. Citrullination at the N-terminal region of MDM2 by the PADI4 enzyme.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1