Incorporating physics to overcome data scarcity in predictive modeling of protein function: A case study of BK channels.

IF 4.3, Zone 2 (Biology), PLoS Computational Biology. Pub Date: 2023-09-15. eCollection Date: 2023-09-01. DOI: 10.1371/journal.pcbi.1011460
Erik Nordquist, Guohui Zhang, Shrishti Barethiya, Nathan Ji, Kelli M White, Lu Han, Zhiguang Jia, Jingyi Shi, Jianmin Cui, Jianhan Chen
Citations: 0


Incorporating physics to overcome data scarcity in predictive modeling of protein function: A case study of BK channels.

Machine learning has played transformative roles in numerous chemical and biophysical problems, such as protein folding, where large amounts of data exist. Nonetheless, many important problems remain challenging for data-driven machine learning approaches because data are scarce. One approach to overcoming data scarcity is to incorporate physical principles, for example through molecular modeling and simulation. Here, we focus on the big potassium (BK) channels, which play important roles in cardiovascular and neural systems. Many BK channel mutants are associated with various neurological and cardiovascular diseases, but their molecular effects are unknown. The voltage gating properties of BK channels have been characterized experimentally for 473 site-specific mutations over the last three decades; yet these functional data by themselves remain far too sparse to derive a predictive model of BK channel voltage gating. Using physics-based modeling, we quantify the energetic effects of all single mutations on both open and closed states of the channel. Together with dynamic properties derived from atomistic simulations, these physical descriptors allow the training of random forest models that could reproduce unseen experimentally measured shifts in gating voltage, ∆V1/2, with an RMSE ~ 32 mV and a correlation coefficient of R ~ 0.7. Importantly, the model appears capable of uncovering nontrivial physical principles underlying the gating of the channel, including a central role of hydrophobic gating. The model was further evaluated using four novel mutations of L235 and V236 on the S5 helix; mutations of these residues are predicted to have opposing effects on V1/2 and suggest a key role of S5 in mediating voltage sensor-pore coupling. The measured ∆V1/2 values agree quantitatively with the predictions for all four mutations, with a high correlation of R = 0.92 and RMSE = 18 mV. Therefore, the model can capture nontrivial voltage gating properties in regions where few mutations are known. The success of predictive modeling of BK voltage gating demonstrates the potential of combining physics and statistical learning for overcoming data scarcity in nontrivial protein function prediction.
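
The abstract summarizes model quality with two numbers: the RMSE between predicted and measured ∆V1/2, and the correlation coefficient R. As a minimal sketch of how these two metrics are commonly computed (assuming R is the Pearson correlation, which the abstract does not state explicitly), the Python below evaluates both on four paired values. The function names and the ∆V1/2 values are hypothetical and chosen only for illustration; they are not data from the paper.

```python
import math


def rmse(measured, predicted):
    """Root-mean-square error between paired values (same units as the inputs)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between paired values."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant inputs")
    return cov / math.sqrt(var_m * var_p)


# Hypothetical shifts in gating voltage (mV); NOT values from the paper.
measured_dv = [-40.0, -10.0, 15.0, 60.0]
predicted_dv = [-30.0, -20.0, 25.0, 50.0]

print(f"RMSE = {rmse(measured_dv, predicted_dv):.1f} mV")
print(f"R    = {pearson_r(measured_dv, predicted_dv):.2f}")
```

RMSE is expressed in the same units as ∆V1/2 (mV) and penalizes large individual errors, while R only measures how well the predictions track the measurements linearly, so the two are usually reported together, as in the abstract.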

Source journal
PLoS Computational Biology (Biology: Biochemical Research Methods)
CiteScore: 7.10
Self-citation rate: 4.70%
Articles published: 820
Journal description: PLOS Computational Biology features works of exceptional significance that further our understanding of living systems at all scales, from molecules and cells to patient populations and ecosystems, through the application of computational methods. Readers include life and computational scientists, who can take the important findings presented here to the next level of discovery. Research articles must be declared as belonging to a relevant section. More information about the sections can be found in the submission guidelines. Research articles should model aspects of biological systems, demonstrate both methodological and scientific novelty, and provide profound new biological insights. Generally, reliability and significance of biological discovery through computation should be validated and enriched by experimental studies. Inclusion of experimental validation is not required for publication, but should be referenced where possible. Inclusion of experimental validation of a modest biological discovery through computation does not render a manuscript suitable for PLOS Computational Biology. Research articles specifically designated as Methods papers should describe outstanding methods of exceptional importance that have been shown, or have the promise, to provide new biological insights. The method must already be widely adopted, or have the promise of wide adoption by a broad community of users. Enhancements to existing published methods will only be considered if those enhancements bring exceptional new capabilities.