Learning the local landscape of protein structures with convolutional neural networks

IF 1.8 | Zone 4, Biology | JCR Q3, Biophysics | Journal of Biological Physics | Pub Date: 2021-11-09 | DOI: 10.1007/s10867-021-09593-6
Anastasiya V. Kulikova, Daniel J. Diaz, James M. Loy, Andrew D. Ellington, Claus O. Wilke
Citations: 5

Abstract

One fundamental problem of protein biochemistry is to predict protein structure from amino acid sequence. The inverse problem, predicting either entire sequences or individual mutations that are consistent with a given protein structure, has received much less attention even though it has important applications in both protein engineering and evolutionary biology. Here, we ask whether 3D convolutional neural networks (3D CNNs) can learn the local fitness landscape of protein structure to reliably predict either the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding a site of interest. We find that the network can predict the wild type with good accuracy, and that network confidence is a reliable measure of whether a given prediction is likely to be correct. Predictions of the consensus are less accurate and are primarily driven by whether or not the consensus matches the wild type. Our work suggests that high-confidence mis-predictions of the wild type may identify sites that are primed for mutation and are likely targets for protein engineering.
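
The idea of using network confidence to single out high-confidence mis-predictions can be illustrated with a small sketch. This is not the authors' code: it assumes a classifier has already produced, for each site, a probability for each of the 20 amino acids, takes the largest probability as the "confidence", and flags sites where the top prediction differs from the wild type at or above a cutoff. The function names, the toy probabilities, and the 0.8 cutoff are all hypothetical choices made for illustration.

```python
from typing import NamedTuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues, one-letter codes


class SiteCall(NamedTuple):
    position: int      # 1-based site index
    wild_type: str     # residue observed in the structure
    predicted: str     # residue with the highest predicted probability
    confidence: float  # that highest probability
    correct: bool      # True if predicted == wild_type


def call_sites(wild_type_seq, site_probs):
    """Turn per-site class probabilities into predictions with confidences."""
    calls = []
    for pos, (wt, probs) in enumerate(zip(wild_type_seq, site_probs), start=1):
        # Predicted residue = most probable class; confidence = its probability.
        best = max(range(len(AMINO_ACIDS)), key=lambda i: probs[i])
        pred = AMINO_ACIDS[best]
        calls.append(SiteCall(pos, wt, pred, probs[best], pred == wt))
    return calls


def high_confidence_mispredictions(calls, threshold=0.8):
    """Sites where the prediction disagrees with wild type at high confidence.

    The threshold is an assumed, illustrative value, not one from the paper.
    """
    return [c for c in calls if not c.correct and c.confidence >= threshold]


def peaked(residue, p):
    """Toy distribution: probability p on `residue`, the rest spread evenly."""
    rest = (1.0 - p) / (len(AMINO_ACIDS) - 1)
    return [p if aa == residue else rest for aa in AMINO_ACIDS]


# Made-up probabilities standing in for the output of a trained network.
wild_type = "MKV"
probs = [peaked("M", 0.95), peaked("R", 0.90), peaked("L", 0.40)]

calls = call_sites(wild_type, probs)
flagged = high_confidence_mispredictions(calls)
for c in flagged:
    print(f"site {c.position}: wild type {c.wild_type}, "
          f"predicted {c.predicted} ({c.confidence:.2f})")
```

In this toy input only site 2 is flagged: site 1 is predicted correctly, and site 3 is mis-predicted but with low confidence. In the abstract's terms, flagged sites of this kind are the candidates that may be primed for mutation.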

Source journal: Journal of Biological Physics (Biology, Biophysics)
CiteScore: 3.00
Self-citation rate: 5.60%
Articles published: 20
Review time: >12 weeks
Journal description: Many physicists are turning their attention to domains that were not traditionally part of physics and are applying the sophisticated tools of theoretical, computational and experimental physics to investigate biological processes, systems and materials. The Journal of Biological Physics provides a medium where this growing community of scientists can publish its results and discuss its aims and methods. It welcomes papers which use the tools of physics in an innovative way to study biological problems, as well as research aimed at providing a better understanding of the physical principles underlying biological processes.