Accurate and efficient structure-based computational mutagenesis for modeling fluorescence levels of Aequorea victoria green fluorescent protein mutants.

IF 2.6 | Zone 4, Biology | Q3, Biochemistry & Molecular Biology | Protein Engineering Design & Selection | Pub Date: 2020-09-14 | DOI: 10.1093/protein/gzaa022
Majid Masso
Citations: 1

Abstract

A computational mutagenesis technique was used to characterize the structural effects associated with over 46 000 single and multiple amino acid variants of Aequorea victoria green fluorescent protein (GFP), whose functional effects (fluorescence levels) were recently measured by experimental researchers. For each GFP mutant, the approach generated a single score reflecting the overall change in sequence-structure compatibility relative to native GFP, as well as a vector of environmental perturbation (EP) scores characterizing the impact at all GFP residue positions. A significant GFP structure-function relationship (P < 0.0001) was elucidated by comparing the sequence-structure compatibility scores with the functional data. Next, the computed vectors for GFP mutants were used to train predictive models of fluorescence by implementing random forest (RF) classification and tree regression machine learning algorithms. Classification performance reached 0.93 for sensitivity, 0.91 for precision and 0.90 for balanced accuracy, and regression models led to Pearson's correlation as high as r = 0.83 between experimental and predicted GFP mutant fluorescence. An RF model trained on a subset of over 1000 experimental single residue GFP mutants with measured fluorescence was used for predicting the 3300 remaining unstudied single residue mutants, with results complementing known GFP biochemical and biophysical properties. In addition, models trained on the subset of experimental GFP mutants harboring multiple residue replacements successfully predicted fluorescence of the single residue GFP mutants. The models developed for this study were accurate and efficient, and their predictions outperformed those of several related state-of-the-art methods.
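
The performance measures quoted in the abstract (sensitivity, precision, balanced accuracy, and Pearson's r) can be illustrated with a minimal Python sketch. It assumes binary labels in which 1 marks the positive class; which fluorescence class the study treats as positive is not stated here. The function names and the toy data are hypothetical and are not taken from the study.

```python
from math import sqrt

def classification_metrics(y_true, y_pred):
    """Return (sensitivity, precision, balanced accuracy) for binary labels.

    Assumes 1 = positive class, 0 = negative class, and that both classes
    occur in y_true and at least one positive prediction is made.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    sensitivity = tp / (tp + fn)   # fraction of actual positives recovered
    specificity = tn / (tn + fp)   # fraction of actual negatives recovered
    precision = tp / (tp + fp)     # fraction of positive calls that are correct
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, precision, balanced_accuracy

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Toy labels (illustrative only): 4 positive and 4 negative mutants.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sens, prec, bal_acc = classification_metrics(toy_true, toy_pred)

# Toy regression values (illustrative only): perfectly linear relationship.
toy_measured = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(toy_measured, toy_predicted)
```

Balanced accuracy averages sensitivity and specificity, so it is less affected than plain accuracy when one class is much larger than the other.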

Source journal
Protein Engineering Design & Selection (Biology: Biochemistry & Molecular Biology)
CiteScore: 3.30
Self-citation rate: 4.20%
Articles published: 14
Review time: 6-12 weeks
About the journal: Protein Engineering, Design and Selection (PEDS) publishes high-quality research papers and review articles relevant to the engineering, design and selection of proteins for use in biotechnology and therapy, and for understanding the fundamental link between protein sequence, structure, dynamics, function, and evolution.