AlzDiscovery: A computational tool to identify Alzheimer's disease‐causing missense mutations using protein structure information

Protein Science (IF 4.5; Zone 3, Biology; JCR Q1, Biochemistry & Molecular Biology) Pub Date: 2024-09-14 DOI: 10.1002/pro.5147
Qisheng Pan, Georgina Becerra Parra, Yoochan Myung, Stephanie Portelli, Thanh Binh Nguyen, David B. Ascher
Citations: 0

Abstract

Alzheimer's disease (AD) is one of the most common forms of dementia and neurodegenerative disease, characterized by the formation of neuritic plaques and neurofibrillary tangles. Many different proteins participate in this complicated pathogenic mechanism, and missense mutations can alter the folding and function of these proteins, significantly increasing the risk of AD. However, many methods for identifying AD‐causing variants do not consider the effect of mutations in the context of the protein's three‐dimensional environment. Here, we present a machine learning‐based analysis that distinguishes AD‐causing mutations from their benign counterparts in 21 AD‐related proteins, leveraging both sequence‐ and structure‐based features. Using computational tools to estimate the effect of mutations on protein stability, we first observed that pathogenic mutations in familial AD‐related proteins are biased toward significantly destabilizing effects. Building on this insight, we built a generic predictive model and improved its performance by tuning the sample weights during training. Our final model achieved an area under the receiver operating characteristic curve of up to 0.95 in the blind test and 0.70 in an independent clinical validation, outperforming all state‐of‐the‐art methods. Feature interpretation indicated that the hydrophobic environment and polar interaction contacts were crucial to determining the pathogenic phenotype of missense mutations. Finally, we present a user‐friendly web server, AlzDiscovery, for researchers to browse the predicted phenotypes of all possible missense mutations in these 21 AD‐related proteins. Our study will be a valuable resource for AD screening and the development of personalized treatment.
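
The abstract names two generic ingredients, sample weights during training and the area under the receiver operating characteristic curve, without giving the weighting scheme or the classifier. The sketch below is only an illustration of those two ideas on made-up data: it assumes inverse class-frequency weights (one common choice, not necessarily the authors') and computes the AUC as the fraction of pathogenic/benign pairs that are ranked correctly. The function names, labels, and scores are all hypothetical.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Weight each sample by n / (k * class_count), so rarer classes count more.

    This inverse-frequency scheme is an assumption for illustration only.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return [n / (k * counts[y]) for y in labels]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = pathogenic, 0 = benign; scores are model outputs.
labels = [1, 0, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.1, 0.5, 0.3]

weights = balanced_sample_weights(labels)
auc = roc_auc(labels, scores)
print(weights)  # [1.5, 0.75, 0.75, 0.75, 1.5, 0.75]
print(auc)      # 0.875
```

In this toy set the two pathogenic samples each receive twice the weight of a benign sample, so both classes contribute equally in total. A real pipeline would pass such weights to the training routine of whatever classifier is used.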
Source journal
Protein Science (Biology: Biochemistry and Molecular Biology)
CiteScore: 12.40
Self-citation rate: 1.20%
Articles published: 246
Review time: 1 month
About the journal: Protein Science, the flagship journal of The Protein Society, is a publication that focuses on advancing fundamental knowledge in the field of protein molecules. The journal welcomes original reports and review articles that contribute to our understanding of protein function, structure, folding, design, and evolution. Additionally, Protein Science encourages papers that explore the applications of protein science in various areas such as therapeutics, protein-based biomaterials, bionanotechnology, synthetic biology, and bioelectronics. The journal accepts manuscript submissions in any suitable format for review, with the requirement of converting the manuscript to journal-style format only upon acceptance for publication. Protein Science is indexed and abstracted in numerous databases, including the Agricultural & Environmental Science Database (ProQuest), Biological Science Database (ProQuest), CAS: Chemical Abstracts Service (ACS), Embase (Elsevier), Health & Medical Collection (ProQuest), Health Research Premium Collection (ProQuest), Materials Science & Engineering Database (ProQuest), MEDLINE/PubMed (NLM), Natural Science Collection (ProQuest), and SciTech Premium Collection (ProQuest).