HPMPdb: A machine learning-ready database of protein molecular phenotypes associated to human missense variants

Current Research in Structural Biology (IF 2.7, JCR Q3, Biochemistry & Molecular Biology), Volume 4, Pages 167-174. Pub Date: 2022-01-01. DOI: 10.1016/j.crstbi.2022.04.004
Daniele Raimondi, Francesco Codicè, Gabriele Orlando, Joost Schymkowitz, Frederic Rousseau, Yves Moreau
Citations: 1

Abstract


HPMPdb: A machine learning-ready database of protein molecular phenotypes associated to human missense variants

Current human Single Amino acid Variant (SAV) databases provide a link between SAVs and their effect on the phenotype of the carrier individual, often dividing them into Deleterious/Neutral variants. This is a very coarse-grained description of the genotype-to-phenotype relationship because it relies on unrealistic assumptions, such as the perfect Mendelian behavior of each SAV, and considers only dichotomous phenotypes. Moreover, the link between the effect of a SAV on a protein (its molecular phenotype) and the individual phenotype is often very complex, because multiple levels of biological abstraction connect the protein-level and individual-level phenotypes. Here we present HPMPdb, a manually curated database containing human SAVs associated with a detailed description of the molecular phenotype they cause on the affected proteins. With particular regard to machine learning (ML), this database lets researchers go beyond the existing Deleterious/Neutral prediction paradigm and build molecular phenotype predictors instead. Our class labels succinctly describe the effects that each SAV has on 15 protein molecular phenotypes, such as protein-protein interaction, small-molecule binding, function, post-translational modifications (PTMs), sub-cellular localization, mimetic PTM, folding and protein expression. Moreover, we provide researchers with all the necessary means to reproducibly train and test their models on our database. The webserver and the data described in this paper are available at hpmp.esat.kuleuven.be.
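To make the multi-label idea in the abstract concrete, here is a minimal Python sketch of how per-variant phenotype labels could be encoded and how a reproducible train/test split could be derived. Everything in it is an assumption for illustration: the record layout, the protein IDs and variants, the function names `encode_labels` and `assign_split`, and the reduction of each label to a binary "affected / not affected" flag are hypothetical and are not taken from HPMPdb or its files. Only the eight phenotype names listed in the abstract are used; the database defines 15.

```python
import hashlib

# Hypothetical label vocabulary: only the phenotypes named in the abstract.
# HPMPdb defines 15 phenotypes; the others are not listed in this text.
PHENOTYPES = [
    "protein-protein interaction",
    "small-molecule binding",
    "function",
    "post-translational modification",
    "sub-cellular localization",
    "mimetic PTM",
    "folding",
    "protein expression",
]


def encode_labels(affected):
    """Turn a set of affected phenotype names into a 0/1 vector."""
    unknown = set(affected) - set(PHENOTYPES)
    if unknown:
        raise ValueError(f"unknown phenotype(s): {sorted(unknown)}")
    return [1 if name in affected else 0 for name in PHENOTYPES]


def assign_split(protein_id, test_fraction=0.2):
    """Deterministically assign a protein to 'train' or 'test'.

    Hashing the protein ID (not the variant) keeps all variants of one
    protein on the same side of the split and gives the same answer on
    every run and every machine.
    """
    digest = hashlib.sha256(protein_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return "test" if bucket < test_fraction else "train"


# Made-up records: (protein ID, variant, affected phenotypes).
records = [
    ("PROT_A", "A10V", {"function", "folding"}),
    ("PROT_A", "G25S", {"function"}),
    ("PROT_B", "L7P", {"protein-protein interaction"}),
]

dataset = [
    (protein, variant, encode_labels(labels), assign_split(protein))
    for protein, variant, labels in records
]
```

Splitting by protein rather than by variant is one common way to avoid leaking information between training and test sets; whether HPMPdb's own evaluation protocol does this is not stated in the abstract.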

Source journal metrics: CiteScore 4.60; self-citation rate 0.00%; articles published 33; review time 104 days