Creation of a Curated Database of Experimentally Determined Human Protein Structures for the Identification of Its Targetome.

Armand Ovanessians, Carson Snow, Thomas Jennewein, Susanta Sarkar, Gil Speyer, Judith Klein-Seetharaman
Pacific Symposium on Biocomputing, vol. 29, pp. 291-305. Published 2024-01-01.

Abstract

Assembling an "integrated structural map of the human cell" at atomic resolution will require two things: a complete set of all human protein structures available for interaction with other biomolecules (the human protein structure targetome), and a pipeline of automated tools for quantitative analysis of millions of protein-ligand interactions. Toward this goal, we describe the creation of a curated database of experimentally determined human protein structures. Starting with the sequences of 20,422 human proteins, we selected the most representative structure for each protein (if available) from the Protein Data Bank (PDB), ranking structures by sequence coverage, depth (the difference between the final and initial residue number of each chain), resolution, and the experimental method used to determine the structure. To enable expansion into an entire human targetome, we docked small-molecule ligands to our curated set of protein structures. Using design constraints derived from comparing structure assembly and ligand docking results for challenging protein examples, we propose to combine this curated database of experimental structures with AlphaFold predictions and multi-domain assembly using DEMO2 in the future. To demonstrate the utility of our curated database in identifying the human protein structure targetome, we performed docking with AutoDock Vina and created tools for automated analysis of the affinities and binding site locations in the thousands of resulting protein-ligand predictions. The resulting human targetome, which can be updated and expanded as the curated database evolves and the number of ligands grows, is a valuable addition to the growing toolkit of structural bioinformatics.
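
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract names four ranking criteria (coverage, depth, resolution, experimental method) but does not say how they are combined, so the sketch assumes a lexicographic ordering in the order listed. The `Candidate` record, the `METHOD_RANK` preference table, and the placeholder identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed preference order for experimental methods (lower is better).
# This table is an illustration only; the abstract does not specify it.
METHOD_RANK = {
    "X-RAY DIFFRACTION": 0,
    "ELECTRON MICROSCOPY": 1,
    "SOLUTION NMR": 2,
}


@dataclass
class Candidate:
    """One PDB chain that maps to the target protein (hypothetical record)."""
    pdb_id: str
    chain: str
    first_residue: int          # initial residue number of the chain
    last_residue: int           # final residue number of the chain
    resolved_residues: int      # target residues actually observed
    resolution: Optional[float]  # angstroms; None if not reported
    method: str


def depth(c: Candidate) -> int:
    # Depth as defined in the abstract: final minus initial residue number.
    return c.last_residue - c.first_residue


def coverage(c: Candidate, seq_len: int) -> float:
    # Fraction of the full-length sequence covered by the structure.
    return c.resolved_residues / seq_len


def sort_key(c: Candidate, seq_len: int):
    # Assumed lexicographic priority: higher coverage, then higher depth,
    # then lower (better) resolution, then preferred method.
    res = c.resolution if c.resolution is not None else float("inf")
    method = METHOD_RANK.get(c.method, len(METHOD_RANK))
    return (-coverage(c, seq_len), -depth(c), res, method)


def pick_representative(cands: List[Candidate], seq_len: int) -> Optional[Candidate]:
    # Returns None when a protein has no experimental structure.
    if not cands:
        return None
    return min(cands, key=lambda c: sort_key(c, seq_len))


if __name__ == "__main__":
    # Placeholder identifiers, not real PDB entries.
    candidates = [
        Candidate("AAAA", "A", 1, 200, 200, 1.5, "X-RAY DIFFRACTION"),
        Candidate("BBBB", "A", 10, 260, 250, 3.0, "ELECTRON MICROSCOPY"),
        Candidate("CCCC", "A", 10, 260, 250, 2.0, "X-RAY DIFFRACTION"),
    ]
    best = pick_representative(candidates, seq_len=300)
    print(best.pdb_id if best else "no structure")
```

A tuple key keeps the priority order explicit and easy to change. A weighted score would be an equally plausible reading of the abstract.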
