Duplicate entries in the Protein Data Bank: how to detect and handle them.

Acta Crystallographica. Section D, Structural Biology | Pub Date: 2025-04-01 | DOI: 10.1107/S2059798325001883 | IF 2.6 | Zone 4 (Biology) | JCR Q2 (Biochemical Research Methods)
Alexander Wlodawer, Zbigniew Dauter, Pawel Rubach, Wladek Minor, Mariusz Jaskolski, Ziqiu Jiang, William Jeffcott, Olga Anosova, Vitaliy Kurlin

Abstract

A global analysis of protein crystal structures in the Protein Data Bank (PDB) using a newly developed computational approach reveals many pairs with (nearly) identical main-chain coordinates. Such cases are identified and analyzed, showing that duplication is possible since the PDB does not currently have tools or mechanisms that would detect potentially duplicate submissions. Some duplicated entries represent modeling efforts of ligand binding that masquerade as experimentally determined structures. We propose that duplicate entries should either be obsoleted by the PDB or, as a minimum, marked with a clear 'CAVEAT' record that would alert potential users to the presence of such problems. We also suggest that using a tool for verifying the uniqueness of the deposited structure, such as that presented in this work, should become part of the routine validation procedure for new depositions.
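
The idea of flagging pairs with (nearly) identical main-chain coordinates can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not describe. It is a simple stand-in that compares all interatomic distances of two equally long lists of main-chain atoms, so that a copy which has merely been shifted or rotated is still recognised. The function names, the sample coordinates, and the 0.01 Å tolerance are all assumptions made for illustration.

```python
import math
from itertools import combinations
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def max_distance_deviation(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Largest absolute difference between corresponding interatomic distances.

    Distances do not change under translation or rotation, so a rigidly
    moved copy of a chain gives a value close to zero. (Mirror images
    also give zero; this sketch does not distinguish them.)
    """
    if len(a) != len(b):
        raise ValueError("both chains must have the same number of atoms")
    worst = 0.0
    for i, j in combinations(range(len(a)), 2):
        diff = abs(math.dist(a[i], a[j]) - math.dist(b[i], b[j]))
        worst = max(worst, diff)
    return worst


def is_near_duplicate(a: Sequence[Coord], b: Sequence[Coord], tol: float = 0.01) -> bool:
    """Flag two chains as near-duplicates (tol in angstroms is an assumed cutoff)."""
    return len(a) == len(b) and max_distance_deviation(a, b) <= tol


# Hypothetical main-chain fragments (coordinates in angstroms, invented for the example).
chain_a = [(0.0, 0.0, 0.0), (1.46, 0.0, 0.0), (2.0, 1.4, 0.0), (3.3, 1.5, 0.5)]
# The same fragment translated as a rigid body.
chain_b = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in chain_a]
# The same fragment with the last atom moved by 0.5 angstroms.
chain_c = chain_a[:-1] + [(3.3, 1.5, 1.0)]

print(is_near_duplicate(chain_a, chain_b))
print(is_near_duplicate(chain_a, chain_c))
```

The all-pairs comparison costs quadratic time in the number of atoms, which is acceptable for a single pair of chains but would need a faster invariant or a pre-filter for a scan across a whole archive.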
