Stephanie A. Wankowicz, James S. Fraser, Z.-J. Liu (Editor)
IUCrJ, Volume 11, Issue 4, Pages 494-501. Published 2024-07-01. DOI: 10.1107/S2052252524005098
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11220883/pdf/
Citations: 0
Comprehensive encoding of conformational and compositional protein structural ensembles through the mmCIF data structure
Traditional structural models of biomolecules typically represent only a single conformational state, even though biomolecules naturally exist in multiple states crucial for their function. Here, we propose enhancements to the macromolecular crystallographic information file (mmCIF) to better capture the complex conformational and compositional heterogeneity of biomolecules in a form that is human- and machine-interpretable.
In the folded state, biomolecules exchange between multiple conformational states crucial for their function. However, most structural models derived from experiments and computational predictions only encode a single state. To represent biomolecules accurately, we must move towards modeling and predicting structural ensembles. Information about structural ensembles exists within experimental data from X-ray crystallography and cryo-electron microscopy. Although new tools are available to detect conformational and compositional heterogeneity within these ensembles, the legacy PDB data structure does not robustly encapsulate this complexity. We propose modifications to the macromolecular crystallographic information file (mmCIF) to improve the representation and interrelation of conformational and compositional heterogeneity. These modifications will enable the capture of macromolecular ensembles in a human- and machine-interpretable way, potentially catalyzing breakthroughs for ensemble–function predictions, analogous to the achievements of AlphaFold with single-structure prediction.
About the journal:
IUCrJ is a new fully open-access peer-reviewed journal from the International Union of Crystallography (IUCr).
The journal will publish high-profile articles on all aspects of the sciences and technologies supported by the IUCr via its commissions, including emerging fields where structural results underpin the science reported in the article. Our aim is to make IUCrJ the natural home for high-quality structural science results. Chemists, biologists, physicists and material scientists will be actively encouraged to report their structural studies in IUCrJ.