Pub Date: 2026-02-01; Epub Date: 2026-01-26; DOI: 10.1107/S2059798326000021
Melanie Vollmar, Simon Westrip, Sreenath Nair, Balakumaran Balasubramaniyan, Sameer Velankar, Louise Jones, Peter Strickland
Protein structures are crucial in understanding the function, mechanism and disease-causing variants of proteins within any living cell. A number of experimental techniques are employed by researchers to determine such structures. Through structure inspection in molecular viewers, combined with supporting biochemical and biophysical experiments, scientists are able to identify the function of a protein, its reaction mechanism and effects caused by sequence variation. These detailed findings, supported by experimental results, are documented by being described in the scientific literature and by making the accompanying data open source. However, it has become increasingly difficult for a reader, in particular a non-expert, to access the correct additional information and assess the validity of the conclusions drawn based on experimental results. A reader is often required to resort to a number of different software packages to access the different data types. Here, we present a first-of-its-kind implementation of an artificial intelligence- and text-mining-supported software tool that allows the association of mentions in the text of one or more specific protein residues with their corresponding counterparts in the respective protein structure or structures. Our application allows a researcher to explore a residue of interest in the context of a publication and its respective protein structure, supported by its experimental evidence, in a single view. We describe model implementation, annotation extraction, downstream processing, dissemination and visualization at the IUCr and PDBe. The application presented is primarily aimed at readers of IUCr publications and users visiting the PDBe entry pages. However, we believe that in the future our application will be a valuable tool for reviewers of new submissions to IUCr journals and may even be useful as a curation tool involving the authors of a publication as annotation validators.
Title: Associating protein residues in the literature with structural data. Journal: Acta Crystallographica. Section D, Structural Biology, pp. 113-125. Published: 2026-02-01. Type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12865887/pdf/
Pub Date: 2026-02-01; Epub Date: 2026-01-19; DOI: 10.1107/S2059798325011611
Alexandre G Urzhumtsev
In cryo-electron microscopy, a set of two-dimensional projections collected from different viewing directions may complicate image processing and subsequent model building if the distribution of these views is non-uniform. View distributions are traditionally represented as color-coded two-dimensional diagrams. However, such diagrams can introduce distortions and are cumbersome to manipulate, store and compare across data sets; they do not provide a commonly accepted quantitative measure of uniformity. In this work, we propose a method to characterize these angular distributions quantitatively and to represent them as simple one-dimensional curves rather than two-dimensional colored diagrams. The suggested measures could be incorporated into databases such as EMDB to provide a compact, standardized description of the angular distribution of particle views.
Title: Quantifying distributions of cryo-EM projections. Journal: Acta Crystallographica. Section D, Structural Biology, pp. 100-112. Published: 2026-02-01. Type: Journal Article.
Pub Date: 2026-02-01; Epub Date: 2026-01-08; DOI: 10.1107/S2059798325011477
Alexander Wlodawer, Pawel Rubach, Zbigniew Dauter, Wojciech Dec, Dariusz Brzezinski, Wladek Minor, Mariusz Jaskolski
It is postulated that the PDB should use the CAVEAT record more prominently to warn scientists using its archives of potential risks and errors.
Title: How to mitigate the caveat emptor burden of human and machine users of the Protein Data Bank. Journal: Acta Crystallographica. Section D, Structural Biology, pp. 62-64. Published: 2026-02-01. Type: Journal Article.
Pub Date: 2026-02-01; Epub Date: 2026-01-19; DOI: 10.1107/S2059798325011556
Heidar J Koning, Anuradha Pullakhandam, Andrew E Whitten, Charles S Bond, Michel Peyrard
SAXS studies of four 60-base-pair DNA duplexes with sequences closely related to part of the GAGE6 (G-antigen 6) promoter have been performed to study the role of DNA conformations in solution and their potential relationship to DNA-protein binding. We show that the SAXS data can be analysed using a simple polymer model, which nevertheless quantitatively describes the average persistence length and torsional rigidity of the DNA double helix, to determine the statistical distribution of local conformations of the DNA in solution to high accuracy. Although the SAXS data are averaged over time and all spatial orientations of the molecules, for sequences which have some asymmetry in the data we show that the conformations can be oriented with respect to the sequence. This allows specific features detected by the analysis to be precisely related to the DNA sequence, opening up new opportunities for SAXS to investigate the properties of DNA in solution. The biological implications of these results are discussed.
Title: Probing the statistics of sequence-dependent DNA conformations in solution using SAXS. Journal: Acta Crystallographica. Section D, Structural Biology, pp. 79-99. Published: 2026-02-01. Type: Journal Article.
Pub Date: 2026-01-01; DOI: 10.1107/S2059798325011258
Alaa Shaikhqasem, Farzad Hamdi, Lisa Machner, Christoph Parthier, Constanze Breithaupt, Fotis L Kyrilis, Stephan M Feller, Panagiotis L Kastritis, Milton T Stubbs
While 3D electron diffraction (3D-ED or microcrystal electron diffraction; MicroED) has emerged as a promising method for protein structure determination, its applicability is hindered by a high susceptibility to radiation damage, leading to a decreasing signal-to-noise ratio in consecutive diffraction patterns that limits the quality (resolution and redundancy) of the data. In addition, data completeness may be restricted due to the geometrical limitations of current sample holders and stages. Although specialized equipment can overcome these challenges, many laboratories do not have access to such instrumentation. In this work, we introduce an approach that addresses these issues using a commonly available 200 keV cryo-electron microscope. The multi-position acquisition technique that we present here combines (i) multiple data acquisitions from a single crystal over several tilt ranges and (ii) merging data from a small number of crystals each tilted about a different axis. The robustness of this approach is demonstrated by the de novo elucidation of a protein-peptide complex structure from only two orthorhombic microcrystals.
Title: Strategies for mitigating radiation damage and improving data completeness in 3D electron diffraction of protein crystals. Journal: Acta Crystallographica. Section D, Structural Biology, pp. 11-22. Published: 2026-01-01. Type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12809435/pdf/
Pub Date: 2026-01-01; DOI: 10.1107/S205979832501023X
Hui Bao, Di Jiang, Chengbing Tang, Zhijie Yin, Xiaofang Huang, Rao Fu, Yang Zhang, Zhonghan Li, Shiqian Qi, Haoyang Cai, Dan Tang
Echinacea purpurea hydroxycinnamoyl-CoA:tartaric acid hydroxycinnamoyl transferase (EpHTT) is a cytosolic BAHD acyltransferase that catalyzes the transfer of caffeoyl groups to tartaric acid, a key step in chicoric acid biosynthesis. Understanding the structure of EpHTT is essential to elucidate the molecular basis of substrate recognition and catalytic specificity. Here, we report the crystal structure of apo-form EpHTT at 2.38 Å resolution, revealing a compact, globular architecture typical of the BAHD superfamily. The enzyme adopts a two-domain fold with conserved HXXXD and DFGWG motifs, forming a V-shaped catalytic cleft characteristic of BAHD acyltransferases. Structural comparison with homologous hydroxycinnamoyl transferase enzymes shows high conservation of the overall fold, while EpHTT exhibits unique adaptations that confer specificity for tartaric acid. These results provide a molecular framework for understanding the function and substrate specificity of HTT, offering insights for the metabolic engineering of chicoric acid production.
Title: The crystal structure of EpHTT, a hydroxycinnamoyl transferase from Echinacea purpurea. Journal: Acta Crystallographica. Section D, Structural Biology, pp. 43-52. Published: 2026-01-01. Type: Journal Article.
Pub Date: 2025-12-01; Epub Date: 2025-11-19; DOI: 10.1107/S2059798325009350
Wai Shing Tang, Jeff Soules, Aaditya Rangan, Pilar Cossio
Extracting conformational heterogeneity from cryo-electron microscopy (cryo-EM) images is particularly challenging for flexible biomolecules, where traditional 3D classification approaches often fail. Over the past few decades, advancements in experimental and computational techniques have been made to tackle this challenge, especially Bayesian-based approaches that provide physically interpretable insights into cryo-EM heterogeneity. To reduce the computational cost for Bayesian approaches, and building upon previously developed Fourier-Bessel image-representation methods, we created CryoLike, computationally efficient software for evaluating image-to-structure (or image-to-volume) likelihoods across large image data sets, packaged in a user-friendly Python workflow.
Title: CryoLike: a Python package for cryo-electron microscopy image-to-structure likelihood calculations. Journal: Acta Crystallographica. Section D, Structural Biology, pp. 660-667. Published: 2025-12-01. Type: Journal Article.
Pub Date: 2025-12-01; Epub Date: 2025-11-25; DOI: 10.1107/S2059798325009520
Khadiza Begam, Zachary Morgan, Dean A A Myles, Jens Glaser
Many fundamental biological processes, including those in photosynthetic reaction centers and enzyme active sites, involve charge and energy transfer, bond cleavage, protonation and hydrogen bonding. Because H atoms play such central roles in these reactions, accurately determining their positions is essential. Yet, conventional X-ray crystallography primarily resolves the heavy atoms in biological structures and provides limited insight into hydrogen, even at atomic resolution. Neutron macromolecular crystallography (NMC) overcomes this limitation by offering exceptional sensitivity to hydrogen and deuterium. Here, we present a theoretical framework for the development of dynamic nuclear polarization NMC (DNP-NMC) techniques, which exploit the alignment of neutron and proton nuclear spins to enhance and tune the hydrogen signal contribution. The DNP-NMC approach advances the resolution of H atoms within biomolecular crystals, whether bound to protein residues or present in solvent. The method establishes key relationships for the coherent structure factor of polarized neutron scattering from hydrogenous matter. It theoretically achieves full accuracy in phase reconstruction and offers a path to improve neutron structure determination, achieving accuracies exceeding ≳80% by incorporating titration states. Using a variant of the hybrid input/output phase-retrieval algorithm, it allows recovery of the hydrogen density with ≳90% phase accuracy. We further discuss sources of experimental uncertainty for the upcoming DNP-enabled, quasi-Laue IMAGINE-X experiment at Oak Ridge National Laboratory's High Flux Isotope Reactor.
Title: Hydrogen density mapping in biomolecular crystals through dynamic nuclear polarization. Journal: Acta Crystallographica. Section D, Structural Biology, pp. 758-768. Published: 2025-12-01. Type: Journal Article.
More than 80% of protein structure models in the Protein Data Bank have been solved using X-ray crystallography. Despite continuous improvements in this experimental technique, crystallographic structure models may still present artifacts related to the crystallization process as well as errors introduced during model building and refinement, even in high-resolution cases. Such limitations can alter atomic or residue positions, leading to local misconformations, local or domain rearrangements, and occasionally global distortions. In this study, we developed a protocol to locate residues with questionable conformations, where conformations may be uncertain, atypical or influenced by crystallographic modeling and refinement. To do so, we started from a set of 826 nonredundant X-ray protein structure models. Each X-ray model underwent an energy-minimization step that relaxes atomic geometry by reducing potential energy. Residues that exhibited different local conformations between the X-ray and minimized models were therefore considered as having questionable conformations. To identify them, we compared the X-ray and minimized models of each protein using the HMM-SA structural alphabet. Our results revealed that over 18% of the residues in the protein set have questionable conformations in their backbone. These conformations can occur either as isolated events within the protein sequence or can form patterns. Moreover, we observed that the frequency of questionable conformations per X-ray model was independent of factors such as the date of deposition, resolution or crystal system. Analysis of the properties of residues associated with questionable conformations revealed that they do not specifically occur in flexible or accessible regions. However, there is a correlation between questionable conformations and secondary structures, with a particular overrepresentation of residues with questionable conformations in α-helices. 
We then further investigated questionable conformations in the structure model of ligand-free HIV-2 protease (PR2). By combining our questionable conformation-detection protocol with molecular-dynamics simulation, we demonstrated that approximately half of the questionable conformations in PDB entry 1hsi correspond to local conformations that are sparsely sampled by PR2 during the molecular-dynamics simulation or are structural outliers detected by the wwPDB report. In addition, our results suggested that these questionable conformations may affect the position of the flaps, two β-sheets forming the top of the binding site. In PDB entry 1hsi, their relative arrangement appears atypical compared with MD simulations, raising questions about the biological relevance of this conformation. To conclude, we have developed a protocol to quantify and localize questionable backbone conformations in X-ray structure models, which can affect the interpretation of structural data.
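The comparison step described in this abstract can be illustrated with a minimal sketch. It assumes that the X-ray model and its energy-minimized counterpart have each already been encoded as a string of structural-alphabet letters of equal length; the encoding itself (HMM-SA) is not reproduced here, and the function names and the example letter strings are hypothetical, chosen only for illustration. A position is flagged when the two strings disagree.

```python
def questionable_positions(xray_letters: str, minimized_letters: str) -> list[int]:
    """Return 0-based positions where the two structural-letter strings differ.

    Both inputs are assumed to be pre-computed encodings of the same protein
    (X-ray model and energy-minimized model), aligned position by position.
    """
    if len(xray_letters) != len(minimized_letters):
        raise ValueError("encodings must have the same length")
    return [
        i
        for i, (a, b) in enumerate(zip(xray_letters, minimized_letters))
        if a != b
    ]


def questionable_fraction(xray_letters: str, minimized_letters: str) -> float:
    """Fraction of positions whose structural letter changed on minimization."""
    if not xray_letters:
        return 0.0
    changed = questionable_positions(xray_letters, minimized_letters)
    return len(changed) / len(xray_letters)


# Hypothetical encodings, not taken from any real PDB entry.
xray = "AAZBDDKLM"
minimized = "AAZCDDKLN"
print(questionable_positions(xray, minimized))  # [3, 8]
print(round(questionable_fraction(xray, minimized), 3))
```

In a structural alphabet each letter usually describes a short backbone fragment rather than a single residue, so mapping flagged positions back to residues needs an extra step that this sketch leaves out.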
Exploration of questionable backbone conformations in crystallographic structure models using a structural alphabet.
Clémence Sarrau, Marine Baillif, Lucas Mantel, Dounia Benyakhlaf, Shamima Peerbux, Leslie Regad
DOI: 10.1107/S2059798325009301
More than 80% of protein structure models in the Protein Data Bank have been solved using X-ray crystallography. Despite continuous improvements in this experimental technique, crystallographic structure models may still present artifacts related to the crystallization process as well as errors introduced during model building and refinement, even in high-resolution cases. Such limitations can alter atomic or residue positions, leading to local misconformations, local or domain rearrangements, and occasionally global distortions. In this study, we developed a protocol to locate residues with questionable conformations, where conformations may be uncertain, atypical or influenced by crystallographic modeling and refinement. To do so, we started from a set of 826 nonredundant X-ray protein structure models. Each X-ray model underwent an energy-minimization step that relaxes atomic geometry by reducing potential energy. Residues that exhibited different local conformations between the X-ray and minimized models were therefore considered as having questionable conformations. To identify them, we compared the X-ray and minimized models of each protein using the HMM-SA structural alphabet. Our results revealed that over 18% of the residues in the protein set have questionable conformations in their backbone. These conformations can occur either as isolated events within the protein sequence or can form patterns. Moreover, we observed that the frequency of questionable conformations per X-ray model was independent of factors such as the date of deposition, resolution or crystal system. Analysis of the properties of residues associated with questionable conformations revealed that they do not specifically occur in flexible or accessible regions. However, there is a correlation between questionable conformations and secondary structures, with a particular overrepresentation of residues with questionable conformations in α-helices. We then further investigated questionable conformations in the structure model of ligand-free HIV-2 protease (PR2). By combining our questionable conformation-detection protocol with molecular-dynamics simulation, we demonstrated that approximately half of the questionable conformations in PDB entry 1hsi correspond to local conformations that are sparsely sampled by PR2 during the molecular-dynamics simulation or are structural outliers detected by the wwPDB report. In addition, our results suggested that these questionable conformations may affect the position of the flaps, two β-sheets forming the top of the binding site. In PDB entry 1hsi, their relative arrangement appears atypical compared with MD simulations, raising questions about the biological relevance of this conformation. To conclude, we have developed a protocol to quantify and localize questionable backbone conformations in X-ray structure models, which can affect the interpretation of structural data.
Acta Crystallographica. Section D, Structural Biology, pp. 734-757. Published 2025-12-01. Journal Article. Impact factor: 3.8. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12809442/pdf/
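The comparison step described in the abstract above (flagging residues whose local conformation differs between the X-ray model and its energy-minimized counterpart) can be illustrated with a minimal sketch. It assumes that both models have already been encoded as equal-length strings of structural letters by external HMM-SA software, which is not shown here. The function names, the toy strings and the use of string positions as a stand-in for residue positions are hypothetical and are not taken from the authors' protocol.

```python
from itertools import groupby


def questionable_positions(xray_letters, minimized_letters):
    """Return 0-based indices where the two structural-letter strings differ."""
    if len(xray_letters) != len(minimized_letters):
        raise ValueError("both encodings must cover the same fragments")
    return [
        i
        for i, (a, b) in enumerate(zip(xray_letters, minimized_letters))
        if a != b
    ]


def group_runs(positions):
    """Group sorted indices into (start, end) runs of consecutive positions.

    A run of length one is an isolated difference; longer runs are
    stretches of adjacent differences.
    """
    runs = []
    # Consecutive integers share the same (value - rank) offset.
    for _, grp in groupby(enumerate(positions), key=lambda t: t[1] - t[0]):
        idx = [p for _, p in grp]
        runs.append((idx[0], idx[-1]))
    return runs


# Toy encodings; the letters are placeholders, not real HMM-SA output.
xray = "AAKLMZZBCA"
minimized = "AAKQMZYYCA"

diff = questionable_positions(xray, minimized)
runs = group_runs(diff)
fraction = len(diff) / len(xray)

print("differing positions:", diff)
print("runs (start, end):", runs)
print(f"fraction differing: {fraction:.2f}")
```

Grouping the differing positions into runs is one simple way to separate isolated differences from stretches of adjacent ones, mirroring the abstract's distinction between isolated events and patterns.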
Pub Date: 2025-12-01 Epub Date: 2025-11-26 DOI: 10.1107/S2059798325009647
Airlie J McCoy, Randy J Read
Analysis of crystallographic diffraction data after collection and integration but before phasing gives the crystallographer a 'first-look' assessment of data quality and flags potential challenges in subsequent structure determination. Here we report the development of Xtricorder, a 'first-look' application specifically targeted at likelihood-based phasing. Xtricorder incorporates the full array of analyses previously available in the Phaser codebase, with some enhancements and updates, in a more streamlined and accessible implementation. In addition, Xtricorder offers a likelihood-enhanced self-rotation function. A novel graphical representation of the self-rotation function, the 'composite-section diagram', presents the results for user inspection and has the added advantage that, in an adapted form, it is appropriate for training a convolutional neural network to enhance the standard Matthews analysis and double the accuracy of asymmetric unit copy-number prediction. We investigate the usefulness of the likelihood-enhanced self-rotation function in 'first-look' analyses, exploring the circumstances under which the self-rotation function results are useful, and discuss the application to AI-generated structure prediction.
Xtricorder: a likelihood-enhanced self-rotation function and application to a machine learning-enhanced Matthews prediction of asymmetric unit copy number.
Acta Crystallographica. Section D, Structural Biology, pp. 678-692. Published 2025-12-01. Journal Article. Impact factor: 3.8. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12809497/pdf/
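For readers unfamiliar with the standard Matthews analysis that the abstract above refers to, the sketch below shows the classic calculation only. It is not Xtricorder's likelihood-enhanced or neural-network-enhanced method. It uses the textbook Matthews coefficient V_M = V_cell / (Z · n · M_W) and the usual solvent-fraction approximation 1 − 1.23/V_M. The function names and the example cell, symmetry and molecular weight are illustrative assumptions.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume (cubic angstroms) of a general unit cell; angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(
        1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg
    )


def matthews_table(volume, n_symops, mol_weight_da, max_copies=6):
    """List (copies, V_M, solvent fraction) for each plausible copy number.

    V_M = volume / (n_symops * copies * mol_weight_da), in A^3 per Da.
    Solvent fraction is approximated as 1 - 1.23 / V_M. Copy numbers that
    would leave no room for solvent are dropped.
    """
    rows = []
    for copies in range(1, max_copies + 1):
        vm = volume / (n_symops * copies * mol_weight_da)
        solvent = 1.0 - 1.23 / vm
        if solvent <= 0:
            break  # the cell cannot hold this many copies
        rows.append((copies, vm, solvent))
    return rows


# Hypothetical example: orthorhombic cell, 4 symmetry operators, 25 kDa protein.
volume = unit_cell_volume(50.0, 60.0, 70.0, 90.0, 90.0, 90.0)
for copies, vm, solvent in matthews_table(volume, 4, 25000.0):
    print(f"n={copies}  V_M={vm:.2f} A^3/Da  solvent={solvent:.1%}")
```

The table only lists copy numbers that are physically possible. Choosing among them conventionally relies on the empirical distribution of V_M values, which is the step that the abstract describes as being enhanced by a convolutional neural network.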