Embedding AI in the Protein Crystallography Workflow
Richard J. Gildea, C. Orr, N. Paterson, D. Hall
Synchrotron Radiation News, volume 35 1, pages 51-54. Published 2022-10-07.
DOI: https://doi.org/10.1080/08940886.2022.2114723
Citation count: 1
Abstract
Historically, solving the structure of a protein required deep knowledge of crystallography and the ability to produce protein crystals of suitable quality to generate high-quality diffraction data. Over the years, as beamline optics, end-stations, detectors, and data collection strategies have improved, it has become more feasible to extract high-quality diffraction data from ever smaller or less perfect protein crystals, and from very large arrays of crystals for techniques such as serial synchrotron crystallography and fragment-based drug discovery. At Diamond, these improvements have been coupled with highly integrated automated pipelines for data reduction and structure solution using techniques such as molecular replacement and experimental phasing. This has led to a beneficial dichotomy: with facility staff support, users can carry out increasingly challenging experiments that require deep crystallographic knowledge, while the facility’s automated structure solution tools lower the barrier to entry by performing this task for scientists with less experience. This enables users to focus on the science rather than the process. Diamond Light Source, the UK’s national synchrotron, has a suite of instruments dedicated to solving the 3D structure of large biological molecules, including seven macromolecular crystallography (MX) beamlines. Solved 3D structures are deposited into the publicly available Protein Data Bank (PDB), and the depositions are released on a weekly basis. In 2020, following 13 years of operation, Diamond hit the milestone of 10,000 structures deposited in the PDB. Two years on, this number is now more than 12,000. Thanks to decades of work across the world, there is an ocean of information in the PDB that serves as an invaluable reference when solving the structures of new proteins.