Progress in protein pKa prediction
Fangfang Luo, Zhitao Cai, Yandong Huang
Chinese Physics, 2023. DOI: 10.7498/aps.72.20231356 (https://doi.org/10.7498/aps.72.20231356)
Abstract
pH measures the acidity of a solution and plays a key role in many biological processes associated with human disease. For instance, the β-site amyloid precursor protein cleaving enzyme BACE1, a major therapeutic target for Alzheimer’s disease, functions within a narrow pH range around 4.5. Likewise, the sodium-proton antiporter NhaA from Escherichia coli is activated only when the cytoplasmic pH exceeds 6.5, and its activity reaches a maximum around pH 8.8. To explore the molecular mechanism of a pH-regulated protein, it is important to measure, typically by NMR, the proton binding affinities of key ionizable residues, namely their pKa values, which determine the deprotonation equilibria at a given pH. However, wet-lab experiments are often expensive and time-consuming. In some cases the structural complexity of a protein makes pKa measurements difficult, so theoretical pKa prediction in a dry lab becomes more advantageous.

Over the past thirty years, much effort has gone into accurate and fast protein pKa prediction with physics-based methods. In theory, constant pH molecular dynamics (CpHMD) methods, which take conformational fluctuations into account, give the most accurate predictions, especially the explicit-solvent CpHMD model proposed by Huang and coworkers (J. Chem. Theory Comput. 2016, 12, 5411-5421), which in principle is applicable to any system that a force field can describe. However, lengthy molecular simulations are usually needed for extensive conformational sampling. In particular, the computational cost increases significantly when water molecules are included explicitly in the simulated system. Thus, CpHMD is not suitable for the high-throughput computing required in industry. To accelerate pKa prediction, schemes based on the Poisson-Boltzmann (PB) equation or on empirical equations, such as H++ and PropKa, have been developed and widely applied; they obtain pKa values from a single-structure calculation. Recently, artificial intelligence (AI) has been applied to protein pKa prediction, leading to the development of DeepKa by the Huang lab (ACS Omega 2021, 6, 34823-34831), the first AI-driven pKa predictor. In this paper, we review the advances in protein pKa prediction contributed mainly by CpHMD methods, PB or empirical equation-based schemes, and AI models. Notably, the modeling hypotheses explained in this review should shed light on the future development of more powerful protein pKa predictors.
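
The abstract notes that a residue's pKa determines its deprotonation equilibrium at a given pH. For a single, non-interacting titratable site, this relationship is the Henderson-Hasselbalch equation. The following is a minimal Python sketch under that idealized assumption; the function name and the example pKa of 4.5 are illustrative and are not taken from the reviewed work.

```python
def deprotonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a single titratable site that is deprotonated at a given pH.

    Assumes an isolated site obeying the Henderson-Hasselbalch equation
    (no coupling to other sites); residues in real proteins can deviate.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # Hypothetical pKa of 4.5, chosen only for illustration.
    for ph in (3.5, 4.5, 5.5):
        frac = deprotonated_fraction(4.5, ph)
        print(f"pH {ph:.1f}: deprotonated fraction = {frac:.3f}")
```

At pH equal to the pKa the site is half deprotonated, and each pH unit away from the pKa shifts the protonated-to-deprotonated ratio by a factor of ten. This is why a prediction error of one pKa unit can change the predicted dominant protonation state near the working pH.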