Pub Date: 2024-12-02 DOI: 10.1021/acs.jcim.4c01357
Martin Ljubič, Andrej Perdih, Jure Borišek
Understanding how membrane composition influences the dynamics and function of transmembrane proteins is crucial for the comprehensive elucidation of cellular signaling mechanisms and the development of targeted therapeutics. In this study, we employed all-atom molecular dynamics simulations to investigate the impact of different membrane compositions on the conformational dynamics of the NKG2A/CD94/HLA-E immune receptor complex, a key negative regulator of natural killer cell cytotoxic activity. Our results reveal significant variations in the behavior of the immune complex structure across five different membrane compositions, which include POPC, POPA, DPPC, and DLPC phospholipids, and a mixed POPC/cholesterol system. These variations are particularly evident in the intracellular domain of NKG2A, manifested as changes in mobility, tyrosine exposure, and interdomain communication. Additionally, we found that a high concentration of negative charge at the surface of the POPA-based membrane greatly increased the number of contacts with lipid molecules and significantly decreased the exposure of intracellular NKG2A ITIM regions to water molecules, thus likely halting the signal transduction process. Furthermore, in the DPPC model, the membrane, which has a high transition temperature and was in a gel-like state, became curved, affecting the exposure of one ITIM region. The decreased membrane thickness in the DLPC model caused a significant transmembrane domain tilt, altering the linker protrusion angle and potentially disrupting the hydrogen bonding network in the extracellular domain. Overall, our findings highlight the importance of considering membrane composition in the analysis of transmembrane protein dynamics and in the exploration of novel strategies for the external modulation of their signaling pathways.
Title: All-Atom Simulations Reveal the Effect of Membrane Composition on the Signaling of the NKG2A/CD94/HLA-E Immune Receptor Complex.
Journal: Journal of Chemical Information and Modeling
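The transmembrane domain tilt mentioned in this abstract is a geometric quantity that can be illustrated with a short calculation. The sketch below is not the authors' analysis protocol; it assumes a helix axis defined by two hypothetical points (for example, centroids of the first and last helical turns) and a membrane normal along the z axis, and returns the angle between them.

```python
import math


def tilt_angle_deg(helix_start, helix_end, membrane_normal=(0.0, 0.0, 1.0)):
    """Angle in degrees between a helix axis and the membrane normal.

    helix_start and helix_end are hypothetical 3D points defining the axis;
    the default normal along z is an assumption about the simulation box.
    """
    axis = [e - s for s, e in zip(helix_start, helix_end)]
    norm_axis = math.sqrt(sum(a * a for a in axis))
    norm_normal = math.sqrt(sum(n * n for n in membrane_normal))
    if norm_axis == 0.0 or norm_normal == 0.0:
        raise ValueError("axis and normal must have non-zero length")
    dot = sum(a * n for a, n in zip(axis, membrane_normal))
    cos_theta = dot / (norm_axis * norm_normal)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    # Take the absolute value so the result lies in [0, 90] degrees
    # regardless of which end of the helix is listed first.
    return math.degrees(math.acos(abs(cos_theta)))
```

Folding the angle into the 0 to 90 degree range is a design choice made here for simplicity; an analysis that distinguishes helix direction would keep the signed cosine instead.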
The frozen domain (FD) approximation with the fragment molecular orbital (FMO) method is efficient for partial geometry optimization of large systems. We implemented the FD formulation (FD and frozen domain dimer [FDD] methods) previously proposed by Fedorov, D. G. et al. (J. Phys. Chem. Lett. 2011, 2, 282-288); proposed a variation of it, namely the frozen domain and partial dimer (FDPD) method; and applied it to several protein-ligand complexes. The computational time for geometry optimization at the FDPD/HF/6-31G* level for the active site (six fragments) of the largest β2-adrenergic G-protein-coupled receptor (440 residues) was almost half that of the conventional partial geometry optimization method. In the human estrogen receptor, the crystal structure was refined by FDPD geometry optimization of estradiol, the surrounding hydrogen-bonded residues, and a water molecule. The rather polarized ligand binding site of influenza virus neuraminidase was also optimized by the FDPD method, which relaxed steric repulsion around the ligand in the crystal structure and optimized hydrogen bonding. For serine-threonine kinase Pim1 and six inhibitors, the structures of the ligand binding site (Lys67, Glu121, Arg122, and the benzofuranone ring and indole/azaindole ring of the ligand) were optimized at the FDPD/HF/6-31G* level, and the ligand binding energy was estimated at the FMO-MP2/6-31G* level. As a result of examining three different optimization regions, the correlation coefficient between pIC50 and ligand binding energy was considerably improved by expanding the optimized region; in other words, better structure-activity relationships were obtained. Thus, this approach is promising as a high-precision structure refinement method for structure-based drug discovery.
Title: Geometry Optimization Using the Frozen Domain and Partial Dimer Approaches in the Fragment Molecular Orbital Method: Implementation, Benchmark, and Applications to Protein Ligand-Binding Sites.
Authors: Koji Okuwaki, Naoki Watanabe, Koichiro Kato, Chiduru Watanabe, Naofumi Nakayama, Akifumi Kato, Yuji Mochizuki, Tatsuya Nakano, Teruki Honma, Kaori Fukuzawa
Journal: Journal of Chemical Information and Modeling
Pub Date: 2024-12-02 DOI: 10.1021/acs.jcim.4c01280
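The correlation between pIC50 and ligand binding energy referred to in this abstract can be illustrated with a small helper. This is a minimal sketch, assuming the standard definition pIC50 = -log10(IC50 in mol/L) and the Pearson correlation coefficient; the abstract does not say which correlation measure the authors used, and the function names here are hypothetical.

```python
import math


def pic50(ic50_molar):
    """Convert an IC50 in mol/L to pIC50 (standard definition)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

Because a more negative binding energy corresponds to stronger binding, a good structure-activity relationship would typically show up as a strongly negative coefficient when binding energies are paired with pIC50 values.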
Pub Date: 2024-11-29 DOI: 10.1021/acs.jcim.4c01212
Seenivasan Hariharan, Sachin Kinge, Lucas Visscher
Heterogeneous catalysis plays a critical role in many industrial processes, including the production of fuels, chemicals, and pharmaceuticals, and research to improve current catalytic processes is important to make the chemical industry more sustainable. Despite its importance, the challenge of identifying optimal catalysts with the required activity and selectivity persists, demanding a detailed understanding of the complex interactions between catalysts and reactants at various length and time scales. Density functional theory (DFT) has been the workhorse in modeling heterogeneous catalysis for more than three decades. While DFT has been instrumental, this review explores the application of quantum computing algorithms in modeling heterogeneous catalysis, which could bring a paradigm shift in our approach to understanding catalytic interfaces. Bridging academic and industrial perspectives by focusing on emerging materials, such as multicomponent alloys, single-atom catalysts, and magnetic catalysts, we delve into the limitations of DFT in capturing strong correlation effects and spin-related phenomena. The review also presents important algorithms and their applications relevant to heterogeneous catalysis modeling to showcase advancements in the field. Additionally, the review explores embedding strategies where quantum computing algorithms handle strongly correlated regions, while traditional quantum chemistry algorithms address the remainder, thereby offering a promising approach for large-scale heterogeneous catalysis modeling. Looking forward, ongoing investments by academia and industry reflect a growing enthusiasm for quantum computing's potential in heterogeneous catalysis research. The review concludes by envisioning a future where quantum computing algorithms seamlessly integrate into research workflows, propelling us into a new era of computational chemistry and thereby reshaping the landscape of modeling heterogeneous catalysis.
Title: Modeling Heterogeneous Catalysis Using Quantum Computers: An Academic and Industry Perspective.
Journal: Journal of Chemical Information and Modeling
Pub Date: 2024-11-29 DOI: 10.1021/acs.jcim.4c01657
Jinfu Peng, Li Fu, Guoping Yang, Dongshen Cao
Ensuring drug safety during pregnancy is critical due to the potential risks to both the mother and fetus. However, the exclusion of pregnant women from clinical trials complicates the assessment of adverse drug reactions (ADRs) in this population. This study aimed to develop and validate risk prediction models for pregnancy-related ADRs of drugs using advanced Machine Learning (ML) and Deep Learning (DL) techniques, leveraging real-world data from the FDA Adverse Event Reporting System. We explored three methods (Information Component, Reporting Odds Ratio, and the 95% confidence interval of the ROR) for classifying drugs into high-risk and low-risk categories. DL models, including Directed Message Passing Neural Networks (DMPNN), Graph Neural Networks, and Graph Convolutional Networks, were developed and compared to traditional ML models like Random Forest, Support Vector Machines, and XGBoost. Among these, the DMPNN model, which integrated molecular graph information and molecular descriptors, exhibited the highest predictive performance, particularly at the preferred term level. The model was validated against external data sets from SIDER and DailyMed, demonstrating strong generalizability. Additionally, the model was applied to assess the risk of 22 oral hypoglycemic drugs, and potential substructure alerts for pregnancy-related ADRs were identified. These findings suggest that the DMPNN model is a valuable tool for predicting ADRs in pregnant women, offering significant advancement in drug safety assessment and providing crucial insights for safer medication use during pregnancy.
Title: Advanced AI-Driven Prediction of Pregnancy-Related Adverse Drug Reactions.
Journal: Journal of Chemical Information and Modeling
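The Reporting Odds Ratio and its 95% confidence interval, named in this abstract as labeling methods, follow a standard disproportionality formula over a 2x2 contingency table. The sketch below uses that textbook formula; the argument names and the exact thresholds the authors applied to separate high-risk from low-risk drugs are not given in the abstract, so anything beyond the formula itself is an assumption.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Reporting odds ratio with a Wald-type confidence interval.

    a: reports with the drug of interest and the event of interest
    b: reports with the drug of interest and other events
    c: reports with other drugs and the event of interest
    d: reports with other drugs and other events
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper
```

A common convention in pharmacovigilance treats a lower confidence bound above 1 as a signal; whether this study used that cutoff is not stated here.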
Pub Date: 2024-11-28 DOI: 10.1021/acs.jcim.4c01728
Daniel B K Chu, David A González-Narváez, Ralf Meyer, Aditya Nandy, Heather J Kulik
Methods that accelerate the evaluation of molecular properties are essential for chemical discovery. While some degree of ligand additivity has been established for transition metal complexes, it is underutilized in asymmetric complexes, such as the square pyramidal coordination geometries highly relevant to catalysis. To develop predictive methods beyond simple additivity, we apply a many-body expansion to octahedral and square pyramidal complexes and introduce a correction based on adjacent ligands (i.e., the cis interaction model). We first test the cis interaction model on adiabatic spin-splitting energies of octahedral Fe(II) complexes, predicting DFT-calculated values of unseen binary complexes to within an average error of 1.4 kcal/mol. Uncertainty analysis reveals the optimal basis, comprising the homoleptic and mer symmetric complexes. We next show that the cis model (i.e., the cis interaction model solved for the optimal basis) infers both DFT- and CCSD(T)-calculated model catalytic reaction energies to within 1 kcal/mol on average. The cis model predicts low-symmetry complexes with reaction energies outside the range of binary complex reaction energies. We observe that trans interactions are unnecessary for most monodentate systems but can be important for some combinations of ligands, such as complexes containing a mixture of bidentate and monodentate ligands. Finally, we demonstrate that the cis model may be combined with Δ-learning to predict CCSD(T) reaction energies from exhaustively calculated DFT reaction energies and the same fraction of CCSD(T) reaction energies needed for the cis model, achieving around 30% of the error from using the CCSD(T) reaction energies in the cis model alone.
Title: Ligand Many-Body Expansion as a General Approach for Accelerating Transition Metal Complex Discovery.
Journal: Journal of Chemical Information and Modeling
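The "simple additivity" that this abstract sets out to improve on can be sketched as a weighted average of homoleptic-complex values. The code below shows only that baseline, under the assumption that each ligand contributes an equal share of its homoleptic complex's property; it is not the cis interaction model, whose correction terms are not specified in the abstract. The function and argument names are hypothetical.

```python
def additive_estimate(homoleptic_values, ligand_counts):
    """Estimate a property of a mixed-ligand complex by ligand additivity.

    homoleptic_values: property of the complex in which every site holds
        the same ligand, keyed by ligand label.
    ligand_counts: number of coordination sites occupied by each ligand
        in the target complex.
    """
    total_sites = sum(ligand_counts.values())
    if total_sites == 0:
        raise ValueError("the complex must contain at least one ligand")
    weighted = sum(
        homoleptic_values[ligand] * count
        for ligand, count in ligand_counts.items()
    )
    return weighted / total_sites
```

For an octahedral complex with three sites of ligand A and three of ligand B, this returns the midpoint of the two homoleptic values; a many-body expansion would add pairwise terms on top of this one-body estimate.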
Pub Date: 2024-11-28 DOI: 10.1021/acs.jcim.4c01999
Xiaohua Wang, Ming Zhang, Xibei Yang, Dong-Jun Yu, Fang Ge
Accurately predicting mutations in G protein-coupled receptors (GPCRs) is critical for advancing disease diagnosis and drug discovery. In response to this imperative, GPTrans has emerged as a highly accurate predictor of disease-related mutations in GPCRs. The core innovation of GPTrans resides in the design of a novel feature extraction network that is capable of integrating features from both wildtype and mutant protein variant sites, utilizing multifeature connections within a transformer framework to ensure comprehensive feature extraction. A key aspect of GPTrans's effectiveness is our introduction of an innovative deep feature integration strategy, which merges embeddings and class tokens from multiple protein language models, including evolutionary scale modeling and ProtTrans, thus shedding light on the biochemical properties of proteins. Leveraging transformer components and a self-attention mechanism, GPTrans captures higher-level representations of protein features. Employing both wildtype and mutation site information for feature fusion not only enriches the predictive feature set but also avoids the common issue of overestimation associated with sequence-based predictions. This approach distinguishes GPTrans, enabling it to significantly outperform existing methods. Our evaluations across diverse GPCR data sets, including ClinVar and MutHTP, demonstrate GPTrans's superior performance, with average AUC values of 0.874 and 0.590 in 10-fold cross-validation. Notably, compared to the AlphaMissense method, GPTrans exhibited a remarkable 38.03% improvement in accuracy when predicting disease-associated mutations in the MutHTP data set. A thorough analysis of the predicted results further validates the model's effectiveness. The source code, data sets, and prediction results for GPTrans are available for academic use at https://github.com/EduardWang/GPTrans.
Title: GPTrans: A Biological Language Model-Based Approach for Predicting Disease-Associated Mutations in G Protein-Coupled Receptors.
Journal: Journal of Chemical Information and Modeling
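The AUC values quoted in this abstract measure how well predicted scores rank disease-associated mutations above neutral ones. The sketch below computes AUC from its pairwise-ranking definition in plain Python; it is an illustration of the metric only, not code from the GPTrans repository, and the label convention (1 for positive, 0 for negative) is an assumption.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

In a 10-fold cross-validation, this function would be applied to the held-out fold of each split and the ten values averaged. The quadratic pairwise loop is chosen for clarity; a rank-based formulation gives the same value faster on large data sets.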
Pub Date: 2024-11-28 DOI: 10.1021/acs.jcim.4c01594
Muya Xiong, Tianqing Nie, Zhewen Li, Meiyi Hu, Haixia Su, Hangchen Hu, Yechun Xu, Qiang Shao
3-Chymotrypsin-like protease (3CLpro) is a prominent target against pathogenic coronaviruses. Expert knowledge of the cysteine-targeted covalent reaction mechanism is crucial to predict the inhibitory potency of approved inhibitors against 3CLpros of SARS-CoV-2 variants and to perform structure-based drug design against newly emerging coronaviruses. We carried out an extensive array of classical and hybrid QM/MM molecular dynamics simulations to explore the covalent inhibition mechanisms of five well-characterized inhibitors toward SARS-CoV-2 3CLpro and its mutants. The calculated binding affinity and reactivity of the inhibitors are highly consistent with experimental data, and the predicted inhibitory potency of the inhibitors against 3CLpro carrying the L167F, E166V, or T21I/E166V mutation is in full agreement with the IC50 values determined by the accompanying enzymatic assays. The explored mechanisms unveil how residue mutagenesis alters structural dynamics, which in turn changes not only the noncovalent binding strength but also the covalent reaction free energy. Such a change is inhibitor dependent, corresponding to varied levels of drug resistance of these 3CLpro mutants against nirmatrelvir and simnotrelvir and no resistance to the 11a compound. These results together suggest that the present simulations with a suitable protocol can efficiently evaluate the reactivity and potency of covalent inhibitors along with the elucidated molecular mechanisms of covalent inhibition.
Title: Potency Prediction of Covalent Inhibitors against SARS-CoV-2 3CL-like Protease and Multiple Mutants by Multiscale Simulations.
Journal: Journal of Chemical Information and Modeling
Pub Date: 2024-11-27 DOI: 10.1021/acs.jcim.4c01632
Yi Li, Qin-Wei Xu, Guo-Lei Jian, Xiao-Ling Zhang, Hua Wang
Accurately identifying sites of metabolism (SoM) mediated by cytochrome P450 (CYP) enzymes, which are responsible for drug metabolism in the body, is critical in the early stage of drug discovery and development. Current computational methods for CYP-mediated SoM prediction face several challenges, including limitations to traditional machine learning models at the atomic level, heavy reliance on complex feature engineering, and the lack of interpretability relevant to medicinal chemistry. Here, we propose GraphCySoM, a novel molecule-level modeling approach based on graph neural networks, utilizing lightweight features and interpretable annotations on substructures, to effectively and interpretably predict CYP-mediated SoM. Unlike computationally expensive atomic descriptors derived from resource-intensive chemistry or even quantum chemistry calculations, we emphasize that graph-based molecular modeling initialized solely with lightweight features enables the adaptive learning of molecular topology through message-passing mechanisms combined with various aggregation kernels. Extensive ablation experiments demonstrate that GraphCySoM significantly outperforms baseline models and achieves superior performance compared with competing methods while exhibiting advantages in computational efficiency. Moreover, the attention mechanism and subgraph information bottlenecks are incorporated to analyze node importance and feature significance, resulting in mining substructures associated with the SoM. To the best of our knowledge, this is the first comprehensive study of CYP-mediated SoM using molecule-level modeling and interpretable technology. Our method achieves new state-of-the-art performance and provides potential insights into the molecular and pharmacological mechanisms underlying drug metabolism catalyzed by CYP enzymes. All source files and trained models are freely available at https://github.com/liyigerry/GraphCySoM.
Title: Improved and Interpretable Prediction of Cytochrome P450-Mediated Metabolism by Molecule-Level Graph Modeling and Subgraph Information Bottlenecks. Journal: Journal of Chemical Information and Modeling. Published: 2024-11-27. https://doi.org/10.1021/acs.jcim.4c01632
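The GraphCySoM abstract above mentions message passing combined with several aggregation kernels over lightweight atom features. The following is a minimal sketch of that general idea only, not the authors' implementation: the toy graph, the single atomic-number feature, and the names `message_passing_step` and `AGGREGATORS` are all illustrative assumptions, and no learned weights are involved.

```python
from statistics import mean

# Toy molecular graph (heavy atoms of ethanol, C-C-O); keys are atom indices.
# Assumption: one lightweight feature per atom (its atomic number).
features = {0: [6.0], 1: [6.0], 2: [8.0]}
bonds = [(0, 1), (1, 2)]


def build_adjacency(bonds, n_atoms):
    """Return a dict mapping each atom index to the list of its bonded neighbors."""
    adj = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj


# Hypothetical set of aggregation kernels; real models may use others.
AGGREGATORS = {"sum": sum, "mean": mean, "max": max}


def message_passing_step(features, adj, aggregate):
    """One round: each atom appends the aggregated features of its neighbors to its own."""
    updated = {}
    for atom, feats in features.items():
        messages = [features[n] for n in adj[atom]]
        # Aggregate column-wise, i.e. separately for every feature dimension.
        pooled = [aggregate(column) for column in zip(*messages)]
        updated[atom] = feats + pooled
    return updated


adj = build_adjacency(bonds, len(features))
for name, kernel in AGGREGATORS.items():
    print(name, message_passing_step(features, adj, kernel))
```

For the central carbon (atom 1), the three kernels give different pooled values because its two neighbors differ, which is the kind of contrast that lets a model distinguish local environments from topology alone.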
Pub Date: 2024-11-27 DOI: 10.1021/acs.jcim.4c00995
Renato Soares, Luísa Azevedo, Vitor Vasconcelos, Diogo Pratas, Sérgio F Sousa, João Carneiro
Cyanobacteria strains have the potential to produce bioactive compounds that can be used in therapeutics and bioremediation. Therefore, compiling all information about these compounds to consider their value as bioresources for industrial and research applications is essential. In this study, a searchable, updated, curated, and downloadable database of cyanobacteria bioactive compounds was designed, along with a machine-learning model to predict the protein targets of newly discovered molecules. A Python programming protocol obtained 3431 cyanobacteria bioactive compounds, 373 unique protein targets, and 3027 molecular descriptors. PaDEL-descriptor, Mordred, and Drugtax software were used to calculate the chemical descriptors for each bioactive compound database record. The descriptors were then used, with the best-performing machine learning (ML) model, to determine the most promising protein targets for human therapeutic approaches and environmental bioremediation. The creation of our database, coupled with the integration of computational docking protocols, represents an innovative approach to understanding the potential of cyanobacteria bioactive compounds. This resource, adhering to the findability, accessibility, interoperability, and reuse of digital assets (FAIR) principles, is an excellent tool for pharmaceutical and bioremediation researchers. Moreover, its capacity to facilitate the exploration of specific compounds' interactions with environmental pollutants is a significant advancement, aligning with the increasing reliance on data science and machine learning to address environmental challenges. This study is a notable step forward in leveraging cyanobacteria for both therapeutic and ecological sustainability.
Title: Machine Learning-Driven Discovery and Database of Cyanobacteria Bioactive Compounds: A Resource for Therapeutics and Bioremediation. Journal: Journal of Chemical Information and Modeling. Published: 2024-11-27. https://doi.org/10.1021/acs.jcim.4c00995
Pub Date: 2024-11-27 DOI: 10.1021/acs.jcim.4c00855
Qilong Wu, Sheng-You Huang
Molecular docking is an essential computational tool in structure-based drug discovery and the investigation of the molecular mechanisms underlying biological processes. Despite the development of many molecular docking programs for various systems, a universal tool that can accurately dock ligands across multiple system types remains elusive. To meet this need, we developed XDock, a versatile docking framework built for both protein-ligand and nucleic acid-ligand interactions. XDock efficiently accounts for ligand flexibility by docking multiple conformations of a ligand and flexibly refining the final binding poses. It utilizes a distance geometry method for ligand sampling and leverages our knowledge-based scoring functions for assessing protein-ligand and nucleic acid-ligand interactions. XDock has undergone extensive validation on diverse benchmarks of protein-ligand and nucleic acid-ligand complexes and was compared with six other docking methods: DOCK 6, AutoDock Vina, PLANTS, LeDock, rDock, and RLDock. XDock is also computationally efficient and, on average, can dock a ligand within 1 min.
Title: XDock: A General Docking Method for Modeling Protein-Ligand and Nucleic Acid-Ligand Interactions. Journal: Journal of Chemical Information and Modeling. Published: 2024-11-27. https://doi.org/10.1021/acs.jcim.4c00855
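The XDock abstract above describes handling ligand flexibility by docking several conformations of a ligand and ranking the resulting poses with a scoring function. The sketch below illustrates only that outer loop under loud assumptions: it is not XDock's code, the brute-force translation list stands in for its distance geometry sampling, `toy_score` is a placeholder rather than its knowledge-based scoring function, and all coordinates and names are hypothetical.

```python
import math

# Hypothetical toy system in 2-D: two receptor atoms and two ligand conformers.
receptor = [(0.0, 0.0), (4.0, 0.0)]
conformers = {
    "conf_a": [(0.0, 0.0), (1.0, 0.0)],
    "conf_b": [(0.0, 0.0), (0.0, 1.0)],
}
# Assumed rigid-body placements to try for every conformer (translations only).
translations = [(2.0, 2.0), (2.0, -3.0), (1.5, 2.0)]


def toy_score(ligand, receptor, ideal=3.0):
    """Placeholder score (lower is better): squared deviation of every
    ligand-receptor distance from an assumed ideal contact distance."""
    return sum((math.dist(l, r) - ideal) ** 2 for l in ligand for r in receptor)


def place(conformer, shift):
    """Translate every atom of a conformer by the given shift."""
    return [(x + shift[0], y + shift[1]) for x, y in conformer]


def dock(conformers, receptor, translations):
    """Score every (conformer, placement) pair and return poses sorted best-first."""
    poses = []
    for name, coords in conformers.items():
        for shift in translations:
            pose = place(coords, shift)
            poses.append((toy_score(pose, receptor), name, shift))
    return sorted(poses)


ranked = dock(conformers, receptor, translations)
best_score, best_conformer, best_shift = ranked[0]
print(best_conformer, best_shift, round(best_score, 3))
```

Ranking all conformers in one pooled list, rather than picking a best pose per conformer first, keeps the comparison between conformations explicit; a real pipeline would follow this with a flexible refinement of the top poses.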