Pub Date: 2024-12-12 DOI: 10.1021/acs.jcim.4c01419
Alexander Kravberg, Didier Devaurs, Anastasiia Varava, Lydia E Kavraki, Danica Kragic
Although being able to determine whether a host molecule can enclose a guest molecule and form a caging complex could benefit numerous chemical and medical applications, the experimental discovery of molecular caging complexes has not yet been achieved at scale. Here, we propose MoleQCage, a simple tool for the high-throughput screening of host and guest candidates based on an efficient robotics-inspired geometric algorithm for molecular caging prediction, providing theoretical guarantees and robustness assessment. MoleQCage is distributed as Linux-based software with a graphical user interface and is available online at https://hub.docker.com/r/dantrigne/moleqcage in the form of a Docker container. Documentation and examples are available as Supporting Information and online at https://hub.docker.com/r/dantrigne/moleqcage.
Title: MoleQCage: Geometric High-Throughput Screening for Molecular Caging Prediction. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-12-11 DOI: 10.1021/acs.jcim.4c01520
Apoorva Mathur, Rikhia Ghosh, Ariane Nunes-Alves
Macromolecular crowding in the cellular cytoplasm can potentially impact diffusion rates of proteins, their intrinsic structural stability, binding of proteins to their corresponding partners as well as biomolecular organization and phase separation. While such intracellular crowding can have a large impact on biomolecular structure and function, the molecular mechanisms and driving forces that determine the effect of crowding on dynamics and conformations of macromolecules are so far not well understood. At a molecular level, computational methods can provide a unique lens to investigate the effect of macromolecular crowding on biomolecular behavior, providing us with a resolution that is challenging to reach with experimental techniques alone. In this review, we focus on the various physics-based and data-driven computational methods developed in the past few years to investigate macromolecular crowding and intracellular protein condensation. We review recent progress in modeling and simulation of biomolecular systems of varying sizes, ranging from single protein molecules to the entire cellular cytoplasm. We further discuss the effects of macromolecular crowding on different phenomena, such as diffusion, protein-ligand binding, and mechanical and viscoelastic properties, such as surface tension of condensates. Finally, we discuss some of the outstanding challenges that we anticipate the community addressing in the next few years in order to investigate biological phenomena in model cellular environments by reproducing in vivo conditions as accurately as possible.
Title: Recent Progress in Modeling and Simulation of Biomolecular Crowding and Condensation Inside Cells. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-12-11 DOI: 10.1021/acs.jcim.4c01811
Francisco L Feitosa, Victoria F Cabral, Igor H Sanches, Sabrina Silva-Mendonca, Joyce V V B Borba, Rodolpho C Braga, Carolina Horta Andrade
Cytotoxicity assessment is essential in drug discovery, enabling early identification of toxic compounds during screening to minimize toxicological risks. In vitro assays support high-throughput screening, allowing for efficient detection of toxic substances while considerably reducing the need for animal testing. Additionally, AI-based Quantitative Structure-Activity Relationship (AI-QSAR) models enhance early-stage predictions by assessing the cytotoxic potential of molecular structures, which helps prioritize low-risk compounds for further validation. We present a freely accessible web application for identifying potentially cytotoxic compounds using QSAR models. The application uses machine learning techniques and is built on a data set of approximately 90,000 compounds, evaluated against two cell lines, 3T3 and HEK 293. Users can interact with the app by entering a SMILES representation, uploading CSV or SDF files, or sketching molecules. The output includes a binary prediction for each cell line, a confidence percentage, and an explainable AI (XAI) analysis. Cyto-Safe web-app version 1.0 is available at http://insightai.labmol.com.br/.
Title: Cyto-Safe: A Machine Learning Tool for Early Identification of Cytotoxic Compounds in Drug Discovery. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-12-10 DOI: 10.1021/acs.jcim.4c01573
Elena Xerxa, Martin Vogt, Jürgen Bajorath
While data curation principles and practices are a major topic in data science, they are often not explicitly considered in machine learning (ML) applications in chemistry. We have been interested in evaluating the potential effects of data curation on the performance of molecular ML models. Therefore, a sequential curation scheme was developed for compounds and activity data, and different ML classification models were generated at increasing data confidence levels and evaluated. Sequential data curation was found to systematically increase classification performance in an incremental manner due to cumulative effects of individual data curation criteria. The analysis of chemical space distributions of compound subsets at different data confidence levels revealed that the separation of compounds with different class labels in chemical space generally increased during sequential activity data curation, which was mostly due to subsequent elimination of singletons rather than compounds from analogue series. These findings provided a rationale for increasing the classification performance of ML models as a consequence of increasingly stringent data curation. Taken together, the results reported herein suggest that further attention should be paid to varying data curation and confidence levels when deriving and assessing ML models for chemical applications.
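The idea of building data sets at increasing confidence levels can be sketched as a chain of cumulative filters. The abstract does not list the individual curation criteria, so the three criteria and the record fields below (`relation`, `confidence`, `n_measurements`) are purely hypothetical placeholders, not the authors' scheme.

```python
def sequential_curation(records, criteria):
    """Apply criteria cumulatively and return the surviving subset after each step.

    `criteria` is an ordered list of (name, keep_function) pairs; each level
    contains only the records that passed every criterion up to that point.
    """
    levels = []
    current = list(records)
    for name, keep in criteria:
        current = [r for r in current if keep(r)]
        levels.append((name, current))
    return levels


# Hypothetical activity records; field names are illustrative assumptions.
records = [
    {"id": "c1", "relation": "=", "confidence": 9, "n_measurements": 3},
    {"id": "c2", "relation": ">", "confidence": 9, "n_measurements": 1},
    {"id": "c3", "relation": "=", "confidence": 6, "n_measurements": 2},
    {"id": "c4", "relation": "=", "confidence": 9, "n_measurements": 1},
]

# Hypothetical criteria, ordered from least to most stringent.
criteria = [
    ("exact value reported", lambda r: r["relation"] == "="),
    ("high assay confidence", lambda r: r["confidence"] >= 8),
    ("replicated measurement", lambda r: r["n_measurements"] >= 2),
]

levels = sequential_curation(records, criteria)
for name, subset in levels:
    print(f"{name}: {len(subset)} compounds remain")
```

A model would then be trained and evaluated separately on each level's subset, which is what allows performance to be compared across confidence levels.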
Title: Influence of Data Curation and Confidence Levels on Compound Predictions Using Machine Learning Models. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-12-10 DOI: 10.1021/acs.jcim.4c00890
Zheng Zhao, Philip E Bourne
In designing covalent kinase inhibitors (CKIs), the inclusion of electrophiles as attacking warheads demands careful choreography, ensuring not only their presence on the scaffold moiety but also their precise interaction with nucleophiles in the binding sites. Given the limited number of known electrophiles, exploring adjacent chemical space to broaden the palette of available electrophiles capable of covalent inhibition is desirable. Here, we systematically analyze the characteristics of warheads and the corresponding adjacent fragments for use in CKI design. We first collect all the released cysteine-targeted CKIs from multiple databases and create one CKI data set containing 16,961 kinase-inhibitor data points from 12,381 unique CKIs covering 146 kinases with accessible cysteines in their binding pockets. Then, we analyze this data set, focusing on the extended warheads (i.e., warheads + adjacent fragments), including 30 common warheads and 1344 unique adjacent fragments. In so doing, we provide structural insights and delineate chemical properties and patterns in these extended warheads. Notably, we highlight the common patterns observed within reversible CKIs for the popular warheads cyanoacrylamide and aldehyde. This study provides medicinal chemists with novel insights into extended warheads and a comprehensive source of adjacent fragments, thus guiding the design, synthesis, and optimization of CKIs.
Title: Exploring Extended Warheads toward Developing Cysteine-Targeted Covalent Kinase Inhibitors. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-12-10 DOI: 10.1021/acs.jcim.4c01260
Jordy Schifferstein, Andrius Bernatavicius, Antonius P A Janssen
Kinase inhibitors are an important class of anticancer drugs, with 80 inhibitors clinically approved and >100 in active clinical testing. Most bind competitively in the ATP-binding site, leading to challenges with selectivity for a specific kinase, resulting in risks for toxicity and general off-target effects. Assessing the binding of an inhibitor for the entire kinome is experimentally possible but expensive. A reliable and interpretable computational prediction of kinase selectivity would greatly benefit the inhibitor discovery and optimization process. Here, we use machine learning on docked poses to address this need. To this end, we aggregated all known inhibitor-kinase affinities and generated the complete accompanying 3D interactome by docking all inhibitors to the respective high-quality X-ray structures. We then used this resource to train a neural network as a kinase-specific scoring function, which achieved an overall performance (R²) of 0.63-0.74 on unseen inhibitors across the kinome. The entire pipeline from molecule to 3D-based affinity prediction has been fully automated and wrapped in a freely available package. The package has a graphical user interface that is tightly integrated with PyMOL to allow immediate adoption in medicinal chemistry practice.
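Reporting performance "on unseen inhibitors" implies that no inhibitor in the test set also appears in the training set. The abstract does not describe how the split was made, so the following is only a generic sketch of a group-wise split plus the usual coefficient of determination; the record layout `(inhibitor_id, kinase, affinity)` and both function names are assumptions for illustration.

```python
import random


def split_by_inhibitor(records, test_fraction=0.2, seed=0):
    """Split records so that every inhibitor ends up in exactly one partition.

    Each record is a tuple (inhibitor_id, kinase, affinity). Splitting on the
    inhibitor rather than on individual rows keeps test inhibitors unseen.
    """
    inhibitors = sorted({r[0] for r in records})
    rng = random.Random(seed)
    rng.shuffle(inhibitors)
    n_test = max(1, round(len(inhibitors) * test_fraction))
    test_ids = set(inhibitors[:n_test])
    train = [r for r in records if r[0] not in test_ids]
    test = [r for r in records if r[0] in test_ids]
    return train, test


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot
```

With a row-wise random split, the same inhibitor could be measured against one kinase in training and another in testing, which would tend to inflate the reported score; the group-wise split avoids that.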
Title: Docking-Informed Machine Learning for Kinome-wide Affinity Prediction. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-12-10 DOI: 10.1021/acs.jcim.4c02211
Martina Piga, Zoltan Varga, Adam Feher, Ferenc Papp, Eva Korpos, Kavya C Bangera, Rok Frlan, Janez Ilaš, Jaka Dernovšek, Tihomir Tomašič, Nace Zidar
Title: Correction to "Identification of a Novel Structural Class of HV1 Inhibitors by Structure-Based Virtual Screening". Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-12-10 DOI: 10.1021/acs.jcim.4c00986
Richard E Overstreet, Dennis G Thomas, John R Cort
Noncanonical amino acids (ncAAs) provide numerous avenues for the introduction of novel functionality to peptides and proteins. ncAAs can be incorporated through solid-phase synthesis or genetic code expansion in conjunction with heterologous expression of the encoded protein modification. Due to the difficulty of synthesis or overexpression, wide chemical space, and lack of empirically resolved structures, modeling the effects of ncAA mutation is critical for rational protein design. To evaluate the structural and functional perturbations ncAAs introduce, we utilize molecular potentials that describe the forces in the protein structure. Most potentials such as CHARMM are designed to model canonical residues but can be parametrized to include novel ncAAs. In this work, we introduce NCAP, a software package to generate CHARMM-compatible parameters from quantum chemical calculations. Unlike currently available tools, NCAP is designed to recognize the ncAA structure and automatically bridge the gap between quantum chemical calculations and CHARMM potential parameters. For our software, we discuss the workflow, validation against canonical parameter sets, and comparison with published ncAA-protein structures.
Title: NCAP: Noncanonical Amino Acid Parameterization Software for CHARMM Potentials. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-12-10 DOI: 10.1021/acs.jcim.4c01190
Negin Forouzesh, Fatemeh Ghafouri, Igor S Tolokh, Alexey V Onufriev
Accuracy of binding free energy calculations utilizing implicit solvent models is critically affected by parameters of the underlying dielectric boundary, specifically, the atomic and water probe radii. Here, a multidimensional optimization pipeline is used to find optimal atomic radii, specifically for binding calculations in the implicit solvent. To reduce overfitting, the optimization target includes separate, weighted contributions from both binding and hydration free energies. The resulting five-parameter radii set, OPT_BIND5D, is evaluated against experiment for binding free energies of 20 host-guest (H-G) systems, unrelated to the types of structures used in the training. The resulting accuracy for this H-G test set (root mean square error of 2.03 kcal/mol, mean signed error of -0.13 kcal/mol, mean absolute error of 1.68 kcal/mol, and Pearson's correlation of r = 0.79 with the experimental values) is on par with what can be expected from the fixed charge explicit solvent models. Best agreement with the experiment is achieved when the implicit salt concentration is set equal or close to the experimental conditions.
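The four accuracy measures quoted for the host-guest test set follow standard definitions, which can be sketched as below. This is a generic illustration rather than the authors' evaluation script; the function name and the choice of error sign (predicted minus experimental) are assumptions.

```python
import math


def error_metrics(predicted, experimental):
    """Return RMSE, mean signed error, MAE, and Pearson r for paired values.

    Errors are taken as predicted minus experimental, so a negative mean
    signed error means the predictions are, on average, too negative.
    """
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    errors = [p - e for p, e in zip(predicted, experimental)]
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    mean_signed_error = sum(errors) / n
    mae = sum(abs(d) for d in errors) / n

    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("Pearson r is undefined for constant input")
    pearson_r = cov / math.sqrt(var_p * var_e)

    return {
        "rmse": rmse,
        "mean_signed_error": mean_signed_error,
        "mae": mae,
        "pearson_r": pearson_r,
    }
```

A mean signed error near zero alongside a larger RMSE, as reported in the abstract, indicates errors that largely cancel in sign rather than a systematic offset.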
Title: Optimal Dielectric Boundary for Binding Free Energy Estimates in the Implicit Solvent. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-12-09 Epub Date: 2024-11-22 DOI: 10.1021/acs.jcim.4c01291
Tibor Viktor Szalai, Dávid Bajusz, Rita Börzsei, Balázs Zoltán Zsidó, Janez Ilaš, György G Ferenczy, Csaba Hetényi, György M Keserű
Rational drug design focuses on the explanation and prediction of complex formation between therapeutic targets and small-molecule ligands. As a third and often overlooked interacting partner, water molecules play a critical role in the thermodynamics of protein-ligand binding, impacting both the entropy and enthalpy components of the binding free energy and by extension, on-target affinity and bioactivity. The community has realized the importance of binding site waters, as evidenced by the number of computational tools to predict the structure and thermodynamics of their networks. However, quantitative experimental characterization of relevant protein-ligand-water systems, and consequently the validation of these modeling methods, remains challenging. Here, we investigated the impact of solvent exchange from light (H₂O) to heavy water (D₂O) to provide complete thermodynamic profiling of these ternary systems. Utilizing the solvent isotope effects, we gain a deeper understanding of the energetic contributions of various components. Specifically, we conducted isothermal titration calorimetry experiments on trypsin with a series of p-substituted benzamidines, as well as carbonic anhydrase II (CAII) with a series of aromatic sulfonamides. Significant differences in binding enthalpies found between light vs heavy water indicate a substantial role of the binding site water network in protein-ligand binding. Next, we challenged two conceptually distinct modeling methods, the grid-based WaterFLAP and the molecular dynamics-based MobyWat, by predicting and scoring relevant water networks. The predicted water positions accurately reproduce those in available high-resolution X-ray and neutron diffraction structures of the relevant protein-ligand complexes. Estimated energetic contributions of the identified water networks were corroborated by the experimental thermodynamics data. Besides providing a direct validation for the predictive power of these methods, our findings confirmed the importance of considering binding site water networks in computational ligand design.
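For orientation, the enthalpy and entropy components mentioned above are tied to the binding free energy by the standard thermodynamic relations below. These are textbook identities rather than formulas quoted from the abstract; the symbol for the solvent isotope effect on the binding enthalpy and its sign convention (heavy minus light water) are assumptions made here for illustration, and $c^{\circ}$ denotes the 1 M standard concentration.

```latex
\begin{align}
\Delta G^{\circ} &= \Delta H^{\circ} - T\,\Delta S^{\circ}
                  = RT \ln\!\left(\frac{K_d}{c^{\circ}}\right), \\
\Delta\Delta H_{\mathrm{D_2O-H_2O}} &= \Delta H^{\circ}_{\mathrm{D_2O}} - \Delta H^{\circ}_{\mathrm{H_2O}}.
\end{align}
```

Isothermal titration calorimetry yields $K_d$ and $\Delta H^{\circ}$ directly, so $-T\,\Delta S^{\circ}$ follows from the first relation by difference.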
Title: Effect of Water Networks On Ligand Binding: Computational Predictions vs Experiments. Journal: Journal of Chemical Information and Modeling, pp 8980-8998. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11632780/pdf/