Pub Date: 2024-05-28 DOI: 10.1021/acs.jcim.4c00421
Arnab Jana, Sam Shepherd, Yair Litman and David M. Wilkins*
The polarization of periodically repeating systems is a discontinuous function of the atomic positions, a fact which seems at first to stymie attempts at their statistical learning. Two approaches to building models for bulk polarizations are compared: one in which a simple point charge model is used to preprocess the raw polarization to give a learning target that is a smooth function of atomic positions, with the total polarization learned as a sum of atom-centered dipoles, and one in which the average position of Wannier centers around atoms is predicted instead. For a range of bulk aqueous systems, both of these methods perform comparably well, with the former being slightly better but often requiring extra effort to find a suitable point charge model. As a challenging test, we also analyze the performance of the models at the air–water interface. In this case, while the Wannier center approach delivers accurate predictions without further modifications, the preprocessing method requires augmentation with information from isolated water molecules to reach similar accuracy. Finally, we present a simple protocol to preprocess the polarizations in a data-driven way using a small number of derivatives calculated at a much lower level of theory, thus overcoming the need to find point charge models without appreciably increasing the computational cost. We believe that the training strategies presented here help in the construction of accurate polarization models required for the study of the dielectric properties of realistic complex bulk systems and interfaces with ab initio accuracy.
Title: Learning Electronic Polarizations in Aqueous Systems. Journal: Journal of Chemical Information and Modeling.
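The discontinuity and the preprocessing idea described in the abstract above can be illustrated with a toy calculation. The following is a minimal one-dimensional sketch, not the authors' implementation: it assumes unit charges (so the dipole quantum equals the box length), a hypothetical two-ion system, and an invented smooth "electronic" term standing in for the part a model would learn. All function names are illustrative.

```python
import math


def wrap(x, period):
    """Fold x into the interval [-period/2, period/2)."""
    return x - period * math.floor(x / period + 0.5)


def point_charge_dipole(positions, charges, box):
    """1D dipole of point charges, with positions wrapped into [0, box)."""
    return sum(q * (x % box) for x, q in zip(positions, charges))


def toy_raw_dipole(positions, charges, box, quantum):
    """Stand-in for a reference calculation (assumption, not real data).

    The value is only defined modulo the quantum, so it is returned on the
    branch [-quantum/2, quantum/2) and jumps when it crosses the branch edge.
    """
    electronic = 0.1 * math.sin(2.0 * math.pi * (positions[1] - positions[0]) / box)
    return wrap(point_charge_dipole(positions, charges, box) + electronic, quantum)


def smooth_target(raw_dipole, positions, charges, box, quantum):
    """Remove the point-charge dipole and fold the remainder modulo the quantum."""
    return wrap(raw_dipole - point_charge_dipole(positions, charges, box), quantum)


if __name__ == "__main__":
    box = 10.0
    quantum = 1.0 * box  # unit charge times box length (toy units)
    charges = [-1.0, 1.0]
    # Stretch the ion pair slightly so the raw dipole crosses the branch edge.
    for separation in (4.95, 5.05):
        positions = [0.0, separation]
        raw = toy_raw_dipole(positions, charges, box, quantum)
        target = smooth_target(raw, positions, charges, box, quantum)
        print(f"separation={separation:.2f} raw={raw:+.4f} target={target:+.4f}")
```

In this toy the raw value changes by almost a full quantum for a small displacement, while the preprocessed target changes only slightly, which is what makes it a more suitable quantity to regress.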
Pub Date: 2024-05-27 DOI: 10.1021/acs.jcim.4c00420
Stefano Bosio, Mattia Bernetti*, Walter Rocchia and Matteo Masetti
It is nowadays clear that RNA molecules can play active roles in several biological processes. As a result, an increasing number of RNAs are gradually being identified as potentially druggable targets. In particular, noncoding RNAs can adopt highly organized conformations that are suitable for drug binding. However, RNAs are still considered challenging targets due to their complex structural dynamics and high charge density. Thus, elucidating relevant features of drug-RNA binding is fundamental for advancing drug discovery. Here, by using Molecular Dynamics simulations, we compare key features of ligand binding to proteins with those observed in RNA. Specifically, we explore similarities and differences in terms of (i) conformational flexibility of the target, (ii) electrostatic contribution to binding free energy, and (iii) water and ligand dynamics. As a test case, we examine binding of the same ligand, namely riboflavin, to protein and RNA targets, specifically the riboflavin (RF) kinase and flavin mononucleotide (FMN) riboswitch. The FMN riboswitch exhibited enhanced fluctuations and explored a wider conformational space, compared to the protein target, underscoring the importance of RNA flexibility in ligand binding. Conversely, a similar electrostatic contribution to the binding free energy of riboflavin was found. Finally, greater stability of water molecules was observed in the FMN riboswitch compared to the RF kinase, possibly due to the different shape and polarity of the pockets.
Title: Similarities and Differences in Ligand Binding to Protein and RNA Targets: The Case of Riboflavin. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-05-27 Epub Date: 2024-05-07 DOI: 10.1021/acs.jcim.4c00080
Rafael G Viegas, Ingrid B S Martins, Vitor B P Leite
A substantial portion of various organisms' proteomes comprises intrinsically disordered proteins (IDPs) that lack a defined three-dimensional structure. These IDPs exhibit a diverse array of conformations, displaying remarkable spatiotemporal heterogeneity and exceptional conformational flexibility. Characterizing the structure or structural ensemble of IDPs presents significant conceptual and methodological challenges owing to the absence of a well-defined native structure. While databases such as the Protein Ensemble Database (PED) provide IDP ensembles obtained through a combination of experimental data and molecular modeling, the absence of reaction coordinates poses challenges in comprehensively understanding pertinent aspects of the system. In this study, we leverage the energy landscape visualization method (JCTC, 6482, 2019) to scrutinize four IDP ensembles sourced from PED. ELViM, a methodology that circumvents the need for a priori reaction coordinates, aids in analyzing the ensembles. The specific IDP ensembles investigated are as follows: two fragments of nucleoporin (NUL: 884-993 and NUS: 1313-1390), yeast sic 1 N-terminal (1-90), and the N-terminal SH3 domain of Drk (1-59). Utilizing ELViM enables the comprehensive validation of ensembles, facilitating the detection of potential inconsistencies in the sampling process. Additionally, it allows for identifying and characterizing the most prevalent conformations within an ensemble. Moreover, ELViM facilitates the comparative analysis of ensembles obtained under diverse conditions, thereby providing a powerful tool for investigating the functional mechanisms of IDPs.
Title: Understanding the Energy Landscape of Intrinsically Disordered Protein Ensembles. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-05-27 Epub Date: 2024-05-06 DOI: 10.1021/acs.jcim.4c00478
Juan S Grassano, Ignacio Pickering, Adrian E Roitberg, Mariano C González Lebrero, Dario A Estrin, Jonathan A Semelak
Machine learning (ML) methods have reached high accuracy levels for the prediction of in vacuo molecular properties. However, the simulation of large systems solely through ML methods (such as those based on neural network potentials) is still a challenge. In this context, one of the most promising frameworks for integrating ML schemes in the simulation of complex molecular systems is the family of so-called ML/MM methods. These multiscale approaches combine ML methods with classical force fields (MM), in the same spirit as the successful hybrid quantum mechanics-molecular mechanics methods (QM/MM). The key issue for such ML/MM methods is an adequate description of the coupling between the region of the system described by ML and the region described at the MM level. In the context of QM/MM schemes, the main ingredient of the interaction is electrostatic, and the state of the art is the so-called electrostatic embedding. In this study, we analyze the quality of simpler mechanical embedding-based approaches, specifically focusing on their application within an ML/MM framework utilizing atomic partial charges derived in vacuo. Taking as reference electrostatic embedding calculations performed at a QM(DFT)/MM level, we explore different atomic charge schemes, as well as a polarization correction computed using atomic polarizabilities. Our benchmark data set comprises a set of about 80k small organic structures from the ANI-1x and ANI-2x databases, solvated in water. The results suggest that the minimal basis iterative stockholder (MBIS) atomic charges yield the best agreement with the reference coupling energy. Remarkable enhancements are achieved by including a simple polarization correction.
Title: Assessment of Embedding Schemes in a Hybrid Machine Learning/Classical Potentials (ML/MM) Approach. Journal: Journal of Chemical Information and Modeling.
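To make the mechanical-embedding coupling described in the abstract above more concrete, here is a minimal sketch under stated assumptions. It uses a plain Coulomb sum between fixed ML-region charges and MM charges, plus a textbook first-order induced-dipole term, E_pol = -1/2 Σ_i α_i |E_i|², where E_i is the field of the MM charges at ML atom i. This functional form is an assumption chosen for illustration and may differ from the correction used in the paper; atomic units are assumed, and all names are hypothetical.

```python
import math


def coulomb_coupling(ml_pos, ml_q, mm_pos, mm_q):
    """Coulomb energy between fixed ML-region charges and MM point charges."""
    energy = 0.0
    for ri, qi in zip(ml_pos, ml_q):
        for rj, qj in zip(mm_pos, mm_q):
            energy += qi * qj / math.dist(ri, rj)
    return energy


def field_at(point, mm_pos, mm_q):
    """Electric field produced at `point` by the MM point charges."""
    field = [0.0, 0.0, 0.0]
    for rj, qj in zip(mm_pos, mm_q):
        r3 = math.dist(point, rj) ** 3
        for k in range(3):
            field[k] += qj * (point[k] - rj[k]) / r3
    return field


def polarization_correction(ml_pos, ml_alpha, mm_pos, mm_q):
    """First-order induced-dipole energy, -1/2 * alpha * |E|^2 per ML atom."""
    energy = 0.0
    for ri, alpha in zip(ml_pos, ml_alpha):
        e = field_at(ri, mm_pos, mm_q)
        energy -= 0.5 * alpha * sum(c * c for c in e)
    return energy


if __name__ == "__main__":
    # Toy system: one ML atom at the origin, one MM charge 2 bohr away.
    ml_pos, ml_q, ml_alpha = [(0.0, 0.0, 0.0)], [0.5], [2.0]
    mm_pos, mm_q = [(2.0, 0.0, 0.0)], [-1.0]
    e_coul = coulomb_coupling(ml_pos, ml_q, mm_pos, mm_q)
    e_pol = polarization_correction(ml_pos, ml_alpha, mm_pos, mm_q)
    print("Coulomb coupling:", e_coul)
    print("Polarization correction:", e_pol)
```

The polarization term is always non-positive, so in this sketch it can only lower the coupling energy relative to the fixed-charge estimate.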
Pub Date: 2024-05-27 DOI: 10.1021/acs.jcim.4c00727
Kenneth M. Merz*, Yee Siew Choong*, Zoe Cournia*, Olexandr Isayev*, Thereza A. Soares*, Guo-Wei Wei* and Feng Zhu*
Title: Editorial: Machine Learning in Materials Science. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-05-23 DOI: 10.1021/acs.jcim.4c00354
Siyuan Liu, Qi Yang*, Long Zhang and Sanzhong Luo*
Protein pKa is a fundamental physicochemical parameter that dictates protein structure and function. However, accurately determining protein site-pKa values remains a substantial challenge, both experimentally and theoretically. In this study, we introduce a physical organic approach, leveraging a protein structural and physical-organic-parameter-based representation (P-SPOC), to develop a rapid and intuitive model for protein pKa prediction. Our P-SPOC model achieves state-of-the-art predictive accuracy, with a mean absolute error (MAE) of 0.33 pKa units. Furthermore, we have incorporated advanced protein structure prediction models, like AlphaFold2, to approximate structures for proteins lacking three-dimensional representations, which enhances the applicability of our model in the context of structure-undetermined protein research. To promote broader accessibility within the research community, an online prediction interface was also established at isyn.luoszgroup.com.
Title: Accurate Protein pKa Prediction with Physical Organic Chemistry Guided 3D Protein Representation. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-05-22 DOI: 10.1021/acs.jcim.3c02040
Nikolai V. Krivoshchapov* and Michael G. Medvedev*
Identification of all of the influential conformers of biomolecules is a crucial step in many tasks of computational biochemistry. Specifically, molecular docking, a key component of in silico drug development, requires a comprehensive set of conformations for potential candidates in order to generate the optimal ligand–receptor poses and, ultimately, find the best drug candidates. However, the presence of flexible cycles in a molecule complicates the initial search for conformers since exhaustive sampling algorithms via torsional random and systematic searches become very inefficient. The devised inverse-kinematics-based Monte Carlo with refinement (MCR) algorithm identifies independently rotatable dihedral angles in (poly)cyclic molecules and uses them to perform global conformational sampling, outperforming popular alternatives (MacroModel, CREST, and RDKit) in terms of speed and diversity of the resulting conformer ensembles. Moreover, MCR quickly and accurately recovers naturally occurring macrocycle conformations for most of the considered molecules.
Title: Accurate and Efficient Conformer Sampling of Cyclic Drug-Like Molecules with Inverse Kinematics. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-05-22 DOI: 10.1021/acs.jcim.3c01773
Brunno A. Salvatti, Marcelo A. Chagas, Phillipe O. Fernandes, Yan F. X. Ladeira, Aline S. Bozzi, Veronica S. Valadares, Ana Paula Valente, Amanda S. de Miranda, Willian R. Rocha, Vinicius G. Maltarollo and Adolfo H. Moraes*
The (S)-norcoclaurine synthase from Thalictrum flavum (TfNCS) stereoselectively catalyzes the Pictet–Spengler reaction between dopamine and 4-hydroxyphenylacetaldehyde to give (S)-norcoclaurine. TfNCS can catalyze the Pictet–Spengler reaction with various aldehydes and ketones, leading to diverse tetrahydroisoquinolines. This substrate promiscuity positions TfNCS as a highly promising enzyme for synthesizing fine chemicals. Understanding the structural and electronic signatures of carbonyl-containing substrates that influence TfNCS activity can help expand its applications in the synthesis of different compounds and aid in protein optimization strategies. In this study, we investigated the influence of the molecular properties of aldehydes and ketones on their reactivity in the TfNCS-catalyzed Pictet–Spengler reaction. Initially, we compiled a library of reactive and unreactive compounds from previous publications. We also performed enzymatic assays using nuclear magnetic resonance to identify some reactive and unreactive carbonyl compounds, which were then included in the library. Subsequently, we employed QSAR and DFT calculations to establish correlations between substrate-candidate structures and reactivity. Our findings highlight correlations between structural and stereoelectronic features, including the electrophilicity of the carbonyl group, and the reactivity of aldehydes and ketones in the TfNCS-catalyzed Pictet–Spengler reaction. Interestingly, the experimental data for seven of the fifty-three compounds did not correlate with the electrophilicity of the carbonyl group. For these seven compounds, we identified unfavorable interactions between them and TfNCS. Our results demonstrate the applications of in silico techniques in understanding enzyme promiscuity and specificity, with a particular emphasis on machine learning methodologies, DFT electronic structure calculations, and molecular dynamics (MD) simulations.
Title: Understanding the Enzyme (S)-Norcoclaurine Synthase Promiscuity to Aldehydes and Ketones. Journal: Journal of Chemical Information and Modeling.
Pub Date: 2024-05-21 DOI: 10.1021/acs.jcim.4c00298
Christopher Vorreiter, Dina Robaa and Wolfgang Sippl*
Cosolvent molecular dynamics (MD) simulations have proven to be powerful in silico tools to predict hotspots for binding regions on protein surfaces. In the current study, the method was adapted and applied to two Tudor domain-containing proteins, namely Spindlin1 (SPIN1) and survival motor neuron protein (SMN). Tudor domains are characterized by so-called aromatic cages that recognize methylated lysine residues of protein targets. In the study, the conformational transitions from closed to open aromatic cage conformations were investigated by performing MD simulations with cosolvents using six different probe molecules. It is shown that a trajectory clustering approach in combination with volume and atomic distance tracking allows a reasonable discrimination between open and closed aromatic cage conformations and the docking of inhibitors yields very good reproducibility with crystal structures. Cosolvent MDs are suitable to capture the flexibility of aromatic cages and thus represent a promising tool for the optimization of inhibitors.
Title: Exploring Aromatic Cage Flexibility Using Cosolvent Molecular Dynamics Simulations─An In-Silico Case Study of Tudor Domains. Journal: Journal of Chemical Information and Modeling.
Pub Date : 2024-05-20 DOI: 10.1021/acs.jcim.4c00520
Christian Kersten*, Philippe Archambault and Luca P. Köhler
With increasing interest in RNA as a therapeutic and a potential target, the role of RNA structures has become more important. Even slight changes in nucleobases, such as modifications or protomeric and tautomeric states, can have a large impact on RNA structure and function, while local environments in turn affect protonation and tautomerization. In this work, the application of empirical tools for pKa and tautomer prediction to RNA modifications was examined, compared with ab initio quantum mechanics (QM) methods, and expanded toward macromolecular RNA structures, where QM is no longer feasible. In this regard, the Protonate3D functionality within the Molecular Operating Environment (MOE) was expanded for nucleobase protomer and tautomer predictions and applied to reported examples of altered protonation states depending on the local environment. Overall, observations of nonstandard protomers and tautomers were well reproduced, including structural C⁺G:C(A) and A⁺GG motifs, several mismatches, and protonation of adenosine or cytidine as the general acid in nucleolytic ribozymes. Special cases, such as cobalt hexamine-soaked complexes or the deprotonation of guanosine as the general base in nucleolytic ribozymes, proved to be challenging. The collected set of examples shall serve as a starting point for the development of further RNA protonation prediction tools, while the presented Protonate3D implementation already delivers reasonable protonation predictions for RNA and DNA macromolecules. For cases where higher accuracy is needed, such as following the catalytic pathways of ribozymes, QM-based methods can build upon the Protonate3D-generated starting structures. Likewise, this protonation prediction can be used for structure-based RNA-ligand design approaches.
{"title":"Assessment of Nucleobase Protomeric and Tautomeric States in Nucleic Acid Structures for Interaction Analysis and Structure-Based Ligand Design","authors":"Christian Kersten*, Philippe Archambault and Luca P. Köhler","doi":"10.1021/acs.jcim.4c00520","journal":{"name":"Journal of Chemical Information and Modeling"},"publicationDate":"2024-05-20"}