
Journal of Chemical Theory and Computation: Latest Articles

Convergent Protocols for Computing Protein-Ligand Interaction Energies Using Fragment-Based Quantum Chemistry.
IF 5.7 | Zone 1 (Chemistry) | Q2 CHEMISTRY, PHYSICAL | Pub Date: 2025-01-02 | DOI: 10.1021/acs.jctc.4c01429
Paige E Bowling, Dustin R Broderick, John M Herbert

Fragment-based quantum chemistry methods offer a means to sidestep the steep nonlinear scaling of electronic structure calculations so that large molecular systems can be investigated using high-level methods. Here, we use fragmentation to compute protein-ligand interaction energies in systems with several thousand atoms, using a new software platform for managing fragment-based calculations that implements a screened many-body expansion. Convergence tests using a minimal-basis semiempirical method (HF-3c) indicate that two-body calculations, with single-residue fragments and simple hydrogen caps, are sufficient to reproduce interaction energies obtained using conventional supramolecular electronic structure calculations, to within 1 kcal/mol at about 1% of the computational cost. We also demonstrate that the HF-3c results are illustrative of trends obtained with density functional theory in basis sets up to augmented quadruple-ζ quality. Strategic deployment of fragmentation facilitates the use of converged biomolecular model systems alongside high-quality electronic structure methods and basis sets, bringing ab initio quantum chemistry to systems of hitherto unimaginable size. This will be useful for generation of high-quality training data for machine learning applications.
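The two-body truncation described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_energy` is a pairwise-additive stand-in for an electronic structure call, the distance `cutoff` is a simple placeholder for the screening used by the authors' platform, and the function names are hypothetical. With fixed geometries, the two-body estimate of the interaction energy reduces to a sum of ligand-fragment dimer corrections, E_int ≈ Σ_I [E(L+I) − E(L) − E(I)].

```python
import math
from itertools import combinations

Atom = tuple[float, float, float]


def toy_energy(atoms: list[Atom]) -> float:
    """Pairwise-additive toy energy; a placeholder for a real QM calculation."""
    total = 0.0
    for a, b in combinations(atoms, 2):
        r = math.dist(a, b)
        total += 1.0 / r**12 - 1.0 / r**6
    return total


def min_distance(frag_a: list[Atom], frag_b: list[Atom]) -> float:
    """Shortest atom-atom distance between two fragments."""
    return min(math.dist(a, b) for a in frag_a for b in frag_b)


def two_body_interaction(ligand, residues, cutoff=None) -> float:
    """Sum of ligand-fragment dimer corrections, optionally distance-screened."""
    e_ligand = toy_energy(ligand)
    total = 0.0
    for frag in residues:
        # Screening (assumed form): skip fragments farther than `cutoff`.
        if cutoff is not None and min_distance(ligand, frag) > cutoff:
            continue
        total += toy_energy(ligand + frag) - e_ligand - toy_energy(frag)
    return total


def supramolecular_interaction(ligand, residues) -> float:
    """Reference: E(complex) - E(ligand) - E(protein), no fragmentation."""
    protein = [atom for frag in residues for atom in frag]
    return toy_energy(ligand + protein) - toy_energy(ligand) - toy_energy(protein)


if __name__ == "__main__":
    ligand = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)]
    residues = [
        [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)],
        [(0.0, -3.0, 0.5)],
        [(20.0, 0.0, 0.0), (21.0, 0.0, 0.0)],
    ]
    print("supramolecular:", supramolecular_interaction(ligand, residues))
    print("two-body      :", two_body_interaction(ligand, residues))
    print("screened      :", two_body_interaction(ligand, residues, cutoff=10.0))
```

Because the toy energy is strictly pairwise additive, the unscreened two-body sum matches the supramolecular value to rounding error. A real electronic structure method has many-body contributions, which is why the convergence tests reported in the abstract are needed.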

Citations: 0
Imputation of Missing Data in Materials Science through Nearest Neighbors and Iterative Predictions.
IF 5.5 | Zone 1 (Chemistry) | Q2 CHEMISTRY, PHYSICAL | Pub Date: 2024-12-26 | DOI: 10.1021/acs.jctc.4c01237
Chunhui Xie, Rui Li, Yunqi Li, Haibo Xie, Qibin Liu
Missing data in tabular data sets are ubiquitous in statistical analysis, big data analysis, and machine learning studies. Many strategies have been proposed to impute missing data, but their reliability has not been stringently assessed in materials science. Here, we carried out a benchmark test of six imputation strategies (Mean, MissForest, HyperImpute, Gain, Sinkhorn, and a newly proposed MatImpute) on seven representative data sets in materials science. The imputation-induced errors (IIEs) were evaluated through the difference between imputed and original values using root mean square error (RMSE), Wasserstein distance (WD), and a newly introduced metric, data set correlation convergence (DCC), which measure the difference in three respects: individual data, column-wise distribution, and correlation stability of a data set. MatImpute outperformed the others, with the lowest RMSE and WD and the highest DCC. The IIE increases with the data missing ratio and in the order missing at random < missing completely at random ≤ missing not at random, considering inherent correlations among missing data. A similar trend was observed for the increase of IIE with the central departure distance in units of the standard deviation, which is consistent with the increasing difficulty from interpolation to extrapolation. In further tests of IIE in regression and classification machine learning predictive models, MatImpute also preserved the highest data recovery fidelity. We released the code of MatImpute to facilitate the construction of high-quality data sets in materials science.
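The benchmark workflow in this abstract (hide known values, impute them, score the difference) can be sketched as follows. This is not MatImpute: it is a plain k-nearest-neighbors imputer written for illustration, and the function names `mask_mcar`, `knn_impute`, and `rmse` are hypothetical. Missing cells are represented by `None`, and values are hidden completely at random.

```python
import math
import random


def mask_mcar(rows, ratio, seed=0):
    """Hide a fraction `ratio` of cells completely at random (MCAR)."""
    rng = random.Random(seed)
    masked = [list(r) for r in rows]
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    hidden = rng.sample(cells, int(ratio * len(cells)))
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def _distance(a, b):
    """RMS difference over the columns observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows, k=3):
    """Fill each missing cell with the column mean over the k nearest donor rows."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        if not observed:
            raise ValueError(f"column {j} has no observed values")
        col_means.append(sum(observed) / len(observed))
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if math.isfinite(d)]
            # Fall back to the column mean when no comparable donor exists.
            filled[i][j] = sum(nearest) / len(nearest) if nearest else col_means[j]
    return filled


def rmse(truth, imputed, cells):
    """Root mean square error over the hidden cells only."""
    errs = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in cells]
    return math.sqrt(sum(errs) / len(errs))


if __name__ == "__main__":
    data = [[float(x), 2.0 * x, 0.5 * x + 1.0] for x in range(1, 21)]
    masked, hidden = mask_mcar(data, ratio=0.1, seed=42)
    print("RMSE on hidden cells:", rmse(data, knn_impute(masked, k=3), hidden))
```

Scoring only the hidden cells keeps the error measure focused on what was actually imputed. The distribution-level and correlation-level metrics named in the abstract (WD and DCC) would need additional code not shown here.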
Citations: 0
True Dynamics of Pillararene Host-Guest Binding.
IF 5.5 | Zone 1 (Chemistry) | Q2 CHEMISTRY, PHYSICAL | Pub Date: 2024-12-25 | DOI: 10.1021/acs.jctc.4c01361
Xiaohui Wang, Zuo-Yuan Zhang, Xiao He, Zhirong Liu, Zhaoxi Sun
Accurate modeling of host-guest systems is challenging in modern computational chemistry. It requires intermolecular interaction patterns to be correctly described and, more importantly, the dynamic behaviors of macrocyclic hosts to be accurately modeled. Pillar[n]arenes, a crucial family of macrocycles, play a critical role in host-guest chemistry and biomedical applications. The carboxylated form with 6 or 7 repeating units is highly popular owing to its increased solubility and the compatibility between its cavity size and drugs. While prefitted transferable force fields are predominantly applied in host-guest modeling, their reliability and accuracy for macrocyclic hosts have not been validated. In the current work, based on solid numerical evidence about energetics and dynamics, we show that all transferable force fields fail to provide a correct description of host dynamics for the most popular carboxylated pillararenes. Therefore, all existing simulation reports on this host family could be biased due to the unsuitability of the force-field description. Such severe modeling problems do not occur in other host families that are relatively rigid (e.g., octa acids and cucurbiturils), highlighting the difficulties in modeling pillararene host-guest interactions. To pursue the true picture of the pillararene dynamics and host-guest binding, we fit high-quality molecule-specific parameters for the carboxylated pillararene based on ab initio calculations and perform an exhaustive conformational search of host-guest binding modes with advanced sampling techniques. We provide estimates of binding thermodynamics, report the true dynamic behavior of the WP6 host in the bound and unbound states, and reveal a general multimodal binding behavior of pillararene host-guest complexes. The current work serves as a critical step toward a reliable all-atom description of pillararene host-guest coordination.
Citations: 0
Reducing Numerical Precision Requirements in Quantum Chemistry Calculations.
IF 5.7 | Zone 1 (Chemistry) | Q2 CHEMISTRY, PHYSICAL | Pub Date: 2024-12-24 | Epub Date: 2024-12-07 | Pages: 10826-10837 | DOI: 10.1021/acs.jctc.4c00938
William Dawson, Katsuhisa Ozaki, Jens Domke, Takahito Nakajima

The abundant demand for deep learning compute resources has created a renaissance in low-precision hardware. Going forward, it will be essential for simulation software to run on this new generation of machines without sacrificing scientific fidelity. In this paper, we examine the precision requirements of a representative kernel from quantum chemistry calculations: the calculation of the single-particle density matrix from a given mean-field Hamiltonian (i.e., Hartree-Fock or density functional theory) represented in an LCAO basis. We find that double precision affords an unnecessarily high level of precision, leading to optimization opportunities. We show how an approximation built from an error-free matrix multiplication transformation can be used to potentially accelerate this kernel on future hardware. Our results provide a roadmap for adapting quantum chemistry software for the next generation of high-performance computing platforms.
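To make the precision question concrete, the sketch below builds a density matrix from a small model Hamiltonian twice, once in double precision and once in emulated single precision, and compares the results. It is not the error-free transformation approach of the paper. The choices are assumptions for illustration: a 6-site nearest-neighbor chain in an orthonormal basis, McWeeny purification (P ← 3P² − 2P³) as the solver, and float32 rounding emulated with `struct`.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def matmul(a, b, rnd):
    """Dense matrix product with every product and partial sum rounded by `rnd`."""
    n = len(a)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc = rnd(acc + rnd(a[i][k] * b[k][j]))
            out[i][j] = acc
    return out


def chain_hamiltonian(n: int):
    """Nearest-neighbor chain with hopping -1; its spectrum lies inside (-2, 2)."""
    h = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        h[i][i + 1] = h[i + 1][i] = -1.0
    return h


def density_matrix(h, mu, e_min, e_max, rnd, iterations=20):
    """McWeeny purification at fixed chemical potential `mu`.

    `e_min` and `e_max` are bounds on the spectrum of `h`; `mu` must lie
    in the gap between occupied and empty states.
    """
    n = len(h)
    alpha = 0.5 / max(e_max - mu, mu - e_min)
    # Initial guess: eigenvalues mapped into [0, 1], with mu mapped to 1/2.
    p = [
        [rnd((0.5 if i == j else 0.0) - alpha * (h[i][j] - (mu if i == j else 0.0)))
         for j in range(n)]
        for i in range(n)
    ]
    for _ in range(iterations):
        p2 = matmul(p, p, rnd)
        p3 = matmul(p2, p, rnd)
        p = [[rnd(3.0 * p2[i][j] - 2.0 * p3[i][j]) for j in range(n)] for i in range(n)]
    return p


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    h = chain_hamiltonian(6)
    p64 = density_matrix(h, 0.0, -2.0, 2.0, lambda x: x)
    p32 = density_matrix(h, 0.0, -2.0, 2.0, to_f32)
    print("trace (double):", sum(p64[i][i] for i in range(6)))
    print("max |P32 - P64|:", max_abs_diff(p32, p64))
```

For this half-filled chain the trace of the converged density matrix equals the number of occupied states (3). The printed difference gives a rough feel for how much accuracy is lost when the whole kernel runs in single precision.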

Citations: 0
From Implicit to Explicit: An Interaction-Reorganization Approach to Molecular Solvation Energy.
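IF 5.7 | Zone 1 (Chemistry) | Q2 CHEMISTRY, PHYSICAL | Pub Date: 2024-12-24 | Epub Date: 2024-12-13 | Pages: 10961-10971 | DOI: 10.1021/acs.jctc.4c01283 | Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11674157/pdf/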
Kaifang Huang, Lili Duan, John Z H Zhang

Accurate calculation of solvation energies has long fascinated researchers, but complex interactions within bulk water molecules pose significant challenges. Currently, molecular solvation energy calculations are mostly based on implicit solvent approximations in which the solvent molecules are treated as continuum dielectric media. However, the implicit solvent approach is not ideal because it lacks certain real solvation effects, such as those of the first solvation shell. Here, we propose an explicit solvent approach, the interaction-reorganization solvation (IRS) method, for molecular solvation energy calculations. The IRS approach achieves predictive accuracy comparable to that of the widely recognized solvation model based on density (SMD) method and is significantly more accurate than the Poisson-Boltzmann/generalized Born surface area (PB/GBSA) methods. This is demonstrated in both the correlation coefficient and the mean absolute error (MAE) with respect to the experimental data. The IRS method is based on molecular dynamics simulation in explicit solvent and does not need to solve Poisson-Boltzmann or Schrödinger equations. On the other hand, the accuracy of the IRS method does depend on the accuracy of the molecular force field used in MD simulations. We expect that the IRS method will be very useful for the solvation energy calculations of molecules.

Citations: 0
Bootstrap Embedding for Molecules in Extended Basis Sets.
IF 5.7 | Zone 1 (Chemistry) | Q2 CHEMISTRY, PHYSICAL | Pub Date: 2024-12-24 | Epub Date: 2024-12-09 | Pages: 10912-10921 | DOI: 10.1021/acs.jctc.4c01267
Henry K Tran, Leah P Weisburn, Minsik Cho, Shaun Weatherly, Hong-Zhou Ye, Troy Van Voorhis

Quantum embedding methods are powerful tools to exploit the locality of electron correlation, but thus far many wave function-in-wave function methods have focused on small (e.g., minimal) basis sets. One major challenge for extended basis sets lies in defining consistent atom- or fragment-localized orbitals in spite of the larger spatial extent of the underlying atomic orbitals. In this work, we adapt a particular form of quantum embedding, bootstrap embedding (BE), to extended basis sets. We find that using intrinsic atomic orbital (IAO) localization schemes alongside BE recovers ∼99.7% of the CCSD correlation energy in the 3-21G, 6-311G, and cc-pVDZ basis sets for reasonably sized fragments. These results mark an important first step in extending the success of embedding methods to the proper study of dynamic correlation.

Citations: 0
GPU-Accelerated Solution of the Bethe-Salpeter Equation for Large and Heterogeneous Systems.
IF 5.7 | Zone 1 (Chemistry) | Q2 CHEMISTRY, PHYSICAL | Pub Date: 2024-12-24 | Epub Date: 2024-12-11 | Pages: 10899-10911 | DOI: 10.1021/acs.jctc.4c01253
Victor Wen-Zhe Yu, Yu Jin, Giulia Galli, Marco Govoni

We present a massively parallel GPU-accelerated implementation of the Bethe-Salpeter equation (BSE) for the calculation of the vertical excitation energies (VEEs) and optical absorption spectra of condensed and molecular systems, starting from single-particle eigenvalues and eigenvectors obtained with density functional theory. The algorithms adopted here circumvent the slowly converging sums over empty and occupied states and the inversion of large dielectric matrices through a density matrix perturbation theory approach and a low-rank decomposition of the screened Coulomb interaction, respectively. Further computational savings are achieved by exploiting the nearsightedness of the density matrix of semiconductors and insulators to reduce the number of screened Coulomb integrals. We scale our calculations to thousands of GPUs with a hierarchical loop and data distribution strategy. The efficacy of our method is demonstrated by computing the VEEs of several spin defects in wide-band-gap materials, showing that supercells with up to 1000 atoms are necessary to obtain converged results. We discuss the validity of the common approximation that solves the BSE with truncated sums over empty and occupied states. We then apply our GW-BSE implementation to a diamond lattice with 1727 atoms to study the symmetry breaking of triplet states caused by the interaction of a point defect with an extended line defect.

Citations: 0
Descriptor-Free Collective Variables from Geometric Graph Neural Networks.
IF 5.7 | Zone 1 (Chemistry) | Q2 CHEMISTRY, PHYSICAL | Pub Date: 2024-12-24 | Epub Date: 2024-12-12 | Pages: 10787-10797 | DOI: 10.1021/acs.jctc.4c01197
Jintu Zhang, Luigi Bonati, Enrico Trizio, Odin Zhang, Yu Kang, TingJun Hou, Michele Parrinello

Enhanced sampling simulations make the computational study of rare events feasible. A large family of such methods crucially depends on the definition of some collective variables (CVs) that could provide a low-dimensional representation of the relevant physics of the process. Recently, many methods have been proposed to semiautomatize the CV design by using machine learning tools to learn the variables directly from the simulation data. However, most methods are based on feedforward neural networks and require some user-defined physical descriptors. Here, we propose bypassing this step using a graph neural network to directly use the atomic coordinates as input for the CV model. This way, we achieve a fully automatic approach to CV determination that provides variables invariant under the relevant symmetries, especially the permutational one. Furthermore, we provide different analysis tools to favor the physical interpretation of the final CV. We prove the robustness of our approach using different methods from the literature for the optimization of the CV, and we prove its efficacy on several systems, including a small peptide, an ion dissociation in explicit solvent, and a simple chemical reaction.
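The symmetry requirement mentioned in this abstract can be illustrated with a toy example. The function below is not the authors' model: it is a single hand-written message-passing step with fixed, hypothetical weights and a single atomic species, assumed only to show why building a CV from interatomic distances with sum aggregation yields a value that is invariant under atom permutation, rotation, and translation.

```python
import math


def toy_cv(coords, cutoff=3.0, w_self=0.3, w_msg=0.7):
    """One message-passing step on a distance graph, followed by a sum readout.

    Messages depend only on interatomic distances (rotation and translation
    invariant) and are summed over neighbors; node outputs are summed over
    atoms (permutation invariant). All weights are illustrative constants.
    """
    node_outputs = []
    for i, ri in enumerate(coords):
        message = 0.0
        for j, rj in enumerate(coords):
            if i == j:
                continue
            r = math.dist(ri, rj)
            if r < cutoff:
                # Smooth cosine cutoff so neighbors fade out continuously.
                message += 0.5 * (math.cos(math.pi * r / cutoff) + 1.0)
        node_outputs.append(math.tanh(w_self + w_msg * message))
    return sum(node_outputs)


def rotate_z_and_shift(coords, angle, shift):
    """Rigidly rotate about the z axis and translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]


if __name__ == "__main__":
    coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.5, 0.0), (2.0, 1.0, 1.0)]
    permuted = [coords[2], coords[0], coords[3], coords[1]]
    moved = rotate_z_and_shift(coords, 0.7, (1.0, -2.0, 0.5))
    print(toy_cv(coords), toy_cv(permuted), toy_cv(moved))
```

A trained geometric graph neural network replaces the fixed weights with learned ones and distinguishes chemical species, but the invariances follow from the same construction.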

Citations: 0
Faster Sampling in Molecular Dynamics Simulations with TIP3P-F Water.
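IF 5.7 | Zone 1 (Chemistry) | Q2 CHEMISTRY, PHYSICAL | Pub Date: 2024-12-24 | Epub Date: 2024-12-12 | Pages: 11068-11081 | DOI: 10.1021/acs.jctc.4c00990 | Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11672673/pdf/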
José Guadalupe Rosas Jiménez, Balázs Fábián, Gerhard Hummer

The need for short time steps currently limits routine atomistic molecular dynamics (MD) simulations to the microsecond time scale. For long time steps, the numerical integration of the equations of motion becomes unstable, resulting in catastrophic crashes. Here, we combine mass repartitioning and rescaling to construct a water model that increases the sampling efficiency in biomolecular simulations without compromising integration stability and with preserved structural and thermodynamic properties. The resulting "fast water" is then used with a time step as before in combination with standard force fields. The reduced water viscosity and faster diffusion result in proportionally faster sampling of the larger-scale motions in the conformation space of both solute and solvent. We illustrate this approach by developing TIP3P-F based on the popular TIP3P model of water. A roughly 2-fold boost in the sampling efficiency at minimal cost in accuracy is substantial and helps lower the energy impact of large-scale MD simulations. The approach is general and can readily be applied to other water models and different types of solvents.
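The two ingredients named in this abstract can be sketched numerically. The abstract does not give the TIP3P-F parameters, so the shift and scale factor below are hypothetical, and the reading of "rescaling" as a uniform scaling of the water masses is an assumption. The sketch relies on a standard result of classical mechanics: if every mass in a system is multiplied by a factor s at fixed potential energy, the same trajectories are traversed with time stretched by √s, so dynamics is faster by 1/√s.

```python
import math

# Standard atomic masses in unified atomic mass units.
M_OXYGEN = 15.999
M_HYDROGEN = 1.008


def repartition_water(m_o, m_h, shift):
    """Move `shift` mass units from oxygen onto each hydrogen.

    The total molecular mass is conserved; heavier hydrogens slow the
    fastest motions, which is what permits stable integration.
    """
    if shift < 0 or 2.0 * shift >= m_o:
        raise ValueError("shift must be non-negative and leave oxygen with positive mass")
    return m_o - 2.0 * shift, m_h + shift


def rescale_masses(masses, factor):
    """Multiply every mass by the same factor."""
    return tuple(m * factor for m in masses)


def speedup_from_mass_scaling(factor):
    """Kinetic speedup when all masses are multiplied by `factor` (1/sqrt(factor))."""
    return 1.0 / math.sqrt(factor)


if __name__ == "__main__":
    # Hypothetical values for illustration only, not the published model.
    m_o, m_h = repartition_water(M_OXYGEN, M_HYDROGEN, shift=2.0)
    light = rescale_masses((m_o, m_h, m_h), 0.25)
    print("repartitioned (O, H):", (m_o, m_h))
    print("rescaled (O, H, H):  ", light)
    print("speedup factor:      ", speedup_from_mass_scaling(0.25))
```

In a mixed system where only the solvent masses change, the speedup of solute motions is not given by this simple formula, which is presumably why the abstract reports an empirically determined, roughly 2-fold gain.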

Pages: 11068-11081. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11672673/pdf/
Citations: 0
Convergent Concordant Mode Approach for Molecular Vibrations: CMA-2.
IF 5.7 Region 1, Chemistry Q2 CHEMISTRY, PHYSICAL Pub Date: 2024-12-24 Epub Date: 2024-12-13 DOI: 10.1021/acs.jctc.4c01240
Nathaniel L Kitzmiller, Mitchell E Lahm, Laura N Olive Dornshuld, Jincan Jin, Wesley D Allen, Henry F Schaefer III

The concordant mode approach (CMA) is a promising new scheme for dramatically increasing the system size and level of theory achievable in quantum chemical computations of molecular vibrational frequencies. Here, we achieve advances in the CMA hierarchy by computations targeting CCSD(T)/cc-pVTZ (coupled cluster singles and doubles with perturbative triples using a correlation-consistent polarized-valence triple-ζ basis set) benchmarks within the G2 molecular test set, executing a statistical analysis for 1501 frequencies from 111 compounds and then separately solving the refractory case of pyridine. First, MP2/cc-pVTZ (second-order Møller-Plesset perturbation theory with the same basis set) proves to be an excellent and preferred choice for generating the underlying (Level B) normal modes of the CMA scheme. Utilizing this Level B within the CMA-0A method reproduces the 1501 benchmark frequencies with a mean absolute error (MAE) of only 0.11 cm⁻¹ and an attendant standard deviation of 0.49 cm⁻¹. Second, a convergent CMA-2 method is constituted that allows efficient computation of higher level (Level A) frequencies to any reasonable accuracy threshold by using only Hartree-Fock (HF) and MP2 or density functional theory (DFT) data to generate ξ parameters, which select the sparse off-diagonal force field elements for explicit evaluation at Level A. When Level B = MP2/cc-pVTZ, a cutoff of ξ = 0.02 provides an average maximum absolute error per molecule of only 0.17 cm⁻¹ by incurring merely a 33% increase in average cost over CMA-0A. This CMA-2 method also eradicates the 4 problematic CMA-0A outliers of pyridine with even less effort (ξ = 0.04, 22% increase). Finally, the newly developed CMA procedures are shown to be highly successful when applied to 1-(1H-pyrrol-3-yl)ethanol, a new test molecule with diverse types of vibration.

Pages: 10886-10898. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11673116/pdf/
Citations: 0