Pub Date: 2025-02-11 | Epub Date: 2025-01-22 | DOI: 10.1021/acs.jctc.4c01447
Yiping Hao, Xiaoxiao Lu, Bina Fu, Dong H Zhang
Symmetric functions, such as Permutationally Invariant Polynomials (PIPs) and Fundamental Invariants (FIs), are effective and concise descriptors for incorporating permutation symmetry into neural network (NN) potential energy surface (PES) fitting. The traditional algorithm for generating such symmetric polynomials has a factorial time complexity of N!, where N is the number of identical atoms, which poses a significant challenge to applying symmetric polynomials as descriptors of NN PESs for larger systems, particularly those with more than 10 atoms. Herein, we report a new algorithm whose time complexity is only linear in the number of identical atoms. It can tremendously accelerate the generation of symmetric polynomials for molecular systems. The proposed algorithm is based on graph connectivity analysis following the action of the generating set of the molecular permutation group. For instance, when calculating the invariant polynomials for a 15-atom molecule such as tropolone, our algorithm is approximately 2 million times faster than the previous method. The efficiency gain of the new algorithm grows with increasing molecular size and number of identical atoms, making the FI-NN approach feasible for systems with more than 10 atoms and high symmetry demands.
Title: New Algorithms to Generate Permutationally Invariant Polynomials and Fundamental Invariants for Potential Energy Surface Fitting. Journal of Chemical Theory and Computation, pp. 1046-1053.
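The contrast between enumerating all N! permutations and working only with group generators can be illustrated with a small sketch. It is not the authors' implementation: it assumes the simplest case of N identical atoms (full symmetric group, generated by the N-1 adjacent swaps), monomials in pair variables r_ij, and hypothetical function names. The orbit of a monomial, whose sum is a permutationally invariant polynomial, is found as a connected component of the graph whose edges are generator actions.

```python
from collections import deque

def adjacent_transpositions(n):
    """Generators of S_n: the n-1 swaps (i, i+1); p[k] is the image of atom k."""
    gens = []
    for i in range(n - 1):
        p = list(range(n))
        p[i], p[i + 1] = p[i + 1], p[i]
        gens.append(tuple(p))
    return gens

def act(perm, monomial):
    """Apply an atom permutation to a monomial.

    A monomial is a sorted tuple of ((i, j), power) factors over the pair
    variables r_ij with i < j (an illustrative encoding, not from the paper).
    """
    out = []
    for (i, j), power in monomial:
        a, b = perm[i], perm[j]
        out.append(((min(a, b), max(a, b)), power))
    return tuple(sorted(out))

def orbit(monomial, generators):
    """Breadth-first search over generator actions starting from `monomial`.

    The result is the full group orbit, because every group element is a
    product of generators; no explicit list of all N! permutations is built.
    """
    seen = {monomial}
    queue = deque([monomial])
    while queue:
        current = queue.popleft()
        for g in generators:
            image = act(g, current)
            if image not in seen:
                seen.add(image)
                queue.append(image)
    return seen
```

For ten identical atoms, the orbit of r_01 has 45 members and is found with 9 generators, so the search applies 405 generator actions instead of enumerating 10! = 3,628,800 permutations.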
Pub Date: 2025-02-11 | Epub Date: 2025-01-21 | DOI: 10.1021/acs.jctc.4c01297
Jonas E S Mikkelsen, Frank Jensen
The minimal basis iterative Stockholder (MBIS) decomposition of molecular electron densities into atomic quantities is an attractive approach for deriving electrostatic parameters in force fields. The MBIS-derived atomic charges, however, generally overestimate the molecular dipole and quadrupole moments by ∼10%. We show that it is possible to derive a constrained MBIS model in which the atomic charges, or a combination of atomic charges and dipoles, exactly reproduce the molecular dipole and quadrupole moments. The atomic multipole moments derived by the constrained procedure are better at reproducing the molecular electrostatic potential (ESP) than the unconstrained atomic multipole moments. They are, furthermore, significantly less conformationally dependent than atomic charges obtained by fitting to the molecular electrostatic potential.
Title: Minimal Basis Iterative Stockholder Decomposition with Multipole Constraints. Journal of Chemical Theory and Computation, pp. 1179-1193.
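What it means for point charges to "exactly reproduce" the total charge and dipole can be written as a linear constraint. The block below is only a generic illustration: it is the textbook smallest-change correction of a set of charges under linear constraints, not the constrained MBIS procedure, whose details the abstract does not give. The symbols q⁰ (unconstrained charges), A, b, Q and μ are notation assumed here, and the dipole is taken as μ = Σ qᵢ rᵢ for atoms at positions rᵢ = (xᵢ, yᵢ, zᵢ).

```latex
\begin{aligned}
&\min_{\mathbf{q}} \; \tfrac{1}{2}\,\lVert \mathbf{q} - \mathbf{q}^{0} \rVert^{2}
\quad \text{subject to} \quad \mathbf{A}\mathbf{q} = \mathbf{b}, \\
&\mathbf{A} =
\begin{pmatrix}
1 & \cdots & 1 \\
x_1 & \cdots & x_N \\
y_1 & \cdots & y_N \\
z_1 & \cdots & z_N
\end{pmatrix},
\qquad
\mathbf{b} = \begin{pmatrix} Q \\ \mu_x \\ \mu_y \\ \mu_z \end{pmatrix}, \\
&\mathbf{q} = \mathbf{q}^{0}
 + \mathbf{A}^{\mathsf{T}} \left( \mathbf{A}\mathbf{A}^{\mathsf{T}} \right)^{-1}
   \left( \mathbf{b} - \mathbf{A}\mathbf{q}^{0} \right).
\end{aligned}
```

The closed form requires the rows of A to be linearly independent, which fails when all atoms lie in one plane; a pseudoinverse would be needed in that case. Quadrupole constraints would add further rows built from products of coordinates.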
Pub Date: 2025-02-11 | Epub Date: 2025-01-24 | DOI: 10.1021/acs.jctc.4c01504
Nguyen Truong Co, Cezary Czaplewski, Emilia A Lubecka, Adam Liwo
Time-averaged restraints from nuclear magnetic resonance (NMR) measurements have been implemented in the UNRES coarse-grained model of polypeptide chains in order to develop a tool for data-assisted modeling of the conformational ensembles of multistate proteins, intrinsically disordered proteins (IDPs) and proteins with intrinsically disordered regions (IDRs), many of which are essential in cell biology. A numerically stable variant of molecular dynamics with time-averaged restraints has been introduced, in which the total energy is conserved in sections of a trajectory in microcanonical runs, the bath temperature is maintained in canonical runs, and the time-averaged restraint-force components are scaled up with the length of the memory window so that the restraints affect the simulated structures. The new approach restores the conformational ensembles used to generate ensemble-averaged distances, as demonstrated with synthetic restraints. The approach yields a better fit of the ensemble-averaged interproton distances to those determined experimentally for multistate proteins and proteins with intrinsically disordered regions. This gives it an advantage over all-atom approaches in determining the conformational ensembles of proteins with diffuse structures, owing to a faster and more robust conformational search.
Title: Implementation of Time-Averaged Restraints with UNRES Coarse-Grained Model of Polypeptide Chains. Journal of Chemical Theory and Computation, pp. 1476-1493. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823420/pdf/
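The role of the memory window can be made concrete with a short sketch of a memory-weighted running average of interproton distances. This is an illustration under stated assumptions, not the UNRES implementation: it assumes the commonly used exponentially damped average of r^-3 (the exponent, the discrete update, and the function name are all choices made here).

```python
import math

def time_averaged_distances(distances, dt, tau, power=3):
    """Memory-weighted average distance at each frame.

    distances: instantaneous distances sampled every `dt`
    tau: length of the memory window, in the same time units as `dt`
    power: averaging exponent; r**-3 is an assumption of this sketch
    """
    decay = math.exp(-dt / tau)
    weighted_sum = 0.0  # running sum of decay**k * r**-power over past frames
    norm = 0.0          # running sum of decay**k, normalises short histories
    averaged = []
    for r in distances:
        weighted_sum = decay * weighted_sum + r ** (-power)
        norm = decay * norm + 1.0
        averaged.append((weighted_sum / norm) ** (-1.0 / power))
    return averaged
```

Because the average is taken over r^-3, short distances dominate: a pair alternating between 2 and 4 Å averages to roughly 2.4 Å rather than 3 Å when the window is long. A long window also means each new frame changes the average only slightly, which is consistent with the abstract's remark that the restraint forces are scaled up with the window length.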
Pub Date: 2025-02-11 | Epub Date: 2025-01-28 | DOI: 10.1021/acs.jctc.4c01181
Marcos Casanova-Páez, Frank Neese
X-ray absorption spectroscopy (XAS) is a powerful method for exploring molecular electronic structure by exciting core electrons into higher unoccupied molecular orbitals. In this study, we present the first integration of the spin-unrestricted similarity-transformed equation-of-motion coupled cluster method (CVS-USTEOM-CCSD) for core-excited and core-ionized states into the ORCA quantum chemistry package. Using the core-valence separation (CVS) approach, we evaluate the accuracy of CVS-USTEOM-CCSD across 13 open-shell organic systems, covering over 20 core excitations with diverse spin multiplicities (doublet, triplet, and quartet). The implementation leverages automated active space selection, incorporating CIS natural orbitals to efficiently capture electronic transitions. We benchmark the predicted K- and L-edge spectra against experimental data, underscoring the accuracy of the CVS-USTEOM-CCSD method for high-precision core excitation studies.
Title: Core-Excited States for Open-Shell Systems in Similarity-Transformed Equation-of-Motion Theory. Journal of Chemical Theory and Computation, pp. 1306-1321. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823418/pdf/
Pub Date: 2025-02-11 | Epub Date: 2025-01-17 | DOI: 10.1021/acs.jctc.4c01462
Arta A Safari, Robert J Anderson, Ali Alavi, Giovanni Li Manni
A new method to perform complete active space second-order perturbation theory (CASPT2) on top of large active spaces optimized with full configuration interaction quantum Monte Carlo (FCIQMC) is presented. Computing the three- and Fock-contracted four-particle density matrices from imaginary-time-averaged wave functions is found to resolve fermionic positivity violations and to ensure numerical stability. The protocol is applied to [NiFe]-hydrogenase, [Cu₂O₂]-oxidase and Fe-porphyrin model systems with up to 26 electrons in 27 orbitals and benchmarked against DMRG-CASPT2.
Title: FCIQMC-CASPT2 with Imaginary-Time-Averaged Wave Functions. Journal of Chemical Theory and Computation, pp. 1029-1038. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823415/pdf/
Electrochemical energy storage and conversion play increasingly important roles in electrification and sustainable development across the globe. A key challenge therein is to understand, control, and design electrochemical energy materials with atomistic precision. This requires inputs from molecular modeling powered by machine learning (ML) techniques. In this work, we have upgraded our pairwise interaction neural network Python package PiNN by introducing equivariant features to the PiNet2 architecture for fitting potential energy surfaces, along with PiNet2-dipole for dipole and charge predictions and PiNet2-χ for generating atom-condensed charge response kernels. By benchmarking on publicly accessible data sets of small molecules, crystalline materials, and liquid electrolytes, we found that the equivariant PiNet2 shows significant improvements over the original PiNet architecture and provides state-of-the-art overall performance. Furthermore, leveraging plug-ins such as PiNNAcLe for an adaptive learn-on-the-fly workflow in generating ML potentials and PiNNwall for modeling heterogeneous electrodes under external bias, we expect PiNN to serve as a versatile and high-performing ML-accelerated platform for molecular modeling of electrochemical systems.
Pub Date: 2025-02-11 | DOI: 10.1021/acs.jctc.4c01570
Jichen Li, Lisanne Knijff, Zhan-Yun Zhang, Linnéa Andersson, Chao Zhang
Title: PiNN: Equivariant Neural Network Suite for Modeling Electrochemical Systems. Journal of Chemical Theory and Computation, pp. 1382-1395. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823406/pdf/
Pub Date: 2025-02-11 | Epub Date: 2025-01-18 | DOI: 10.1021/acs.jctc.4c01397
Akihide Hayashi, So Takamoto, Ju Li, Yuta Tsuboi, Daisuke Okanohara
Mapping chemical reaction pathways and their corresponding activation barriers is a significant challenge in molecular simulation. Given the inherent complexities of 3D atomic geometries, even generating an initial guess of these paths can be difficult for humans. This paper presents an innovative approach that utilizes neural networks to generate initial guesses for reaction pathways based on the initial state and learning from a database of low-energy transition paths. The proposed method is initiated by inputting the coordinates of the initial state, followed by progressive alterations to its structure. This iterative process culminates in the generation of the guess reaction path and the coordinates of the final state. The method does not require on-the-fly computation of the actual potential energy surface and is therefore fast. The application of this geometry-based method extends to complex reaction pathways, illustrated here by organic reactions. Training was executed on the Transition1x data set of organic reaction pathways. The generated reactions bore substantial similarities to the test set of chemical reaction paths. The method's flexibility allows reactions to be generated either to conform to predetermined conditions or in a randomized manner.
Title: Generative Model for Constructing Reaction Path from Initial to Final States. Journal of Chemical Theory and Computation, pp. 1292-1305. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11824368/pdf/
Pub Date: 2025-02-11 | Epub Date: 2025-01-30 | DOI: 10.1021/acs.jctc.4c01602
Mingzhe Shen, Daniel Kortzak, Simon Ambrozak, Shubham Bhatnagar, Ian Buchanan, Ruibin Liu, Jana Shen
Despite its importance in understanding biology and computer-aided drug discovery, the accurate prediction of protein ionization states remains a formidable challenge. Physics-based approaches struggle to capture the small, competing contributions in the complex protein environment, while machine learning (ML) is hampered by the scarcity of experimental data. Here, we report the development of pKa ML (KaML) models based on decision trees and graph attention networks (GAT), exploiting physicochemical understanding and a new experimental pKa database (PKAD-3) enriched with highly shifted pKa's. KaML-CBtree significantly outperforms the current state of the art in predicting pKa values and ionization states across all six titratable amino acids, notably achieving accurate predictions for deprotonated cysteines and lysines, a blind spot in previous models. The superior performance of KaMLs is achieved in part through several innovations, including the separate treatment of acids and bases, data augmentation using AlphaFold structures, and model pretraining on a theoretical pKa database. We also introduce the classification of protonation states as a metric for evaluating pKa prediction models. A meta-feature analysis suggests a possible reason why the lightweight tree model outperforms the more complex deep learning GAT. We release an end-to-end pKa predictor based on KaML-CBtree and the new PKAD-3 database, which facilitates a variety of applications and provides the foundation for further advances in protein electrostatic research.
Title: KaMLs for Predicting Protein pKa Values and Ionization States: Are Trees All You Need? Journal of Chemical Theory and Computation, pp. 1446-1458.
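Scoring a pKa predictor by protonation-state classification, rather than by pKa error alone, can be sketched as follows. The abstract does not define the metric precisely, so this is only one plausible reading: a site is called protonated when its pKa lies above a reference pH, and the pH of 7.0 and both function names are assumptions made here.

```python
def is_protonated(pKa, pH=7.0):
    """Dominant state at the given pH: protonated if pKa is above the pH."""
    return pKa > pH

def protonation_state_accuracy(predicted_pKas, experimental_pKas, pH=7.0):
    """Fraction of sites whose predicted dominant state matches experiment."""
    if len(predicted_pKas) != len(experimental_pKas):
        raise ValueError("predicted and experimental lists differ in length")
    if not predicted_pKas:
        raise ValueError("at least one site is required")
    matches = sum(
        is_protonated(pred, pH) == is_protonated(expt, pH)
        for pred, expt in zip(predicted_pKas, experimental_pKas)
    )
    return matches / len(predicted_pKas)
```

The two views can disagree. An error of 2.8 pKa units flips the predicted state of a site with an experimental pKa of 5.5 (predicted 8.3), while the same error on a site with a pKa of 10.0 (predicted 12.8) leaves the state unchanged, so `protonation_state_accuracy([8.3, 12.8], [5.5, 10.0])` is 0.5 even though both absolute errors are equal.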
Pub Date: 2025-02-11 | Epub Date: 2025-01-29 | DOI: 10.1021/acs.jctc.4c01010
David Beyer, Pablo M Blanco, Jonas Landsgesell, Peter Košovan, Christian Holm
The constant-pH Monte Carlo method is a popular algorithm for studying acid-base equilibria in coarse-grained simulations of charge-regulating soft matter systems, including weak polyelectrolytes and proteins. However, the method suffers from systematic errors in simulations with explicit ions, which lead to symmetry-breaking between chemically equivalent implementations of the acid-base equilibrium. Here, we show that this artifact of the algorithm can be corrected a posteriori by simply shifting the pH scale. We present two analytical methods as well as a numerical method using Widom insertion to obtain the correction. By numerically investigating various sample systems, we assess the range of validity of the analytical approaches and show that the Widom approach always leads to consistent results, even when the analytical approaches fail. Overall, we provide practical guidelines on how to use constant-pH simulations to avoid systematic errors, including cases where special care is required, such as polyampholytes and proteins.
Title: How To Correct Erroneous Symmetry-Breaking in Coarse-Grained Constant-pH Simulations. Journal of Chemical Theory and Computation, pp. 1396-1404.
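For orientation, the standard constant-pH Monte Carlo move can be sketched for a single titratable site. The sketch assumes the textbook acceptance rule, min(1, exp(-ΔU/kT ± ln(10)(pH - pKa))) with the plus sign for deprotonation, and an ideal site with no interactions and no explicit ions. It therefore does not reproduce the artifact or the pH-shift correction discussed in the abstract, and all function names are hypothetical.

```python
import math
import random

LN10 = math.log(10.0)

def acceptance_probability(delta_u_kT, pH, pKa, deprotonation):
    """Acceptance probability of a (de)protonation move.

    delta_u_kT: change in potential energy in units of kT
    deprotonation: True for HA -> A-, False for the reverse move
    """
    sign = 1.0 if deprotonation else -1.0
    exponent = -delta_u_kT + sign * LN10 * (pH - pKa)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)

def ideal_ionization_degree(pH, pKa):
    """Henderson-Hasselbalch degree of ionization of an ideal weak acid."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))

def simulate_ideal_site(pH, pKa, n_steps=200_000, seed=0):
    """Sample one non-interacting site and return its average ionization."""
    rng = random.Random(seed)
    ionized = False
    ionized_steps = 0
    for _ in range(n_steps):
        # Propose flipping the current state; delta_u is zero for an ideal site.
        p = acceptance_probability(0.0, pH, pKa, deprotonation=not ionized)
        if rng.random() < p:
            ionized = not ionized
        ionized_steps += ionized
    return ionized_steps / n_steps
```

In this ideal limit the sampled ionization matches the Henderson-Hasselbalch value, for example about 10/11 at one pH unit above the pKa. Deviations from that curve in an interacting system are what the simulation is meant to capture, which is why a systematic offset of the pH scale matters.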
Exploring electronic states in actinide compounds is a critical aspect of nuclear science. However, considering relativistic effects and electron correlation in theoretical calculations poses a complex challenge. To tackle this, we developed the CASPT2/RASPT2 program along with the DIRAC program, enabling calculations of electron correlation methods using multiconfigurational perturbation theory with various relativistic Hamiltonians. Currently, we employ a method that combines the improved virtual orbital (IVO) approach and CASCI methodologies as reference functions, deviating from the traditional use of CASSCF. Additionally, we implemented the RASCI-RASPT2 method to treat larger active spaces and parallelized the entire program. Due to the intricate process of selecting orbital spaces in CASPT2 and RASPT2, we offer a GUI program to assist with input creation. All these programs and tutorials are freely accessible on GitHub for anyone to use. In our benchmark calculations, we demonstrated the efficiency of parallelization by utilizing 1-256 cores for CASCI-CASPT2 calculations on the UO₂²⁺ molecule. Despite encountering some anomalies, we achieved commendable parallelization efficiency with CASCI and CASPT2 computational times. We also computed the vertical excitation energies of UO₂²⁺ using the RASCI-RASPT2 approach. By adapting the IVO and setting the maximum number of holes and electrons to three for RAS1 and RAS3, we obtained trends consistent with those reported in previous studies using alternative methods. We plan to continue improving the program in the future, believing that its widespread use will contribute to further development in actinide chemistry.
Pub Date: 2025-02-11 | DOI: 10.1021/acs.jctc.4c01589
Yasuto Masuda, Kohei Noda, Sumika Iwamuro, Masahiko Hada, Naoki Nakatani, Minori Abe
Title: Relativistic CASPT2/RASPT2 Program along with DIRAC Software. Journal of Chemical Theory and Computation, pp. 1249-1258.