
Journal of Chemical Theory and Computation: Latest Articles

New Algorithms to Generate Permutationally Invariant Polynomials and Fundamental Invariants for Potential Energy Surface Fitting.
IF 5.7; Zone 1 (Chemistry); Q2 CHEMISTRY, PHYSICAL; Pub Date: 2025-02-11; Epub Date: 2025-01-22; DOI: 10.1021/acs.jctc.4c01447
Yiping Hao, Xiaoxiao Lu, Bina Fu, Dong H Zhang

Symmetric functions, such as Permutationally Invariant Polynomials (PIPs) and Fundamental Invariants (FIs), are effective and concise descriptors for incorporating permutation symmetry into neural network (NN) potential energy surface (PES) fitting. The traditional algorithm for generating such symmetric polynomials has a factorial time complexity of N!, where N is the number of identical atoms, posing a significant challenge to applying symmetric polynomials as descriptors of NN PESs for larger systems, particularly those with more than 10 atoms. Herein, we report a new algorithm whose time complexity is only linear in the number of identical atoms. It can tremendously accelerate the generation of symmetric polynomials for molecular systems. The proposed algorithm is based on graph connectivity analysis following the action of the generating set of the molecular permutation group. For instance, when calculating the invariant polynomials for a 15-atom molecule such as tropolone, our algorithm is approximately 2 million times faster than the previous method. The efficiency advantage of the new algorithm grows with increasing molecular size and number of identical atoms, making the FI-NN approach feasible for systems with over 10 atoms and high symmetry demands.
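The idea of replacing an enumeration of all N! permutations with a connectivity search driven only by group generators can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the A3B test molecule, and the restriction to degree-1 variables (interatomic pairs) are assumptions made for illustration. Each pair is a graph node, each generator supplies edges, and a breadth-first search collects the orbit whose sum would form a degree-1 invariant.

```python
from collections import deque
from itertools import combinations


def apply_perm(perm, pair):
    """Map an unordered atom pair through a permutation given as a tuple."""
    a, b = perm[pair[0]], perm[pair[1]]
    return (a, b) if a < b else (b, a)


def pair_orbits(n_atoms, generators):
    """Group atom pairs into orbits using only the group generators.

    The cost scales with (number of pairs) x (number of generators),
    not with the order of the permutation group.
    """
    seen = set()
    orbits = []
    for start in combinations(range(n_atoms), 2):
        if start in seen:
            continue
        orbit = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for g in generators:
                nxt = apply_perm(g, current)
                if nxt not in orbit:
                    orbit.add(nxt)
                    queue.append(nxt)
        seen |= orbit
        orbits.append(sorted(orbit))
    return orbits


# Hypothetical A3B molecule: atoms 0, 1, 2 are identical, atom 3 is distinct.
# S3 on the first three atoms is generated by a transposition and a 3-cycle.
generators = [(1, 0, 2, 3), (1, 2, 0, 3)]
orbits = pair_orbits(4, generators)
print(orbits)
```

Summing a function of the interatomic distance over each orbit gives one symmetric degree-1 polynomial per orbit, here one for the A-A pairs and one for the A-B pairs.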

Pages: 1046-1053
Citations: 0
Minimal Basis Iterative Stockholder Decomposition with Multipole Constraints.
IF 5.7; Zone 1 (Chemistry); Q2 CHEMISTRY, PHYSICAL; Pub Date: 2025-02-11; Epub Date: 2025-01-21; DOI: 10.1021/acs.jctc.4c01297
Jonas E S Mikkelsen, Frank Jensen

The minimal basis iterative Stockholder (MBIS) decomposition of molecular electron densities into atomic quantities is an attractive approach for deriving electrostatic parameters in force fields. The MBIS-derived atomic charges, however, in general tend to overestimate the molecular dipole and quadrupole moments by ∼10%. We show that it is possible to derive a constrained MBIS model where the atomic charges or a combination of atomic charges and dipoles exactly reproduce the molecular dipole and quadrupole moments for molecules. The atomic multipole moments derived by the constrained procedure are better at reproducing the molecular electrostatic potential (ESP) than the unconstrained atomic multipole moments. They are, furthermore, significantly less conformationally dependent than atomic charges obtained by fitting to the molecular electrostatic potential.
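To make the notion of charges that "exactly reproduce" molecular moments concrete, here is a minimal sketch of a generic constrained fit. It is not the constrained MBIS procedure from the paper: it simply finds the smallest least-squares change to a set of atomic charges such that the total charge and the dipole are matched, using Lagrange multipliers. The geometry, starting charges, target dipole, and function name are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def constrain_charges(q0, positions, total_charge, dipole):
    """Minimal-norm correction to q0 satisfying
    sum(q) = total_charge and sum(q_i * r_i) = dipole."""
    q0 = np.asarray(q0, dtype=float)
    r = np.asarray(positions, dtype=float)
    # Constraint matrix: one row for the charge, three for the dipole.
    A = np.vstack([np.ones(len(q0)), r.T])
    b = np.concatenate([[total_charge], np.asarray(dipole, dtype=float)])
    # Lagrange-multiplier solution: q = q0 + A^T (A A^T)^-1 (b - A q0)
    lam = np.linalg.solve(A @ A.T, b - A @ q0)
    return q0 + A.T @ lam


# Hypothetical five-atom, methane-like geometry in arbitrary units.
positions = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
])
q_start = np.array([-0.4, 0.1, 0.1, 0.1, 0.1])   # illustrative charges
target_dipole = np.array([0.0, 0.0, 0.1])        # illustrative target

q_fit = constrain_charges(q_start, positions, 0.0, target_dipole)
print(q_fit, q_fit @ positions)
```

Adding atomic dipoles or quadrupole constraints would extend the constraint matrix with further rows and unknowns; the sketch keeps only charges and the dipole for brevity.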

Pages: 1179-1193
Citations: 0
Implementation of Time-Averaged Restraints with UNRES Coarse-Grained Model of Polypeptide Chains.
IF 5.7; Zone 1 (Chemistry); Q2 CHEMISTRY, PHYSICAL; Pub Date: 2025-02-11; Epub Date: 2025-01-24; DOI: 10.1021/acs.jctc.4c01504
Nguyen Truong Co, Cezary Czaplewski, Emilia A Lubecka, Adam Liwo

Time-averaged restraints from nuclear magnetic resonance (NMR) measurements have been implemented in the UNRES coarse-grained model of polypeptide chains in order to develop a tool for data-assisted modeling of the conformational ensembles of multistate proteins, intrinsically disordered proteins (IDPs) and proteins with intrinsically disordered regions (IDRs), many of which are essential in cell biology. A numerically stable variant of molecular dynamics with time-averaged restraints has been introduced, in which the total energy is conserved in sections of a trajectory in microcanonical runs, the bath temperature is maintained in canonical runs, and the time-average-restraint-force components are scaled up with the length of the memory window so that the restraints affect the simulated structures. The new approach restores the conformational ensembles used to generate ensemble-averaged distances, as demonstrated with synthetic restraints. The approach results in a better fitting of the ensemble-averaged interproton distances to those determined experimentally for multistate proteins and proteins with intrinsically disordered regions, which puts it at an advantage over all-atom approaches with regard to the determination of the conformational ensembles of proteins with diffuse structures, owing to a faster and more robust conformational search.
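The role of the memory window can be sketched with the exponentially damped running average commonly used for NMR distance restraints, in which the inverse third power of the distance is averaged. This is a generic textbook form and an assumption here, not the UNRES implementation; the function name, time step, and memory time are illustrative.

```python
import math


def time_averaged_distance(distances, dt, tau):
    """Exponentially weighted r^-3 average of a distance time series.

    dt is the sampling interval and tau the memory time, in the same units.
    """
    decay = math.exp(-dt / tau)
    avg_inv_r3 = distances[0] ** -3
    for r in distances[1:]:
        # Older contributions fade by exp(-dt/tau) at every step.
        avg_inv_r3 = decay * avg_inv_r3 + (1.0 - decay) * r ** -3
    return avg_inv_r3 ** (-1.0 / 3.0)


# A fixed distance is returned unchanged.
constant = time_averaged_distance([4.0] * 200, dt=1.0, tau=50.0)

# A pair that alternates between 3 and 6 (arbitrary units) averages to a
# value well below the arithmetic mean of 4.5, because r^-3 averaging
# emphasizes the short-distance conformations.
alternating = time_averaged_distance([3.0, 6.0] * 500, dt=1.0, tau=50.0)
print(constant, alternating)
```

Because each new frame contributes with weight 1 - exp(-dt/tau), the instantaneous restraint force shrinks as the memory time grows, which is consistent with the abstract's remark that force components are scaled up with the window length.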

Pages: 1476-1493. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823420/pdf/
Citations: 0
Core-Excited States for Open-Shell Systems in Similarity-Transformed Equation-of-Motion Theory.
IF 5.7; Zone 1 (Chemistry); Q2 CHEMISTRY, PHYSICAL; Pub Date: 2025-02-11; Epub Date: 2025-01-28; DOI: 10.1021/acs.jctc.4c01181
Marcos Casanova-Páez, Frank Neese

X-ray absorption spectroscopy (XAS) is a powerful method for exploring molecular electronic structure by exciting core electrons into higher unoccupied molecular orbitals. In this study, we present the first integration of the spin-unrestricted similarity-transformed equation-of-motion coupled cluster method (CVS-USTEOM-CCSD) for core-excited and core-ionized states into the ORCA quantum chemistry package. Using the core-valence separation (CVS) approach, we evaluate the accuracy of CVS-USTEOM-CCSD across 13 open-shell organic systems, covering over 20 core excitations with diverse spin multiplicities (doublet, triplet, and quartet). The implementation leverages automated active space selection, incorporating CIS natural orbitals to efficiently capture electronic transitions. We benchmark the predicted K- and L-edge spectra against experimental data, underscoring the accuracy of the CVS-USTEOM-CCSD method for high-precision core excitation studies.
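The core-valence separation can be pictured as a filter on the excitation space: only excitations that remove at least one electron from a core orbital are retained. The sketch below enumerates single and double excitations for a toy set of spin-orbital indices and applies that filter. It illustrates the general CVS idea only; the index layout and function name are assumptions, and nothing here reflects the ORCA implementation.

```python
from itertools import combinations


def cvs_excitations(core, valence, virtual):
    """Return the singles and doubles kept under core-valence separation."""
    occupied = core + valence
    core_set = set(core)
    singles = [(i, a) for i in occupied for a in virtual if i in core_set]
    doubles = [
        (i, j, a, b)
        for i, j in combinations(occupied, 2)
        for a, b in combinations(virtual, 2)
        if i in core_set or j in core_set
    ]
    return singles, doubles


# Toy system: one core, two valence, and two virtual spin orbitals.
singles, doubles = cvs_excitations(core=[0], valence=[1, 2], virtual=[3, 4])
print(singles)
print(doubles)
```

For this toy system the full space has 6 singles and 3 doubles, of which 2 and 2 survive the filter, which is why the core-excited states decouple from the dense manifold of valence excitations.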

Pages: 1306-1321. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823418/pdf/
Citations: 0
FCIQMC-CASPT2 with Imaginary-Time-Averaged Wave Functions.
IF 5.7; Zone 1 (Chemistry); Q2 CHEMISTRY, PHYSICAL; Pub Date: 2025-02-11; Epub Date: 2025-01-17; DOI: 10.1021/acs.jctc.4c01462
Arta A Safari, Robert J Anderson, Ali Alavi, Giovanni Li Manni

A new method to perform complete active space second-order perturbation theory on top of large active spaces optimized with full configuration interaction quantum Monte Carlo is presented. Computing the three-particle and Fock-contracted four-particle density matrices from imaginary-time-averaged wave functions is found to resolve fermionic positivity violations and to ensure numerical stability. The protocol is applied to [NiFe]-hydrogenase, [Cu₂O₂]-oxidase and Fe-porphyrin model systems up to 26 electrons in 27 orbitals and benchmarked against DMRG-CASPT2.

Pages: 1029-1038. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823415/pdf/
Citations: 0
PiNN: Equivariant Neural Network Suite for Modeling Electrochemical Systems.
IF 5.7; Zone 1 (Chemistry); Q2 CHEMISTRY, PHYSICAL; Pub Date: 2025-02-11; Epub Date: 2025-01-30; DOI: 10.1021/acs.jctc.4c01570
Jichen Li, Lisanne Knijff, Zhan-Yun Zhang, Linnéa Andersson, Chao Zhang

Electrochemical energy storage and conversion play increasingly important roles in electrification and sustainable development across the globe. A key challenge therein is to understand, control, and design electrochemical energy materials with atomistic precision. This requires inputs from molecular modeling powered by machine learning (ML) techniques. In this work, we have upgraded our pairwise interaction neural network Python package PiNN by introducing equivariant features to the PiNet2 architecture for fitting potential energy surfaces, along with PiNet2-dipole for dipole and charge predictions and PiNet2-χ for generating atom-condensed charge response kernels. By benchmarking on publicly accessible data sets of small molecules, crystalline materials, and liquid electrolytes, we found that the equivariant PiNet2 shows significant improvements over the original PiNet architecture and provides state-of-the-art overall performance. Furthermore, leveraging plug-ins such as PiNNAcLe for an adaptive learn-on-the-fly workflow in generating ML potentials and PiNNwall for modeling heterogeneous electrodes under external bias, we expect PiNN to serve as a versatile and high-performing ML-accelerated platform for molecular modeling of electrochemical systems.
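What "equivariant features" means can be checked numerically with a minimal sketch: a vector feature built from weighted displacement vectors should rotate together with the input coordinates, while its norm stays invariant. The feature definition, weighting function, and coordinates below are assumptions for illustration and do not reproduce the PiNet2 architecture or the PiNN API.

```python
import math


def rotate_z(v, angle):
    """Rotate a 3D vector about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)


def vector_feature(positions, i):
    """Toy equivariant feature: sum_j exp(-r_ij) * (r_j - r_i)."""
    xi = positions[i]
    out = [0.0, 0.0, 0.0]
    for j, xj in enumerate(positions):
        if j == i:
            continue
        d = [b - a for a, b in zip(xi, xj)]
        r = math.sqrt(sum(c * c for c in d))
        weight = math.exp(-r)  # depends only on the distance, so invariant
        for k in range(3):
            out[k] += weight * d[k]
    return tuple(out)


positions = [(0.0, 0.0, 0.0), (1.0, 0.2, -0.3), (-0.5, 0.9, 0.4)]
angle = 0.7

feature_then_rotate = rotate_z(vector_feature(positions, 0), angle)
rotate_then_feature = vector_feature([rotate_z(p, angle) for p in positions], 0)
print(feature_then_rotate)
print(rotate_then_feature)
```

The two printed vectors agree to floating-point precision, which is the defining property f(Rx) = R f(x) that an equivariant layer preserves at every stage.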

Pages: 1382-1395. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823406/pdf/
Citations: 0
Generative Model for Constructing Reaction Path from Initial to Final States.
IF 5.7; Zone 1 (Chemistry); Q2 CHEMISTRY, PHYSICAL; Pub Date: 2025-02-11; Epub Date: 2025-01-18; DOI: 10.1021/acs.jctc.4c01397
Akihide Hayashi, So Takamoto, Ju Li, Yuta Tsuboi, Daisuke Okanohara

Mapping the chemical reaction pathways and their corresponding activation barriers is a significant challenge in molecular simulation. Given the inherent complexities of 3D atomic geometries, even generating an initial guess of these paths can be difficult for humans. This paper presents an innovative approach that utilizes neural networks to generate initial guesses for reaction pathways based on the initial state and learning from a database of low-energy transition paths. The proposed method is initiated by inputting the coordinates of the initial state, followed by progressive alterations to its structure. This iterative process culminates in the generation of the guess reaction path and the coordinates of the final state. The method does not require on-the-fly computation of the actual potential energy surface and is therefore fast. The application of this geometry-based method extends to complex reaction pathways, as illustrated by organic reactions. Training was executed on the Transition1x data set of organic reaction pathways. The results revealed the generation of reactions that bore substantial similarities to the test set of chemical reaction paths. The method's flexibility allows for reactions to be generated either to conform to predetermined conditions or in a randomized manner.

Pages: 1292-1305. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11824368/pdf/
Citations: 0
KaMLs for Predicting Protein pKa Values and Ionization States: Are Trees All You Need?
IF 5.7; Zone 1 (Chemistry); Q2 CHEMISTRY, PHYSICAL; Pub Date: 2025-02-11; Epub Date: 2025-01-30; DOI: 10.1021/acs.jctc.4c01602
Mingzhe Shen, Daniel Kortzak, Simon Ambrozak, Shubham Bhatnagar, Ian Buchanan, Ruibin Liu, Jana Shen

Despite its importance in understanding biology and computer-aided drug discovery, the accurate prediction of protein ionization states remains a formidable challenge. Physics-based approaches struggle to capture the small, competing contributions in the complex protein environment, while machine learning (ML) is hampered by the scarcity of experimental data. Here, we report the development of pKa ML (KaML) models based on decision trees and graph attention networks (GAT), exploiting physicochemical understanding and a new experimental pKa database (PKAD-3) enriched with highly shifted pKa's. KaML-CBtree significantly outperforms the current state of the art in predicting pKa values and ionization states across all six titratable amino acids, notably achieving accurate predictions for deprotonated cysteines and lysines, a blind spot in previous models. The superior performance of KaMLs is achieved in part through several innovations, including the separate treatment of acid and base, data augmentation using AlphaFold structures, and model pretraining on a theoretical pKa database. We also introduce the classification of protonation states as a metric for evaluating pKa prediction models. A meta-feature analysis suggests a possible reason for the lightweight tree model to outperform the more complex deep learning GAT. We release an end-to-end pKa predictor based on KaML-CBtree and the new PKAD-3 database, which facilitates a variety of applications and provides the foundation for further advances in protein electrostatic research.
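The link between a predicted pKa and an ionization state follows from the Henderson-Hasselbalch relation, which the sketch below uses to turn a pKa into a deprotonated fraction and a binary protonation-state label. The reference pH of 7 and the function names are assumptions for illustration; the abstract does not state the threshold used in the paper's classification metric, and the example pKa values are illustrative rather than taken from PKAD-3.

```python
def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch: fraction of sites in the deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def protonation_state(pka, ph=7.0):
    """Label the dominant form at the given pH (assumed reference: pH 7)."""
    return "deprotonated" if pka < ph else "protonated"


# Illustrative values: a lysine with a typical pKa and one strongly
# downshifted by its environment.
typical_lys = 10.5
shifted_lys = 5.0

print(protonation_state(typical_lys), fraction_deprotonated(typical_lys, 7.0))
print(protonation_state(shifted_lys), fraction_deprotonated(shifted_lys, 7.0))
```

A regression error of one pKa unit matters little when the pKa is far from the pH but flips the label when it is close, which is one reason a classification metric complements the usual pKa error statistics.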

Pages: 1446-1458
Citations: 0
How To Correct Erroneous Symmetry-Breaking in Coarse-Grained Constant-pH Simulations.
IF 5.7; Zone 1 (Chemistry); Q2 CHEMISTRY, PHYSICAL; Pub Date: 2025-02-11; Epub Date: 2025-01-29; DOI: 10.1021/acs.jctc.4c01010
David Beyer, Pablo M Blanco, Jonas Landsgesell, Peter Košovan, Christian Holm

The constant-pH Monte Carlo method is a popular algorithm to study acid-base equilibria in coarse-grained simulations of charge-regulating soft matter systems, including weak polyelectrolytes and proteins. However, the method suffers from systematic errors in simulations with explicit ions, which lead to a symmetry-breaking between chemically equivalent implementations of the acid-base equilibrium. Here, we show that this artifact of the algorithm can be corrected a posteriori by simply shifting the pH scale. We present two analytical methods as well as a numerical method using Widom insertion to obtain the correction. By numerically investigating various sample systems, we assess the range of validity of the analytical approaches and show that the Widom approach always leads to consistent results, even when the analytical approaches fail. Overall, we provide practical guidelines on how to use constant-pH simulations to avoid systematic errors, including cases where special care is required, such as polyampholytes and proteins.
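For orientation, the acceptance rule at the heart of a constant-pH Monte Carlo move can be sketched in its commonly quoted textbook form, where the input pH enters only through the term ln(10)(pH - pKa). This is an assumption about the standard method, not a transcription of the paper, and it deliberately omits the pH-shift correction, whose form and sign the abstract does not give. The function and argument names are illustrative.

```python
import math


def cph_acceptance(delta_u_kT, ph, pka, deprotonation):
    """Acceptance probability of a (de)protonation trial move.

    delta_u_kT    : change in potential energy, in units of kT
    deprotonation : True for HA -> A-, False for A- -> HA
    """
    xi = 1.0 if deprotonation else -1.0
    exponent = -delta_u_kT + xi * math.log(10.0) * (ph - pka)
    if exponent >= 0.0:
        return 1.0  # avoids overflow in exp() for very favorable moves
    return math.exp(exponent)


# Without interactions (delta_u = 0) the rule reproduces ideal titration:
# one pH unit below the pKa, deprotonation is accepted 10% of the time.
p_deprot = cph_acceptance(0.0, ph=3.0, pka=4.0, deprotonation=True)
p_prot = cph_acceptance(0.0, ph=3.0, pka=4.0, deprotonation=False)
print(p_deprot, p_prot)
```

Because the pH appears only as an input parameter in this exponent, a systematic error of the kind described in the abstract can in principle be absorbed by relabeling the pH axis after the simulation.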

{"title":"How To Correct Erroneous Symmetry-Breaking in Coarse-Grained Constant-pH Simulations.","authors":"David Beyer, Pablo M Blanco, Jonas Landsgesell, Peter Košovan, Christian Holm","doi":"10.1021/acs.jctc.4c01010","DOIUrl":"10.1021/acs.jctc.4c01010","url":null,"abstract":"<p><p>The constant-pH Monte Carlo method is a popular algorithm to study acid-base equilibria in coarse-grained simulations of charge regulating soft matter systems including weak polyelectrolytes and proteins. However, the method suffers from systematic errors in simulations with explicit ions, which lead to a symmetry-breaking between chemically equivalent implementations of the acid-base equilibrium. Here, we show that this artifact of the algorithm can be corrected a-posteriori by simply shifting the pH-scale. We present two analytical methods as well as a numerical method using Widom insertion to obtain the correction. By numerically investigating various sample systems, we assess the range of validity of the analytical approaches and show that the Widom approach always leads to consistent results, even when the analytical approaches fail. Overall, we provide practical guidelines on how to use constant-pH simulations to avoid systematic errors, including cases where special care is required, such as polyampholytes and proteins.</p>","PeriodicalId":45,"journal":{"name":"Journal of Chemical Theory and Computation","volume":" ","pages":"1396-1404"},"PeriodicalIF":5.7,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143057416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Relativistic CASPT2/RASPT2 Program along with DIRAC Software.
IF 5.7 Zone 1 Chemistry Q2 CHEMISTRY, PHYSICAL Pub Date: 2025-02-11 Epub Date: 2025-01-27 DOI: 10.1021/acs.jctc.4c01589
Yasuto Masuda, Kohei Noda, Sumika Iwamuro, Masahiko Hada, Naoki Nakatani, Minori Abe

Exploring electronic states in actinide compounds is a critical aspect of nuclear science. However, considering relativistic effects and electron correlation in theoretical calculations poses a complex challenge. To tackle this, we developed the CASPT2/RASPT2 program along with the DIRAC program, enabling calculations of electron correlation methods using multiconfigurational perturbation theory with various relativistic Hamiltonians. Currently, we employ a method that combines the improved virtual orbital (IVO) approach and CASCI methodologies as reference functions, deviating from the traditional use of CASSCF. Additionally, we implemented the RASCI-RASPT2 method to treat larger active spaces and parallelized the entire program. Due to the intricate process of selecting orbital spaces in CASPT2 and RASPT2, we offer a GUI program to assist with input creation. All these programs and tutorials are freely accessible on GitHub for anyone to use. In our benchmark calculations, we demonstrated the efficiency of parallelization by utilizing 1-256 cores for CASCI-CASPT2 calculations on the UO₂²⁺ molecule. Despite encountering some anomalies, we achieved commendable parallelization efficiency with CASCI and CASPT2 computational times. We also computed the vertical excitation energies of UO₂²⁺ using the RASCI-RASPT2 approach. By adapting the IVO and setting the maximum number of holes and electrons to three for RAS1 and RAS3, we obtained trends consistent with those reported in previous studies using alternative methods. We plan to continue improving the program in the future, believing that its widespread use will contribute to further development in actinide chemistry.

Citations: 0