Journal of Chemical Theory and Computation: Latest Articles

Machine-Learning Ice Spectra: From 1 to 256 Features

IF 5.5 · Zone 1 Chemistry · Q2 Chemistry, Physical

Journal of Chemical Theory and Computation

Pub Date: 2026-02-04 DOI: 10.1021/acs.jctc.5c01413

Shokirbek Shermukhamedov, Jolla Kullgren, Daniel Sethio, Kersti Hermansson

The study explores how well machine learning and structural fingerprints can predict spectroscopic properties of ice (OH vibrational frequencies and ¹H chemical shifts). A large theoretical data set (55 ice polymorphs, 1010 DFT data points for both the vibrations and the NMR shifts) and a smaller cross-validation set are employed. The Message Passing Atomic Cluster Expansion (MACE) model performs best, with high accuracy (root-mean-square deviation, RMSD, of 0.06 ppm for chemical shifts and ∼10 cm⁻¹ for vibrational frequencies). Simpler descriptors like ACSF and SOAP, when paired with suitable regressors, nearly match MACE’s performance. At the other end of the complexity scale, the simplest possible physics-based descriptor of the environment (a single H-bond distance) yields RMSD values three times as large for the vibrations and four times as large for the proton chemical shift compared to the MACE model. Depending on the context, those RMSD values may still be considered modest and useful, considering the gain in simplicity and transparency.
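The single-descriptor end of this comparison can be illustrated with a minimal sketch: a straight-line fit of a target property against one H-bond distance, scored by RMSD. The data are synthetic placeholders generated in the code, not values from the paper, and the helper names `rmsd` and `fit_line` are hypothetical.

```python
import numpy as np

def rmsd(pred, ref):
    """Root-mean-square deviation between predictions and reference values."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

def fit_line(x, y):
    """Least-squares straight line y ~ a*x + b; returns (a, b)."""
    a, b = np.polyfit(x, y, deg=1)
    return float(a), float(b)

# Synthetic, illustrative data only (NOT taken from the paper).
rng = np.random.default_rng(0)
r_hb = rng.uniform(1.6, 2.0, size=200)           # hypothetical H-bond distances
freq = 1000.0 * r_hb + 1500.0 + rng.normal(0.0, 30.0, size=200)  # made-up trend plus noise

a, b = fit_line(r_hb, freq)
error = rmsd(a * r_hb + b, freq)
print(f"slope={a:.1f}, intercept={b:.1f}, RMSD={error:.1f}")
```

A richer descriptor (ACSF, SOAP, or a learned representation) would replace the one-dimensional input `r_hb`, while the RMSD scoring stays the same.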

Citations: 0

Seniority-Zero Canonical Transformation Theory: Error Reduction via Late Truncation

IF 5.5 · Zone 1 Chemistry · Q2 Chemistry, Physical

Journal of Chemical Theory and Computation

Pub Date: 2026-02-04 DOI: 10.1021/acs.jctc.5c01892

Daniel F. Calero-Osorio, Paul W. Ayers

We show how to add the effects of residual electron correlation to a reference seniority-zero wave function by transforming the true electronic Hamiltonian into seniority-zero form. The transformation is treated via the Baker–Campbell–Hausdorff (BCH) expansion, and the seniority-zero structure of the reference is exploited to evaluate the first three commutators exactly; the remaining contributions are handled with a recursive commutator approximation, as is typical in canonical transformation methods. By choosing a seniority-zero reference and using parallel computation, this method is practical for small- to medium-sized systems. Numerical tests show high accuracy, with errors of ∼10⁻⁴ Hartree.
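For reference, the BCH expansion mentioned in this abstract has the standard form below. The symbols are assumptions for illustration: H is the electronic Hamiltonian and A an anti-Hermitian generator, so that exp(A) is unitary. The paper's own notation and sign convention may differ.

```latex
\bar{H} = e^{-\hat{A}} \hat{H} e^{\hat{A}}
        = \hat{H} + [\hat{H},\hat{A}]
          + \frac{1}{2!}\,[[\hat{H},\hat{A}],\hat{A}]
          + \frac{1}{3!}\,[[[\hat{H},\hat{A}],\hat{A}],\hat{A}] + \cdots
```

In this form, the "first three commutators" are the single, double, and triple nested commutator terms; everything beyond them is where an approximation such as the recursive commutator scheme would enter.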

Citations: 0

Computing Exchange Coupling Constants in Transition Metal Complexes with Tensor Product Selected Configuration Interaction

IF 5.5 · Zone 1 Chemistry · Q2 Chemistry, Physical

Journal of Chemical Theory and Computation

Pub Date: 2026-02-04 DOI: 10.1021/acs.jctc.5c01817

Arnab Bachhar, Nicholas J. Mayhall

Transition metal complexes present significant challenges for electronic structure theory due to strong electron correlation arising from partially filled d-orbitals. We compare our recently developed Tensor Product Selected Configuration Interaction (TPSCI) with Density Matrix Renormalization Group (DMRG) for computing exchange coupling constants in six transition metal systems, including dinuclear Cr, Fe, and Mn complexes and a tetranuclear Ni-cubane. TPSCI uses a locally correlated tensor product state basis to capture electronic structure efficiently while maintaining interpretability. From calculations on active spaces ranging from (22e,29o) to (42e,49o), we find that TPSCI consistently yields higher variational energies than DMRG due to truncation of local cluster states, but provides magnetic exchange coupling constants (J) generally within 10–30 cm⁻¹ of DMRG results. Key advantages include natural multistate capability enabling direct J extrapolation with smaller statistical errors, and computational efficiency for challenging systems. However, cluster state truncation represents a fundamental limitation requiring careful convergence testing, particularly for large local cluster dimensions. We identify specific failure cases where current truncation schemes break down, highlighting the need for improved cluster state selection methods and distributed memory implementations to realize TPSCI’s full potential for strongly correlated systems.
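As background on how J relates to computed spin-state energies, a two-center Heisenberg model is commonly used. The sketch below assumes the −2J sign convention, which is only one of several in use; the abstract does not state which convention the authors adopt. S₁ and S₂ are the local spins and S the total spin.

```latex
\hat{H} = -2J\,\hat{\mathbf{S}}_1 \cdot \hat{\mathbf{S}}_2,
\qquad
E(S) = -J\left[S(S+1) - S_1(S_1+1) - S_2(S_2+1)\right],
\qquad
E(S) - E(S-1) = -2JS
```

Under this convention, the gap between adjacent spin states gives J directly, which is why access to several states at once is convenient for extracting it.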

Citations: 0

Marcus Theory and the Condon Approximation Revisited I: E-SHAKE and Seam Sampling

IF 5.5 · Zone 1 Chemistry · Q2 Chemistry, Physical

Journal of Chemical Theory and Computation

Pub Date: 2026-02-03 DOI: 10.1021/acs.jctc.5c01733

D. Vale Cofer-Shabica, Jennifer R. DeRosa, Joseph E. Subotnik

Marcus theory is the workhorse of theoretical chemistry for predicting the rates of charge and energy transfer. Marcus theory overwhelmingly agrees with experiment, for both electron transfer and triplet energy transfer, for the famous set of naphthalene-bridge-biphenyl and naphthalene-bridge-benzophenone systems studied by Piotrowiak, Miller, and Closs. That being said, the agreement is not perfect, and in this manuscript, we revisit one key point of disagreement: the molecule C-13-ae ([3,equatorial]-naphthalene-cyclohexane-[1,axial]-benzophenone). To better understand the theory–experiment disagreement, we introduce and employ a novel scheme to sample the seam between two diabatic electronic states (E-SHAKE), through which we reveal the breakdown of the Condon approximation and the presence of a conical intersection for the C-13-ae molecule; we also predict an isotopic effect on the rate of triplet–triplet energy transfer.
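For orientation, the textbook nonadiabatic Marcus rate can be sketched as follows. This is the standard high-temperature expression with a constant diabatic coupling (the Condon approximation), not the E-SHAKE scheme of the paper. The function name `marcus_rate` and the numerical inputs in the usage line are hypothetical.

```python
import math

HBAR_EV_S = 6.582119569e-16   # reduced Planck constant in eV*s
KB_EV_K = 8.617333262e-5      # Boltzmann constant in eV/K

def marcus_rate(coupling_ev, reorg_ev, delta_g_ev, temperature_k=300.0):
    """Nonadiabatic Marcus rate in 1/s.

    coupling_ev : diabatic coupling, assumed constant (Condon approximation)
    reorg_ev    : reorganization energy (lambda), must be > 0
    delta_g_ev  : driving force (reaction free energy)
    """
    kt = KB_EV_K * temperature_k
    prefactor = (2.0 * math.pi / HBAR_EV_S) * coupling_ev ** 2
    # Franck-Condon weighted density of states in the classical limit
    gaussian = math.exp(-((delta_g_ev + reorg_ev) ** 2) / (4.0 * reorg_ev * kt))
    return prefactor * gaussian / math.sqrt(4.0 * math.pi * reorg_ev * kt)

# Illustrative inputs only (not taken from the paper).
print(f"{marcus_rate(0.001, 0.5, -0.3):.3e} 1/s")
```

The rate peaks where the driving force equals minus the reorganization energy. A coupling that varies along the seam, as the abstract reports for C-13-ae, is exactly what this constant-coupling form cannot capture.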

Citations: 0

QO-BRA: A Quantum Operator-Based Autoencoder for De Novo Molecular Design

IF 5.5 · Zone 1 Chemistry · Q2 Chemistry, Physical

Journal of Chemical Theory and Computation

Pub Date: 2026-02-03 DOI: 10.1021/acs.jctc.5c01704

Yue Yu, Francesco Calcagno, Haote Li, Victor S. Batista

We introduce a variational quantum autoencoder tailored for de novo molecular design, named QO-BRA (Quantum Operator-Based Real Amplitude autoencoder). QO-BRA leverages quantum circuits for real-amplitude encoding and the SWAP test to estimate reconstruction and latent-space regularization errors during back-propagation. Adjoint encoder and decoder operators enable unitary transformations and a generative process that ensures accurate reconstruction, as well as the novelty, uniqueness, and validity of the generated samples. We showcase the capabilities of QO-BRA as applied to the de novo design of Ca²⁺-, Mg²⁺-, and Zn²⁺-binding metalloproteins after training the generative model with a modest data set.
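The role of the SWAP test can be illustrated classically. For two pure states, the ideal probability of measuring the ancilla in state 0 is (1 + |⟨a|b⟩|²)/2, so the squared overlap can be recovered from that probability. The sketch below evaluates this formula with NumPy for real-amplitude vectors; it does not simulate a circuit, and the function names are hypothetical rather than part of QO-BRA.

```python
import numpy as np

def real_amplitude_state(values):
    """Normalize a real vector so it can serve as quantum state amplitudes."""
    v = np.asarray(values, dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0.0:
        raise ValueError("cannot encode the zero vector")
    return v / norm

def swap_test_p0(state_a, state_b):
    """Ideal probability of ancilla outcome 0 in a SWAP test."""
    overlap = float(np.dot(state_a, state_b))
    return 0.5 * (1.0 + overlap ** 2)

def squared_overlap_from_p0(p0):
    """Invert the SWAP-test relation: |<a|b>|^2 = 2*p0 - 1."""
    return 2.0 * p0 - 1.0

# Illustrative vectors only: an input and an imperfect reconstruction.
original = real_amplitude_state([0.2, 0.5, 0.1, 0.7])
reconstructed = real_amplitude_state([0.25, 0.45, 0.1, 0.7])
p0 = swap_test_p0(original, reconstructed)
print("reconstruction error:", 1.0 - squared_overlap_from_p0(p0))
```

On hardware, `p0` would be estimated from repeated measurements, so the recovered overlap carries sampling noise that this ideal calculation omits.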

Citations: 0

Multi-Objective Optimization of Approximate Functionals via Implicit Interdependency Modeling

IF 5.5 · Zone 1 Chemistry · Q2 Chemistry, Physical

Journal of Chemical Theory and Computation

Pub Date: 2026-02-02 DOI: 10.1021/acs.jctc.5c01328

Ziying Yuan, Neil Qiang Su

Accurate and transferable exchange–correlation (XC) functionals are central to the predictive power of density functional theory (DFT). However, conventional parameter optimization of XC functionals is typically performed using single-objective or stepwise strategies, which may lead to imbalanced performance across chemically diverse systems. This work introduces a multi-objective optimization framework, termed EBI4MO (explicit-by-implicit for multi-objectives), that enables simultaneous and consistent optimization with respect to multiple performance criteria. EBI4MO constructs a hierarchy of implicit functions that couple interdependent parameter groups across objectives, allowing sequential yet interlinked parameter updates. As a demonstration, EBI4MO is applied to optimize the parameters in hybrid XC functionals with dispersion corrections, using the GMTKN55 benchmark database. Two objectives are considered: minimizing the overall prediction error and achieving uniform improvement relative to B3LYP-D3(BJ), a widely used and balanced functional. The resulting functionals demonstrate consistent and balanced performance across all benchmark subsets, outperforming functionals optimized via conventional single-objective or stepwise methods. These results highlight the effectiveness and generality of EBI4MO, offering a new strategy for functional development and broader multi-objective optimization problems in computational chemistry.

Citations: 0

Scalable Force Fields for Metal-Mediated DNA Nanostructures

IF 5.5 · Zone 1 Chemistry · Q2 Chemistry, Physical

Journal of Chemical Theory and Computation

Pub Date: 2026-02-02 DOI: 10.1021/acs.jctc.5c01426

William Livernois, Olaiyan Alolaiyan, Arpan De, M. P. Anantram

Force fields were developed for metal-mediated DNA (mmDNA) structures, using ab initio methods to parametrize metal coordination. Two mmDNA structures were considered, using cytosine/thymine mismatches with coordinated Ag and Hg metal atoms. These base pairs were parametrized with the proposed computational framework and subjected to multiple validation steps. The generated force fields showed enhanced structural stability within the metalated base pairs, with the coordinated metal rotating into the major groove. Our findings show a higher propeller angle associated with the metalated base pair, which agrees with previously reported experimental data. The developed force fields have the potential to unveil the structural dynamics of long metalated DNA nanowires, while results have been demonstrated on a chain of three mmDNA base pairs.

Citations: 0

From Pretrained to Precision: Fine-Tuning Universal Interatomic Potentials for Accurate Catalytic Reaction Simulations

IF 5.5 · Zone 1 Chemistry · Q2 Chemistry, Physical

Journal of Chemical Theory and Computation

Pub Date: 2026-02-02 DOI: 10.1021/acs.jctc.5c01455

Jinzhe Ma, Xiaoyan Fu, Wenbo Xie, P. Hu

Universal machine learning interatomic potentials (uMLIPs) represent a significant advancement in interatomic potential modeling, offering remarkable predictive accuracy across a wide range of chemical systems. However, their applications in catalytic reaction simulation are limited by their lack of accuracy in describing reactions, especially in reaction barrier prediction. In this study, we evaluate two established uMLIPs and use fine-tuning strategies to enhance their performance for catalytic reaction prediction. We systematically compared the predictive accuracy, data efficiency, and generalization capabilities of two approaches, fine-tuning and training from scratch, using the accuracy of the original pretrained uMLIPs as a baseline. Specifically, we evaluated the applicability of the approaches across a range of tasks, from relatively simple applications such as molecular dynamics (MD) simulations and adsorption energy calculations to more complex challenges such as transition state searches. We also analyzed model performance across varying training set sizes to identify the critical data threshold needed for accurate reaction predictions. Additionally, we assessed the extrapolative generalization of the models by examining improvements in predictive accuracy for unseen elements following fine-tuning across both simple and complex tasks. Our results show that fine-tuning uMLIPs significantly improves the accuracy of reaction energy predictions, reducing the mean absolute error (MAE) to 0.09 eV, compared to 0.38 eV for the original uMLIPs. Notably, the fine-tuned models require only 10%–30% of the data used for training from scratch, yielding stable and reliable performance. Moreover, the generalization capabilities of the uMLIPs were preserved after fine-tuning. This approach shows significant promise for extending the applicability of uMLIPs to diverse catalytic reaction systems.

Citations: 0

Scalable Quantum Monte Carlo Method for Polariton Chemistry via Mixed Block Sparsity and Tensor Hypercontraction Method

IF 5.5 · Zone 1 Chemistry · Q2 Chemistry, Physical

Journal of Chemical Theory and Computation

Pub Date: 2026-02-02 DOI: 10.1021/acs.jctc.5c01925

Yu Zhang

We present a reduced-scaling auxiliary-field quantum Monte Carlo (AFQMC) framework designed for large molecular systems and ensembles, with or without coupling to optical cavities. Our approach leverages the natural block sparsity of the Cholesky decomposition (CD) of electron repulsion integrals in molecular ensembles and employs tensor hypercontraction (THC) to efficiently compress low-rank Cholesky blocks. By representing the Cholesky vectors in a mixed format, keeping high-rank blocks in block-sparse form and compressing low-rank blocks with THC, we reduce the scaling of exchange-energy evaluation from quartic to robust cubic in the number of molecular orbitals N, while lowering memory from cubic toward quadratic. Benchmark analyses on one-, two-, and three-dimensional molecular ensembles (up to ∼1,200 orbitals) show that (a) the number of nonzeros in Cholesky tensors grows linearly with system size across dimensions; (b) the average numerical rank increases sublinearly and does not saturate at these sizes; and (c) rank heterogeneity, with some blocks nearly full rank and many of low rank, naturally motivates the proposed mixed block-sparsity and THC scheme for efficient calculation of the exchange energy. We demonstrate that the mixed scheme yields cubic wall-time scaling with favorable prefactors and preserves AFQMC accuracy.
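The idea of treating blocks differently according to their rank can be sketched generically. The code below estimates the numerical rank of a matrix block from its singular values and labels the block for either dense storage or low-rank compression. It does not implement THC or the paper's selection rule; the thresholds `rel_tol` and `rank_fraction` and both function names are hypothetical.

```python
import numpy as np

def numerical_rank(block, rel_tol=1e-8):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(np.asarray(block, dtype=float), compute_uv=False)
    if s.size == 0 or s[0] == 0.0:
        return 0
    return int(np.sum(s > rel_tol * s[0]))

def choose_storage(block, rank_fraction=0.5, rel_tol=1e-8):
    """Hypothetical rule: compress when the rank is well below full rank."""
    full_rank = min(np.shape(block))
    if numerical_rank(block, rel_tol) < rank_fraction * full_rank:
        return "compress"   # candidate for a low-rank factorization
    return "keep"           # store the block as it is

# Illustrative blocks only: one rank-1 block and one full-rank block.
rng = np.random.default_rng(1)
low = np.outer(rng.normal(size=6), rng.normal(size=6))
high = np.eye(4)
print(choose_storage(low), choose_storage(high))
```

In a mixed scheme of the kind the abstract describes, such a per-block decision is what would let nearly full-rank blocks and low-rank blocks each use the cheaper representation.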

Citations: 0

Enhanced Representation-Based Sampling for the Efficient Generation of Data Sets for Machine-Learned Interatomic Potentials

IF 5.5 · Zone 1 Chemistry · Q2 Chemistry, Physical

Journal of Chemical Theory and Computation

Pub Date: 2026-02-02 DOI: 10.1021/acs.jctc.5c01767

Moritz R. Schäfer, Johannes Kästner

In this work, we present enhanced representation-based sampling (ERBS), a novel enhanced sampling method designed to generate structurally diverse training data sets for machine-learned interatomic potentials. ERBS automatically identifies collective variables by dimensionality reduction of atomic descriptors and applies a bias potential inspired by the On-the-Fly probability enhanced sampling framework. We highlight the ability of Gaussian moment descriptors to capture collective molecular motions and explore the impact of biasing parameters using alanine dipeptide as a benchmark system. We show that free energy surfaces can be reconstructed with high fidelity using only short biased trajectories as training data. Further, we apply the method to the iterative construction of a liquid water data set and compare the quality of simulated self-diffusion coefficients for models trained with molecular dynamics and ERBS data. In addition, we active-learn models for liquid water with and without enhanced sampling and compare the quality of simulated self-diffusion coefficients. The self-diffusion coefficients closely match those simulated with a reference model at a significantly reduced data set size. Finally, we compare the sampling behavior of enhanced sampling methods by benchmarking the mean squared displacements of BMIM⁺BF₄⁻ trajectories simulated with uncertainty-driven dynamics and ERBS and find that the latter significantly increases the exploration of configurational space.
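The two observables used for comparison here, the mean squared displacement (MSD) and the self-diffusion coefficient, can be sketched as follows. The code assumes unwrapped positions in an array of shape (frames, atoms, 3), uses a single time origin, and applies the three-dimensional Einstein relation D = slope/6 with a linear fit over the whole series. The function names and the random-walk test data are hypothetical, not the authors' analysis pipeline.

```python
import numpy as np

def mean_squared_displacement(positions):
    """MSD relative to frame 0, averaged over atoms.

    positions: array of shape (n_frames, n_atoms, 3), unwrapped coordinates.
    """
    positions = np.asarray(positions, dtype=float)
    disp = positions - positions[0]
    return np.mean(np.sum(disp ** 2, axis=-1), axis=1)

def self_diffusion_coefficient(msd, dt):
    """Einstein relation in 3D: D = (slope of MSD versus time) / 6."""
    t = np.arange(len(msd)) * dt
    slope = np.polyfit(t, msd, deg=1)[0]
    return float(slope / 6.0)

# Illustrative random walk only (NOT simulation data from the paper).
rng = np.random.default_rng(2)
steps = rng.normal(0.0, 0.1, size=(500, 64, 3))
trajectory = np.cumsum(steps, axis=0)
msd = mean_squared_displacement(trajectory)
print("D (arbitrary units):", self_diffusion_coefficient(msd, dt=1.0))
```

In practice the fit is usually restricted to the linear, long-time part of the MSD and averaged over several time origins; both refinements are left out here for brevity.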

Citations: 0
