The two-electron reduced density matrix (2RDM) carries enough information to evaluate the electronic energy of a many-electron system. The variational 2RDM (v2RDM) approach seeks to determine the 2RDM directly, without knowledge of the wave function, by minimizing this energy with respect to variations in the elements of the 2RDM, while also enforcing known N-representability conditions. In this tutorial review, we provide an overview of the theoretical underpinnings of the v2RDM approach and the N-representability constraints that are typically applied to the 2RDM. We also discuss the semidefinite programming (SDP) techniques used in v2RDM computations and provide enough Python code to develop a working v2RDM code that interfaces to the libSDP library of SDP solvers.
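The optimization problem summarized in this abstract can be sketched as follows. This is one common convention, assumed here rather than taken from the abstract: the index ordering, the chemist-notation integrals $(pr|qs)$, and the normalization $\mathrm{Tr}\,{}^2D = N(N-1)$ are illustrative choices. The positivity conditions shown are the frequently used two-particle (D, Q, G) conditions, which are one example of "known N-representability conditions".

```latex
% Energy as a linear functional of the 1RDM and 2RDM (assumed convention)
E = \sum_{pq} h_{pq}\,{}^{1}D^{p}_{q}
  + \frac{1}{2}\sum_{pqrs} (pr|qs)\,{}^{2}D^{pq}_{rs},
\qquad
{}^{2}D^{pq}_{rs} = \langle\Psi|\,\hat{a}^{\dagger}_{p}\hat{a}^{\dagger}_{q}\hat{a}_{s}\hat{a}_{r}\,|\Psi\rangle

% Trace and contraction constraints for an N-electron state
\sum_{pq} {}^{2}D^{pq}_{pq} = N(N-1),
\qquad
\sum_{q} {}^{2}D^{pq}_{rq} = (N-1)\,{}^{1}D^{p}_{r}

% Two-particle positivity conditions
{}^{2}D \succeq 0, \qquad {}^{2}Q \succeq 0, \qquad {}^{2}G \succeq 0

% Generic primal semidefinite program: x collects the RDM blocks,
% c the integrals, and Ax = b the linear constraints above
\min_{x}\; c^{T}x \quad \text{subject to} \quad Ax = b, \quad x \succeq 0
```

Because the energy and all constraints are linear in the RDM elements, and the only nonlinear requirement is positive semidefiniteness, the minimization takes the form of a semidefinite program.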
{"title":"Variational determination of the two-electron reduced density matrix: A tutorial review","authors":"A. Eugene DePrince III","doi":"10.1002/wcms.1702","DOIUrl":"https://doi.org/10.1002/wcms.1702","PeriodicalId":236,"journal":{"name":"Wiley Interdisciplinary Reviews: Computational Molecular Science","volume":"14 1"},"PeriodicalIF":11.4,"publicationDate":"2024-01-17","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139488607","RegionNum":2,"RegionCategory":"Chemistry","Total":0}
Sarah Löffelsender, Pierre Beaujean, Marc de Wergifosse
The cover image is based on the Advanced Review Simplified quantum chemistry methods to evaluate non-linear optical properties of large systems by Sarah Löffelsender et al., https://doi.org/10.1002/wcms.1695
{"title":"Cover Image, Volume 14, Issue 1","authors":"Sarah Löffelsender, Pierre Beaujean, Marc de Wergifosse","doi":"10.1002/wcms.1709","DOIUrl":"https://doi.org/10.1002/wcms.1709","PeriodicalId":236,"journal":{"name":"Wiley Interdisciplinary Reviews: Computational Molecular Science","volume":"14 1"},"PeriodicalIF":16.8,"publicationDate":"2024-01-11","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/wcms.1709","platform":"Semanticscholar","paperid":"142435061","RegionNum":2,"RegionCategory":"Chemistry","PMCID":"OA","Total":0}
The field of ultrafast electron dynamics has made rapid progress in the last few years. With Jellyfish, we now introduce a program suite that enables users to perform the entire workflow of an electron-dynamics simulation. The modular program architecture offers a flexible combination of different propagators, Hamiltonians, basis sets, and more. Jellyfish can be operated through a graphical user interface, which makes it easy for nonspecialist users to get started and gives experienced users a clear overview of the entire functionality. The temporal evolution of a wave function can currently be computed in the time-dependent configuration interaction (TDCI) formalism; however, a plugin system facilitates the expansion to other methods and tools without requiring in-depth knowledge of the program. Currently developed plugins allow the inclusion of results from conventional electronic structure calculations as well as the use and extension of quantum-compute algorithms for electron dynamics. We present the capabilities of Jellyfish on three examples that showcase the simulation and analysis of light-driven correlated electron dynamics. The implemented visualization of various densities enables an efficient and detailed analysis in the long-standing quest to understand electron–hole pair formation.
{"title":"Jellyfish: A modular code for wave function-based electron dynamics simulations and visualizations on traditional and quantum compute architectures","authors":"Fabian Langkabel, Pascal Krause, Annika Bande","doi":"10.1002/wcms.1696","DOIUrl":"10.1002/wcms.1696","PeriodicalId":236,"journal":{"name":"Wiley Interdisciplinary Reviews: Computational Molecular Science","volume":"14 1"},"PeriodicalIF":11.4,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/wcms.1696","platform":"Semanticscholar","paperid":"138505210","RegionNum":2,"RegionCategory":"Chemistry","PMCID":"OA","Total":0}
Molecules are complex dynamic objects that can exist in different molecular forms (conformations, tautomers, stereoisomers, protonation states, etc.), and it is often not known which molecular form is responsible for the observed physicochemical and biological properties of a given molecule. This raises the problem of selecting the correct molecular form for machine learning modeling of target properties. The same problem is common to biological molecules (RNA, DNA, proteins): long sequences in which only key segments, which often cannot be located precisely, are involved in biological functions. Multi-instance machine learning (MIL) is an efficient approach for solving problems where the objects under study cannot be uniquely represented by a single instance, but rather by a set of multiple alternative instances. Multi-instance learning was formalized in 1997, motivated by the problem of conformation selection in drug activity prediction tasks. Since then, MIL has found many applications in various domains, such as information retrieval, computer vision, signal processing, and bankruptcy prediction. In this review, we describe the MIL framework and its applications to tasks associated with ambiguity in the representation of small and biological molecules in chemoinformatics and bioinformatics. We have collected examples that demonstrate the advantages of MIL over the traditional single-instance learning (SIL) approach. Special attention is paid to the ability of MIL models to identify the key instances responsible for a modeled property.
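The multi-instance idea described in this abstract can be illustrated with a minimal sketch. It assumes the simplest aggregation rule (the bag score is the maximum instance score), which is only one of several MIL formulations. The names `predict_bag` and `toy_score`, the weights, and the two-component conformer descriptors are hypothetical and not taken from the review.

```python
from typing import Callable, Sequence, Tuple

Instance = Sequence[float]  # e.g. a descriptor vector for one conformer


def predict_bag(
    bag: Sequence[Instance],
    score_instance: Callable[[Instance], float],
) -> Tuple[float, int]:
    """Score a bag (one molecule) from its instances (e.g. conformers).

    Uses max-pooling: the bag score is the best instance score, and the
    index of that instance is returned as the "key instance".
    """
    scores = [score_instance(x) for x in bag]
    key_index = max(range(len(scores)), key=scores.__getitem__)
    return scores[key_index], key_index


def toy_score(x: Instance) -> float:
    """Hypothetical instance-level model: a fixed linear score."""
    weights = (1.0, -0.5)  # illustrative weights, not fitted to any data
    return sum(w * xi for w, xi in zip(weights, x))


# One molecule represented as a bag of three made-up conformer descriptors.
molecule_conformers = [(0.2, 0.9), (1.5, 0.1), (0.7, 0.7)]
bag_score, key_index = predict_bag(molecule_conformers, toy_score)
print(f"bag score = {bag_score:.2f}, key instance index = {key_index}")
```

A single-instance model would have to commit to one conformer in advance. Here the label is attached to the whole bag, and the key instance falls out of the prediction.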
{"title":"Chemical complexity challenge: Is multi-instance machine learning a solution?","authors":"Dmitry Zankov, Timur Madzhidov, Alexandre Varnek, Pavel Polishchuk","doi":"10.1002/wcms.1698","DOIUrl":"10.1002/wcms.1698","PeriodicalId":236,"journal":{"name":"Wiley Interdisciplinary Reviews: Computational Molecular Science","volume":"14 1"},"PeriodicalIF":11.4,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/wcms.1698","platform":"Semanticscholar","paperid":"138505251","RegionNum":2,"RegionCategory":"Chemistry","PMCID":"OA","Total":0}
Peptide-based drugs offer high specificity, potency, and selectivity. However, their inherent flexibility and differences in conformational preferences between their free and bound states create unique challenges that have hindered progress in effective drug discovery pipelines. The emergence of AlphaFold (AF) and Artificial Intelligence (AI) presents new opportunities for enhancing peptide-based drug discovery. We explore recent advancements that facilitate a successful peptide drug discovery pipeline, considering peptides' attractive therapeutic properties and strategies to enhance their stability and bioavailability. AF enables efficient and accurate prediction of peptide-protein structures, addressing a critical requirement in computational drug discovery pipelines. In the post-AF era, we are witnessing rapid progress with the potential to revolutionize peptide-based drug discovery, such as the ability to rank peptide binders or classify them as binders/non-binders and the ability to design novel peptide sequences. However, AI-based methods are struggling due to the lack of well-curated datasets, for example, to accommodate modified amino acids or unconventional cyclization. Thus, physics-based methods, such as docking or molecular dynamics (MD) simulations, continue to hold a complementary role in peptide drug discovery pipelines. Moreover, MD-based tools offer valuable insights into binding mechanisms, as well as the thermodynamic and kinetic properties of complexes. As we navigate this evolving landscape, a synergistic integration of AI and physics-based methods holds the promise of reshaping peptide-based drug discovery.
{"title":"Revolutionizing peptide-based drug discovery: Advances in the post-AlphaFold era","authors":"Liwei Chang, Arup Mondal, Bhumika Singh, Yisel Martínez-Noa, Alberto Perez","doi":"10.1002/wcms.1693","DOIUrl":"10.1002/wcms.1693","PeriodicalId":236,"journal":{"name":"Wiley Interdisciplinary Reviews: Computational Molecular Science","volume":"14 1"},"PeriodicalIF":11.4,"publicationDate":"2023-11-12","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"135036898","RegionNum":2,"RegionCategory":"Chemistry","Total":0}
This review presents the theoretical background concerning simplified quantum chemistry (sQC) methods to compute non-linear optical (NLO) properties and their applications to large systems. To evaluate any NLO response, such as hyperpolarizabilities or two-photon absorption (2PA), one must first perform a ground-state calculation and then compute its response. Because of this, methods used to compute ground states of large systems are outlined, especially the xTB (extended tight-binding) scheme. An overview of approaches to compute excited-state and response properties is given, emphasizing the simplified time-dependent density functional theory (sTD-DFT). The formalism of the eXact integral sTD-DFT (XsTD-DFT) method is also introduced. For the first hyperpolarizability, 2PA, excited-state absorption, and second hyperpolarizability, a brief historical review is given of early-stage semi-empirical method applications to systems that were considered large at the time. Then, we showcase recent applications of sQC methods, especially the sTD-DFT scheme, to large challenging systems such as fluorescent proteins or fluorescent organic nanoparticles, as well as dynamic structural effects on flexible tryptophan-rich peptides and gramicidin A. Thanks to the sTD-DFT-xTB scheme, all-atom quantum chemistry methodologies are now possible for the computation of the first hyperpolarizability and 2PA of systems of up to 5000 atoms. This review concludes by summing up current and future method developments in the sQC framework as well as forthcoming applications to large systems.
{"title":"Simplified quantum chemistry methods to evaluate non-linear optical properties of large systems","authors":"Sarah Löffelsender, Pierre Beaujean, Marc de Wergifosse","doi":"10.1002/wcms.1695","DOIUrl":"10.1002/wcms.1695","PeriodicalId":236,"journal":{"name":"Wiley Interdisciplinary Reviews: Computational Molecular Science","volume":"14 1"},"PeriodicalIF":11.4,"publicationDate":"2023-11-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"135726599","RegionNum":2,"RegionCategory":"Chemistry","Total":0}
Nucleation is the initial step in the formation of crystalline materials from solutions. Various factors, such as environmental conditions, composition, and external fields, can influence its outcomes and rates. Indeed, controlling this rate-determining step toward phase separation is critical, as it can significantly impact the resulting material's structure and properties. Atomistic simulations can be exploited to gain insight into nucleation mechanisms (an aspect difficult to ascertain in experiments) and to estimate nucleation rates. However, the microscopic nature of simulations can influence the phase behavior of nucleating solutions when compared to macroscale counterparts. An additional challenge arises from the inadequate timescales accessible to standard molecular simulations to simulate nucleation directly; this is due to the inherent rareness of nucleation events, which may be apparent in silico even at high supersaturations. In recent decades, molecular simulation methods have emerged to circumvent length- and timescale limitations. However, it is not always clear which simulation method is most suitable to study crystal nucleation from solution. This review surveys recent advances in this field, shedding light on typical nucleation mechanisms and the appropriateness of various simulation techniques for their study. Our goal is to provide a deeper understanding of the complexities associated with modeling crystal nucleation from solution and to identify areas for further research. This review targets researchers across various scientific domains, including materials science, chemistry, physics, and engineering, and aims to foster collaborative efforts to develop new strategies to understand and control nucleation.
{"title":"Molecular simulation approaches to study crystal nucleation from solutions: Theoretical considerations and computational challenges","authors":"Aaron R. Finney, Matteo Salvalaglio","doi":"10.1002/wcms.1697","DOIUrl":"10.1002/wcms.1697","PeriodicalId":236,"journal":{"name":"Wiley Interdisciplinary Reviews: Computational Molecular Science","volume":"14 1"},"PeriodicalIF":11.4,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/wcms.1697","platform":"Semanticscholar","paperid":"135272408","RegionNum":2,"RegionCategory":"Chemistry","PMCID":"OA","Total":0}
The resolution-of-the-identity (RI) or density fitting (DF) approximation for the electron repulsion integrals (ERIs) has become a standard component of accelerated and reduced-scaling implementations of first-principles Gaussian-type orbital electronic-structure methods. The Cholesky decomposition (CD) of the ERIs has also become increasingly deployed across quantum chemistry packages in the last decade, even though its early applications were mostly limited to high-accuracy methods such as coupled-cluster theory and multiconfigurational approaches. Starting with a summary of the basic theory underpinning both the CD and RI/DF approximations, thus underlining the extremely close relation of the CD and RI/DF techniques, we provide a brief and largely chronological review of the evolution of the CD approach from its birth in 1977 to its current state. In addition to being a purely numerical procedure for handling ERIs, thus providing robust and computationally efficient approximations to the exact ERIs that have been found increasingly useful on modern computer platforms, CD also offers highly accurate approaches for generating auxiliary basis sets for the RI/DF approximation on the fly due to the deep mathematical connection between the two approaches. In this review, we aim to provide a concise reference of the main techniques employed in various CD approaches in electronic structure theory, to exemplify the connection between the CD and RI/DF approaches, and to clarify the state of the art to guide new implementations of CD approaches across electronic structure programs.
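The numerical procedure behind the CD approach can be sketched with a pivoted, incomplete Cholesky decomposition of a small positive semidefinite matrix. This is a minimal sketch in pure Python, not the implementation of any quantum chemistry package. In the ERI setting, the matrix `M` would be indexed by orbital pairs, with `M[pq][rs]` standing for the integral (pq|rs). The function name `pivoted_cholesky`, the threshold `tol`, and the toy matrix are illustrative assumptions.

```python
from typing import List


def pivoted_cholesky(M: List[List[float]], tol: float = 1e-10) -> List[List[float]]:
    """Return Cholesky vectors L_J such that M[i][j] ~= sum_J L_J[i] * L_J[j].

    At each step the largest remaining diagonal element is chosen as the
    pivot, and the loop stops once every residual diagonal is below `tol`.
    The number of vectors is then the numerical rank at that threshold.
    """
    n = len(M)
    diag = [M[i][i] for i in range(n)]  # residual diagonal
    vectors: List[List[float]] = []
    while len(vectors) < n:
        p = max(range(n), key=diag.__getitem__)
        if diag[p] <= tol:
            break
        # Column p of the residual matrix M - sum_J L_J L_J^T.
        col = [M[i][p] - sum(L[i] * L[p] for L in vectors) for i in range(n)]
        scale = diag[p] ** 0.5
        new_vec = [c / scale for c in col]
        vectors.append(new_vec)
        for i in range(n):
            diag[i] -= new_vec[i] ** 2
    return vectors


# Toy rank-2 matrix M = B B^T with B = [(1,0), (0,1), (1,1), (2,-1)].
M = [
    [1.0, 0.0, 1.0, 2.0],
    [0.0, 1.0, 1.0, -1.0],
    [1.0, 1.0, 2.0, 1.0],
    [2.0, -1.0, 1.0, 5.0],
]
vectors = pivoted_cholesky(M)
max_error = max(
    abs(M[i][j] - sum(L[i] * L[j] for L in vectors))
    for i in range(4)
    for j in range(4)
)
print(f"{len(vectors)} Cholesky vectors, max reconstruction error {max_error:.1e}")
```

The pivots selected in this way identify the orbital-pair products that span the matrix to within the threshold, which is the link to on-the-fly auxiliary basis generation mentioned in the abstract.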
{"title":"The versatility of the Cholesky decomposition in electronic structure theory","authors":"Thomas Bondo Pedersen, Susi Lehtola, Ignacio Fdez. Galván, Roland Lindh","doi":"10.1002/wcms.1692","DOIUrl":"10.1002/wcms.1692","PeriodicalId":236,"journal":{"name":"Wiley Interdisciplinary Reviews: Computational Molecular Science","volume":"14 1"},"PeriodicalIF":11.4,"publicationDate":"2023-10-25","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/wcms.1692","platform":"Semanticscholar","paperid":"135215975","RegionNum":2,"RegionCategory":"Chemistry","PMCID":"OA","Total":0}
Retrosynthesis is the cornerstone of organic chemistry, giving chemists in materials and drug manufacturing access to scarce and entirely new molecules. Conventional rule-based or expert-based computer-aided synthesis has obvious limitations, such as high labor costs and limited search space. In recent years, dramatic breakthroughs driven by deep learning have revolutionized retrosynthesis. Here we aim to present a comprehensive review of recent advances in AI-based retrosynthesis. For both single-step and multi-step retrosynthesis, we first introduce their goals and provide a thorough taxonomy of existing methods. Afterwards, we analyze these methods in terms of their mechanism and performance and introduce popular evaluation metrics, together with a detailed comparison of representative methods on several public datasets. In the next part, we introduce popular databases and established platforms for retrosynthesis. Finally, this review concludes with a discussion of promising research directions in this field.
{"title":"Recent advances in deep learning for retrosynthesis","authors":"Zipeng Zhong, Jie Song, Zunlei Feng, Tiantao Liu, Lingxiang Jia, Shaolun Yao, Tingjun Hou, Mingli Song","doi":"10.1002/wcms.1694","DOIUrl":"10.1002/wcms.1694","PeriodicalId":236,"journal":{"name":"Wiley Interdisciplinary Reviews: Computational Molecular Science","volume":"14 1"},"PeriodicalIF":11.4,"publicationDate":"2023-10-20","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"135570895","RegionNum":2,"RegionCategory":"Chemistry","Total":0}
All processes involving molecular systems entail a balance between the associated enthalpic and entropic changes. Molecular dynamics simulations of the end-points of a process provide the enthalpy straightforwardly as an ensemble average. Obtaining absolute entropies is still an open problem, and pathway methods are most commonly used to obtain free energy changes and, from these, entropy changes. The kth nearest neighbor (kNN) method was first proposed as a general method for entropy estimation in the mathematical community 20 years ago. It was later applied to compute conformational, positional–orientational, and hydration entropies of molecules. Programs based on the kNN method that compute entropies from molecular ensembles, for example from molecular dynamics (MD) trajectories, are currently available. The kNN method has a distinct advantage over traditional methods: it can address high-dimensional spaces, which cannot be treated without loss of resolution or drastic approximations by, for example, histogram-based methods. Applying the method requires understanding: the kth nearest neighbor method for entropy estimation; the variables relevant to biomolecular and, more generally, molecular processes; the metrics associated with such variables; the practical implementation of the method, including its intrinsic requirements and limitations; and the applications to conformational, position/orientation, and solvation entropy. By coupling the method with general approximations for the multivariable entropy based on mutual information, it is possible to address high-dimensional problems such as those involving the conformation of proteins and nucleic acids, the binding of molecules, and hydration.
{"title":"The kth nearest neighbor method for estimation of entropy changes from molecular ensembles","authors":"Federico Fogolari, Roberto Borelli, Agostino Dovier, Gennaro Esposito","doi":"10.1002/wcms.1691","DOIUrl":"10.1002/wcms.1691","url":null,"abstract":"<p>All processes involving molecular systems entail a balance between associated enthalpic and entropic changes. Molecular dynamics simulations of the end-points of a process provide in a straightforward way the enthalpy as an ensemble average. Obtaining absolute entropies is still an open problem and most commonly pathway methods are used to obtain free energy changes and thereafter entropy changes. The <i>k</i>th nearest neighbor (kNN) method has been first proposed as a general method for entropy estimation in the mathematical community 20 years ago. Later, it has been applied to compute conformational, positional–orientational, and hydration entropies of molecules. Programs to compute entropies from molecular ensembles, for example, from molecular dynamics (MD) trajectories, based on the kNN method, are currently available. The kNN method has distinct advantages over traditional methods, namely that it is possible to address high-dimensional spaces, impossible to treat without loss of resolution or drastic approximations with, for example, histogram-based methods. Application of the method requires understanding the features of: the <i>k</i>th nearest neighbor method for entropy estimation; the variables relevant to biomolecular and in general molecular processes; the metrics associated with such variables; the practical implementation of the method, including requirements and limitations intrinsic to the method; and the applications for conformational, position/orientation and solvation entropy. Coupling the method with general approximations for the multivariable entropy based on mutual information, it is possible to address high dimensional problems like those involving the conformation of proteins, nucleic acids, binding of molecules and hydration.</p><p>This article is categorized under:\u0000 </p>","PeriodicalId":236,"journal":{"name":"Wiley Interdisciplinary Reviews: Computational Molecular Science","volume":"14 1","pages":""},"PeriodicalIF":11.4,"publicationDate":"2023-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/wcms.1691","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135792985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}