Pub Date: 2024-10-10 DOI: 10.1038/s43588-024-00707-3
Attila Cangi
A highly efficient reconstruction method has been developed for the direct computation of Hamiltonian matrices in the atomic orbital basis from density functional theory calculations originally performed in the plane wave basis. This enables machine learning calculations of electronic structures on a large scale, which are otherwise not feasible with standard methods, and thus fills a methodological gap in terms of accessible length scales.
Title: Bridging the gap in electronic structure calculations via machine learning. Nature Computational Science, vol. 4, no. 10, pp. 729-730.
Pub Date: 2024-10-03 DOI: 10.1038/s43588-024-00701-9
Xiaoxun Gong, Steven G. Louie, Wenhui Duan, Yong Xu
Deep neural networks capable of representing the density functional theory (DFT) Hamiltonian as a function of material structure hold great promise for revolutionizing future electronic structure calculations. However, a notable limitation of previous neural networks is their compatibility solely with the atomic-orbital (AO) basis, excluding the widely used plane-wave (PW) basis. Here we overcome this critical limitation by proposing an accurate and efficient real-space reconstruction method for directly computing AO Hamiltonian matrices from PW DFT results. The reconstruction method is orders of magnitude faster than traditional projection-based methods to convert PW results to the AO basis, and the reconstructed Hamiltonian matrices can faithfully reproduce the PW electronic structure, thus bridging the longstanding gap between the AO basis deep learning electronic structure approach and PW DFT. Advantages of the PW methods, such as high accuracy, high flexibility and wide applicability, thus can be all integrated into deep learning electronic structure methods without sacrificing these methods’ inherent benefits. This allows for the construction of large-scale and high-fidelity training datasets with the help of PW DFT results towards the development of precise and broadly applicable deep learning electronic structure models. Deep learning electronic structure calculations are generalized from the atomic-orbital basis to the plane-wave basis, resulting in higher accuracy, improved transferability and the capability to utilize existing electronic structure big data.
Title: Generalizing deep learning electronic structure calculation to the plane-wave basis. Nature Computational Science, vol. 4, no. 10, pp. 752-760. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11499277/pdf/
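For orientation, the quantities the abstract refers to can be written down in standard notation. This is only a sketch of the textbook definitions, not the authors' real-space reconstruction method, and the symbols are assumptions introduced here: \phi_i are atomic orbitals, \psi_n and \varepsilon_n are Kohn-Sham eigenstates and eigenvalues from the PW calculation, and k-point indices are omitted. The AO Hamiltonian and overlap matrices, with the generalized eigenproblem they define, are:

```latex
H_{ij} = \langle \phi_i | \hat{H} | \phi_j \rangle, \qquad
S_{ij} = \langle \phi_i | \phi_j \rangle, \qquad
\sum_j H_{ij}\, c_{jn} = \varepsilon_n \sum_j S_{ij}\, c_{jn}
```

A projection-based conversion of the kind the abstract contrasts against typically uses the spectral representation of the Hamiltonian, truncated to the bands that were actually computed:

```latex
H_{ij} \approx \sum_{n} \langle \phi_i | \psi_n \rangle \, \varepsilon_n \, \langle \psi_n | \phi_j \rangle
```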
Pub Date: 2024-10-01 DOI: 10.1038/s43588-024-00710-8
Christine Yifeng Chen, Alan Christoffels, Roger Dube, Kamuela Enos, Juan E. Gilbert, Sanmi Koyejo, Jason Leigh, Carlo Liquido, Amy McKee, Kari Noe, Tai-Quan Peng, Karaitiana Taiuru
Title: Publisher Correction: Increasing the presence of BIPOC researchers in computational science. Nature Computational Science, vol. 4, no. 10, p. 798. Open-access PDF: https://www.nature.com/articles/s43588-024-00710-8.pdf
Pub Date: 2024-09-27 DOI: 10.1038/s43588-024-00697-2
Derek van Tilborg, Francesca Grisoni
Deep learning is accelerating drug discovery. However, current approaches are often affected by limitations in the available data, in terms of either size or molecular diversity. Active deep learning has high potential for low-data drug discovery, as it allows iterative model improvement during the screening process. However, there are several ‘known unknowns’ that limit the wider adoption of active deep learning in drug discovery: (1) what the best computational strategies are for chemical space exploration, (2) how active learning holds up to traditional, non-iterative, approaches and (3) how it should be used in the low-data scenarios typical of drug discovery. To provide answers, this study simulates a low-data drug discovery scenario, and systematically analyzes six active learning strategies combined with two deep learning architectures, on three large-scale molecular libraries. We identify the most important determinants of success in low-data regimes and show that active learning can achieve up to a sixfold improvement in hit discovery when compared with traditional screening methods. Active deep learning is a promising approach to learn from low-data scenarios in drug discovery. This study illuminates key success factors of active learning and shows that it can boost hit discovery by up to sixfold over traditional methods.
Title: Traversing chemical space with active deep learning for low-data drug discovery. Nature Computational Science, vol. 4, no. 10, pp. 786-796.
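The iterative screening idea in this abstract can be illustrated with a minimal pool-based active learning loop. This is a sketch under stated assumptions, not the study's pipeline: the function names (`run_active_screen`, `toy_assay`, `toy_predict`), the integer "molecules", the hit threshold and the greedy acquisition rule are all hypothetical stand-ins, and a nearest-known-hit score replaces the deep learning model.

```python
import random

def run_active_screen(library, assay, predict, n_start=10, batch_size=5, n_cycles=4, seed=0):
    """Pool-based active learning: rank the pool, label the top batch, repeat."""
    rng = random.Random(seed)
    pool = list(library)
    rng.shuffle(pool)
    # Start from a small random labelled set (the low-data regime).
    labelled = {mol: assay(mol) for mol in pool[:n_start]}
    pool = pool[n_start:]
    for _ in range(n_cycles):
        # Greedy acquisition: rank unlabelled molecules by predicted activity.
        pool.sort(key=lambda mol: predict(mol, labelled), reverse=True)
        batch, pool = pool[:batch_size], pool[batch_size:]
        # "Screen" the chosen batch and add the results to the training data.
        labelled.update({mol: assay(mol) for mol in batch})
    return labelled

# Toy stand-ins (hypothetical): molecules are integers, hits sit in one region.
def toy_assay(mol):
    return mol >= 950

def toy_predict(mol, labelled):
    hits = [m for m, is_hit in labelled.items() if is_hit]
    # Similarity to the nearest known hit; 0.0 when no hit has been found yet.
    return -min(abs(mol - h) for h in hits) if hits else 0.0

screened = run_active_screen(range(1000), toy_assay, toy_predict)
n_hits = sum(screened.values())
```

A non-iterative baseline would instead label the same number of molecules in a single random draw; comparing `n_hits` between the two is the kind of comparison the abstract describes, although nothing here reproduces its reported numbers.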
Pub Date: 2024-09-27 DOI: 10.1038/s43588-024-00698-1
Yicheng Gao, Zhiting Wei, Kejing Dong, Ke Chen, Jingya Yang, Guohui Chuai, Qi Liu
Deciphering cellular responses to genetic perturbations is fundamental for a wide array of biomedical applications. However, there are three main challenges: predicting single-genetic-perturbation outcomes, predicting multiple-genetic-perturbation outcomes and predicting genetic outcomes across cell lines. Here we introduce Subtask Decomposition Modeling for Genetic Perturbation Prediction (STAMP), a flexible artificial intelligence strategy for genetic perturbation outcome prediction and downstream applications. STAMP formulates genetic perturbation prediction as a subtask decomposition problem by resolving three progressive subtasks in a problem decomposition manner, that is, identifying postperturbation differentially expressed genes, determining the expression change directions of differentially expressed genes and finally estimating the magnitudes of gene expression changes. STAMP exhibits a substantial improvement over the existing approaches on three subtasks and beyond, including the ability to identify key regulatory genes and pathways on small samples and to reveal precise genetic interactions of diverse types. By employing the subtask decomposition strategy, STAMP outperforms existing models in single, multiple and cross-cell-line scenarios for genetic perturbation prediction, showing potential to uncover gene regulations and interactions.
Title: Toward subtask-decomposition-based learning and benchmarking for predicting genetic perturbation outcomes and beyond. Nature Computational Science, vol. 4, no. 10, pp. 773-785.
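The three progressive subtasks named in the abstract (which genes change, in which direction, and by how much) can be combined into one signed prediction per gene. The snippet below is only an illustration of that composition, not STAMP's model or code: `compose_prediction`, its inputs and the 0.5 thresholds are hypothetical.

```python
def compose_prediction(p_de, p_up, magnitude, threshold=0.5):
    """Combine three subtask outputs into a signed expression change per gene."""
    changes = []
    for p_changed, p_increase, size in zip(p_de, p_up, magnitude):
        if p_changed < threshold:
            # Subtask 1: gene is not predicted to be differentially expressed.
            changes.append(0.0)
            continue
        # Subtask 2: direction of the expression change.
        sign = 1.0 if p_increase >= 0.5 else -1.0
        # Subtask 3: magnitude of the expression change.
        changes.append(sign * abs(size))
    return changes

example = compose_prediction([0.9, 0.2, 0.8], [0.7, 0.6, 0.1], [1.5, 2.0, 0.4])
# example == [1.5, 0.0, -0.4]
```

One consequence of this ordering is that the direction and magnitude outputs only matter for genes that pass the first subtask, which is why the second gene in the example is reported as unchanged despite its large magnitude value.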
Pub Date: 2024-09-24 DOI: 10.1038/s43588-024-00693-6
Christine Yifeng Chen, Alan Christoffels, Roger Dube, Kamuela Enos, Juan E. Gilbert, Sanmi Koyejo, Jason Leigh, Carlo Liquido, Amy McKee, Kari Noe, Tai-Quan Peng, Karaitiana Taiuru
Nature Computational Science asked a group of scientists to discuss strategies for increasing the presence of Black, Indigenous, People of Color (BIPOC) researchers in computational science, as well as the various considerations to be made for improving education and methods design.
Title: Increasing the presence of BIPOC researchers in computational science. Nature Computational Science, vol. 4, no. 9, pp. 646-653. Open-access PDF: https://www.nature.com/articles/s43588-024-00693-6.pdf
Pub Date: 2024-09-24 DOI: 10.1038/s43588-024-00667-8
Laetitia Gauvin
The widespread availability of digital traces capturing individuals’ daily mobility has the potential to enrich the understanding of the relationship between mobility, gender and socioeconomic factors. In fact, it has led to a heightened interest in deriving policy insights from these data. However, it is also essential to put the focus on methodological aspects to address the data gaps and biases.
Title: Gaps in gender and socioeconomic mobility disparity studies. Nature Computational Science, vol. 4, no. 9, pp. 633-635. Open-access PDF: https://www.nature.com/articles/s43588-024-00667-8.pdf
Pub Date: 2024-09-24 DOI: 10.1038/s43588-024-00676-7
Elaine O. Nsoesie, Marzyeh Ghassemi
The proliferation of artificial intelligence (AI) algorithms for public use has led to many creative healthcare applications, some with the potential to create or worsen health inequities. Here, we argue that similar to prescription medicine labels, AI algorithms should be accompanied by a responsible use label.
Title: Using labels to limit AI misuse in health. Nature Computational Science, vol. 4, no. 9, pp. 638-640. Open-access PDF: https://www.nature.com/articles/s43588-024-00676-7.pdf
Pub Date: 2024-09-24 DOI: 10.1038/s43588-024-00692-7
Vinod Namboodiri
Navigating built environments can be a challenge for persons with disabilities. Emerging computational capabilities are promising to help by providing the right information at the right time in accessible formats.
Title: Harnessing the power of emerging computational capabilities for independent mobility for persons with disabilities. Nature Computational Science, vol. 4, no. 9, pp. 636-637. Open-access PDF: https://www.nature.com/articles/s43588-024-00692-7.pdf