Efficient Bayesian inference using physics-informed invertible neural networks for inverse problems
Pub Date: 2024-07-22 | DOI: 10.1088/2632-2153/ad5f74
Xiaofei Guan, Xintong Wang, Hao Wu, Zihao Yang and Peng Yu
This paper presents an innovative approach to tackle Bayesian inverse problems using physics-informed invertible neural networks (PI-INN). Serving as a neural operator model, PI-INN employs an invertible neural network (INN) to elucidate the relationship between the parameter field and the solution function in latent variable spaces. Specifically, the INN decomposes the latent variable of the parameter field into two distinct components: the expansion coefficients that represent the solution to the forward problem, and the noise that captures the inherent uncertainty associated with the inverse problem. Through precise estimation of the forward mapping and preservation of statistical independence between expansion coefficients and latent noise, PI-INN offers an accurate and efficient generative model for resolving Bayesian inverse problems, even in the absence of labeled data. For a given solution function, PI-INN can provide tractable and accurate estimates of the posterior distribution of the underlying parameter field. Moreover, capitalizing on the INN’s characteristics, we propose a novel independent loss function to effectively ensure the independence of the INN’s decomposition results. The efficacy and precision of the proposed PI-INN are demonstrated through a series of numerical experiments.
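As a hedged illustration of the sampling mechanism described above (not the authors' implementation), the sketch below uses a single affine coupling block as a stand-in for the INN: the forward pass splits the parameter latent z into expansion coefficients y and latent noise ξ, and approximate posterior samples are obtained by fixing y from an observed solution and inverting with freshly drawn ξ. Dimensions, architecture, and the absence of training are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One RealNVP-style coupling block; a real INN would stack several with permutations."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.d = dim // 2
        self.net = nn.Sequential(nn.Linear(self.d, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 2 * (dim - self.d)))

    def forward(self, z):                        # z -> [y, xi]
        z1, z2 = z[:, :self.d], z[:, self.d:]
        s, t = self.net(z1).chunk(2, dim=1)
        return torch.cat([z1, z2 * torch.exp(s) + t], dim=1)

    def inverse(self, u):                        # [y, xi] -> z
        u1, u2 = u[:, :self.d], u[:, self.d:]
        s, t = self.net(u1).chunk(2, dim=1)
        return torch.cat([u1, (u2 - t) * torch.exp(-s)], dim=1)

dim_z, dim_y = 16, 8                             # assumed latent and coefficient dimensions
inn = AffineCoupling(dim_z)

def sample_posterior(y_obs, n_samples=100):
    """Fix the forward-solution coefficients y, resample the independent latent noise xi."""
    y = y_obs.expand(n_samples, dim_y)
    xi = torch.randn(n_samples, dim_z - dim_y)
    return inn.inverse(torch.cat([y, xi], dim=1))

samples = sample_posterior(torch.zeros(1, dim_y))
print(samples.shape)                             # torch.Size([100, 16])
```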
{"title":"Efficient Bayesian inference using physics-informed invertible neural networks for inverse problems","authors":"Xiaofei Guan, Xintong Wang, Hao Wu, Zihao Yang and Peng Yu","doi":"10.1088/2632-2153/ad5f74","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5f74","url":null,"abstract":"This paper presents an innovative approach to tackle Bayesian inverse problems using physics-informed invertible neural networks (PI-INN). Serving as a neural operator model, PI-INN employs an invertible neural network (INN) to elucidate the relationship between the parameter field and the solution function in latent variable spaces. Specifically, the INN decomposes the latent variable of the parameter field into two distinct components: the expansion coefficients that represent the solution to the forward problem, and the noise that captures the inherent uncertainty associated with the inverse problem. Through precise estimation of the forward mapping and preservation of statistical independence between expansion coefficients and latent noise, PI-INN offers an accurate and efficient generative model for resolving Bayesian inverse problems, even in the absence of labeled data. For a given solution function, PI-INN can provide tractable and accurate estimates of the posterior distribution of the underlying parameter field. Moreover, capitalizing on the INN’s characteristics, we propose a novel independent loss function to effectively ensure the independence of the INN’s decomposition results. The efficacy and precision of the proposed PI-INN are demonstrated through a series of numerical experiments.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"214 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141753973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Datacube segmentation via deep spectral clustering
Pub Date: 2024-07-21 | DOI: 10.1088/2632-2153/ad622f
Alessandro Bombini, Fernando García-Avello Bofías, Caterina Bracci, Michele Ginolfi and Chiara Ruberto
Extended vision techniques are ubiquitous in physics. However, the data cubes stemming from such analyses often pose a challenge in their interpretation, due to the intrinsic difficulty of discerning the relevant information from the spectra composing the data cube. Furthermore, the high dimensionality of data cube spectra makes their statistical interpretation a complex task; nevertheless, this complexity contains a massive amount of statistical information that can be exploited in an unsupervised manner to outline some essential properties of the case study at hand, e.g. it is possible to obtain an image segmentation via (deep) clustering of the data cube's spectra, performed in a suitably defined low-dimensional embedding space. To tackle this topic, we explore the possibility of applying unsupervised clustering methods in the encoded space, i.e. performing deep clustering on the spectral properties of datacube pixels. A statistical dimensionality reduction is performed by an ad hoc trained (variational) autoencoder, in charge of mapping spectra into lower-dimensional metric spaces, while the clustering process is performed by a (learnable) iterative K-means clustering algorithm. We apply this technique to two use cases of different physical origins: a set of macro mapping x-ray fluorescence (MA-XRF) synthetic data on pictorial artworks, and a dataset of simulated astrophysical observations.
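A compressed sketch of the encode-then-cluster pipeline under stated assumptions (a plain autoencoder instead of the paper's variational one, scikit-learn's K-means instead of the learnable iterative variant, and random stand-in spectra): each pixel spectrum is mapped to a low-dimensional embedding, and segmentation labels come from clustering in that space.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

spectra = torch.rand(1000, 128)                  # stand-in for datacube pixel spectra

encoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 8))
decoder = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 128))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

for _ in range(200):                             # reconstruction-only training
    opt.zero_grad()
    z = encoder(spectra)
    loss = nn.functional.mse_loss(decoder(z), spectra)
    loss.backward()
    opt.step()

with torch.no_grad():
    embedding = encoder(spectra).numpy()         # low-dimensional spectral embedding
labels = KMeans(n_clusters=4, n_init=10).fit_predict(embedding)  # segmentation labels
print(np.bincount(labels))
```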
{"title":"Datacube segmentation via deep spectral clustering","authors":"Alessandro Bombini, Fernando García-Avello Bofías, Caterina Bracci, Michele Ginolfi and Chiara Ruberto","doi":"10.1088/2632-2153/ad622f","DOIUrl":"https://doi.org/10.1088/2632-2153/ad622f","url":null,"abstract":"Extended vision techniques are ubiquitous in physics. However, the data cubes steaming from such analysis often pose a challenge in their interpretation, due to the intrinsic difficulty in discerning the relevant information from the spectra composing the data cube. Furthermore, the huge dimensionality of data cube spectra poses a complex task in its statistical interpretation; nevertheless, this complexity contains a massive amount of statistical information that can be exploited in an unsupervised manner to outline some essential properties of the case study at hand, e.g. it is possible to obtain an image segmentation via (deep) clustering of data-cube’s spectra, performed in a suitably defined low-dimensional embedding space. To tackle this topic, we explore the possibility of applying unsupervised clustering methods in encoded space, i.e. perform deep clustering on the spectral properties of datacube pixels. A statistical dimensional reduction is performed by an ad hoc trained (variational) AutoEncoder, in charge of mapping spectra into lower dimensional metric spaces, while the clustering process is performed by a (learnable) iterative K-means clustering algorithm. We apply this technique to two different use cases, of different physical origins: a set of macro mapping x-ray fluorescence (MA-XRF) synthetic data on pictorial artworks, and a dataset of simulated astrophysical observations.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"32 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Causal hybrid modeling with double machine learning—applications in carbon flux modeling
Pub Date: 2024-07-18 | DOI: 10.1088/2632-2153/ad5a60
Kai-Hendrik Cohrs, Gherardo Varando, Nuno Carvalhais, Markus Reichstein and Gustau Camps-Valls
Hybrid modeling integrates machine learning with scientific knowledge to enhance interpretability, generalization, and adherence to natural laws. Nevertheless, equifinality and regularization biases pose challenges for hybrid modeling in achieving these goals. This paper introduces a novel approach to estimating hybrid models via a causal inference framework, specifically employing double machine learning (DML) to estimate causal effects. We showcase its use for the Earth sciences on two problems related to carbon dioxide fluxes. In the Q10 model, we demonstrate that DML-based hybrid modeling is superior to end-to-end deep neural network approaches in estimating causal parameters, proving efficient, robust to bias from regularization methods, and able to circumvent equifinality. Our approach, applied to carbon flux partitioning, exhibits flexibility in accommodating heterogeneous causal effects. The study emphasizes the necessity of explicitly defining causal graphs and relationships, advocating for this as a general best practice. We encourage the continued exploration of causality in hybrid models for more interpretable and trustworthy results in knowledge-guided machine learning.
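A minimal sketch of the double machine learning (partialling-out) estimator on a toy partially linear model, not the paper's Q10 or flux-partitioning setup: outcome and treatment are residualized on covariates with flexible regressors under two-fold cross-fitting, and the causal parameter is recovered by regressing residuals on residuals.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))                           # covariates (e.g. meteorological drivers)
T = np.sin(X[:, 0]) + rng.normal(scale=0.5, size=n)   # "treatment" variable
theta_true = 1.5
Y = theta_true * T + np.cos(X[:, 1]) + rng.normal(scale=0.5, size=n)

res_Y, res_T = np.zeros(n), np.zeros(n)
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    m_Y = RandomForestRegressor(n_estimators=100).fit(X[train], Y[train])
    m_T = RandomForestRegressor(n_estimators=100).fit(X[train], T[train])
    res_Y[test] = Y[test] - m_Y.predict(X[test])      # outcome residual (cross-fitted)
    res_T[test] = T[test] - m_T.predict(X[test])      # treatment residual (cross-fitted)

theta_hat = np.sum(res_T * res_Y) / np.sum(res_T ** 2)  # residual-on-residual regression
print(f"estimated causal parameter: {theta_hat:.2f} (true {theta_true})")
```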
{"title":"Causal hybrid modeling with double machine learning—applications in carbon flux modeling","authors":"Kai-Hendrik Cohrs, Gherardo Varando, Nuno Carvalhais, Markus Reichstein and Gustau Camps-Valls","doi":"10.1088/2632-2153/ad5a60","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5a60","url":null,"abstract":"Hybrid modeling integrates machine learning with scientific knowledge to enhance interpretability, generalization, and adherence to natural laws. Nevertheless, equifinality and regularization biases pose challenges in hybrid modeling to achieve these purposes. This paper introduces a novel approach to estimating hybrid models via a causal inference framework, specifically employing double machine learning (DML) to estimate causal effects. We showcase its use for the Earth sciences on two problems related to carbon dioxide fluxes. In the Q10 model, we demonstrate that DML-based hybrid modeling is superior in estimating causal parameters over end-to-end deep neural network approaches, proving efficiency, robustness to bias from regularization methods, and circumventing equifinality. Our approach, applied to carbon flux partitioning, exhibits flexibility in accommodating heterogeneous causal effects. The study emphasizes the necessity of explicitly defining causal graphs and relationships, advocating for this as a general best practice. We encourage the continued exploration of causality in hybrid models for more interpretable and trustworthy results in knowledge-guided machine learning.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"18 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Retrieving past quantum features with deep hybrid classical-quantum reservoir computing
Pub Date: 2024-07-18 | DOI: 10.1088/2632-2153/ad5f12
Johannes Nokkala, Gian Luca Giorgi and Roberta Zambrini
Machine learning techniques have achieved impressive results in recent years and the possibility of harnessing the power of quantum physics opens new promising avenues to speed up classical learning methods. Rather than viewing classical and quantum approaches as exclusive alternatives, their integration into hybrid designs has gathered increasing interest, as seen in variational quantum algorithms, quantum circuit learning, and kernel methods. Here we introduce deep hybrid classical-quantum reservoir computing for temporal processing of quantum states where information about, for instance, the entanglement or the purity of past input states can be extracted via a single-step measurement. We find that the hybrid setup cascading two reservoirs not only inherits the strengths of both of its constituents but is even more than just the sum of its parts, outperforming comparable non-hybrid alternatives. The quantum layer is within reach of state-of-the-art multimode quantum optical platforms while the classical layer can be implemented in silico.
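The classical half of such a cascade follows the standard reservoir-computing recipe, sketched below under stated assumptions: a fixed random recurrent network (an echo-state reservoir) processes an input sequence, and only a linear ridge-regression readout is trained to recover a feature of past inputs. The quantum layer and the quantum-state features (entanglement, purity) of the paper are replaced here by a scalar delayed-recall task purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, delay = 2000, 100, 3
u = rng.uniform(-1, 1, size=T)                       # input sequence
W_in = rng.uniform(-0.5, 0.5, size=N)
W = rng.normal(size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))      # scale spectral radius below 1

states = np.zeros((T, N))
x = np.zeros(N)
for t in range(T):
    x = np.tanh(W @ x + W_in * u[t])                 # fixed, untrained reservoir update
    states[t] = x

target = np.roll(u, delay)                           # recall the input from `delay` steps ago
A, b = states[delay:], target[delay:]
W_out = np.linalg.solve(A.T @ A + 1e-6 * np.eye(N), A.T @ b)   # trained ridge readout
pred = A @ W_out
print("recall MSE:", np.mean((pred - b) ** 2))
```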
{"title":"Retrieving past quantum features with deep hybrid classical-quantum reservoir computing","authors":"Johannes Nokkala, Gian Luca Giorgi and Roberta Zambrini","doi":"10.1088/2632-2153/ad5f12","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5f12","url":null,"abstract":"Machine learning techniques have achieved impressive results in recent years and the possibility of harnessing the power of quantum physics opens new promising avenues to speed up classical learning methods. Rather than viewing classical and quantum approaches as exclusive alternatives, their integration into hybrid designs has gathered increasing interest, as seen in variational quantum algorithms, quantum circuit learning, and kernel methods. Here we introduce deep hybrid classical-quantum reservoir computing for temporal processing of quantum states where information about, for instance, the entanglement or the purity of past input states can be extracted via a single-step measurement. We find that the hybrid setup cascading two reservoirs not only inherits the strengths of both of its constituents but is even more than just the sum of its parts, outperforming comparable non-hybrid alternatives. The quantum layer is within reach of state-of-the-art multimode quantum optical platforms while the classical layer can be implemented in silico.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"22 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ultrafast jet classification at the HL-LHC
Pub Date: 2024-07-17 | DOI: 10.1088/2632-2153/ad5f10
Patrick Odagiu, Zhiqiang Que, Javier Duarte, Johannes Haller, Gregor Kasieczka, Artur Lobanov, Vladimir Loncar, Wayne Luk, Jennifer Ngadiuba, Maurizio Pierini, Philipp Rincke, Arpita Seksaria, Sioni Summers, Andre Sznajder, Alexander Tapper and Thea K Årrestad
Three machine learning models are used to perform jet origin classification. These models are optimized for deployment on a field-programmable gate array (FPGA) device. In this context, we demonstrate how latency and resource consumption scale with the input size and choice of algorithm. Moreover, the models proposed here are designed to work on the type of data and under the conditions foreseen at the CERN Large Hadron Collider during its high-luminosity phase. Through quantization-aware training and efficient synthesis for a specific FPGA, we show that nanosecond-scale inference of complex architectures such as Deep Sets and Interaction Networks is feasible at a relatively low computational resource cost.
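As a rough sketch of one of the architectures named above (assumed layer sizes and inputs; this is not the paper's trained model, its quantization-aware training, or its FPGA synthesis flow), a Deep Sets classifier applies a shared per-constituent network, sums the results into a permutation-invariant jet embedding, and classifies that embedding:

```python
import torch
import torch.nn as nn

n_constituents, n_features, n_classes = 16, 3, 5     # e.g. (pT, eta, phi) per particle

phi = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 32), nn.ReLU())
rho = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, n_classes))

def deep_sets(jets):
    # jets: (batch, n_constituents, n_features)
    per_particle = phi(jets)                         # shared network on each constituent
    pooled = per_particle.sum(dim=1)                 # permutation-invariant aggregation
    return rho(pooled)                               # (batch, n_classes) jet-origin logits

logits = deep_sets(torch.randn(8, n_constituents, n_features))
print(logits.shape)                                   # torch.Size([8, 5])
```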
{"title":"Ultrafast jet classification at the HL-LHC","authors":"Patrick Odagiu, Zhiqiang Que, Javier Duarte, Johannes Haller, Gregor Kasieczka, Artur Lobanov, Vladimir Loncar, Wayne Luk, Jennifer Ngadiuba, Maurizio Pierini, Philipp Rincke, Arpita Seksaria, Sioni Summers, Andre Sznajder, Alexander Tapper and Thea K Årrestad","doi":"10.1088/2632-2153/ad5f10","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5f10","url":null,"abstract":"Three machine learning models are used to perform jet origin classification. These models are optimized for deployment on a field-programmable gate array device. In this context, we demonstrate how latency and resource consumption scale with the input size and choice of algorithm. Moreover, the models proposed here are designed to work on the type of data and under the foreseen conditions at the CERN large hadron collider during its high-luminosity phase. Through quantization-aware training and efficient synthetization for a specific field programmable gate array, we show that ns inference of complex architectures such as Deep Sets and Interaction Networks is feasible at a relatively low computational resource cost.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"50 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Quantum extreme learning of molecular potential energy surfaces and force fields
Pub Date: 2024-07-17 | DOI: 10.1088/2632-2153/ad6120
Gabriele Lo Monaco, Marco Bertini, Salvatore Lorenzo and G Massimo Palma
Quantum machine learning algorithms are expected to play a pivotal role in quantum chemistry simulations in the immediate future. One such key application is the training of a quantum neural network to learn the potential energy surface and force field of molecular systems. We address this task by using the quantum extreme learning machine paradigm. This particular supervised learning routine allows for resource-efficient training, consisting of a simple linear regression performed on a classical computer. We have tested a setup that can be used to study molecules of any dimension and is optimized for immediate use on NISQ devices with a limited number of native gates. We have applied this setup to three case studies: lithium hydride, water, and formamide, carrying out both noiseless simulations and actual implementations on IBM quantum hardware. Compared to other supervised learning routines, the proposed setup requires minimal quantum resources, making it feasible for direct implementation on quantum platforms, while still achieving a high level of predictive accuracy compared to simulations. Our encouraging results pave the way towards future application to more complex molecules, since the proposed setup is scalable.
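The readout step described above, the only trained component, is ordinary linear (ridge) regression on measured features. In the sketch below, a fixed random nonlinear projection stands in for the quantum feature map, and the geometries and energies are synthetic stand-ins rather than the molecules studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples, n_inputs, n_features = 200, 2, 50

geom = rng.uniform(0.5, 2.0, size=(n_samples, n_inputs))    # stand-in molecular coordinates
energy = np.sum(np.sin(geom), axis=1)                        # stand-in potential energy surface

W = rng.normal(size=(n_inputs, n_features))                  # fixed, untrained "reservoir" map
features = np.tanh(geom @ W)                                 # feature vector per geometry

lam = 1e-6
w = np.linalg.solve(features.T @ features + lam * np.eye(n_features),
                    features.T @ energy)                      # ridge-regression readout
pred = features @ w
print("training RMSE:", np.sqrt(np.mean((pred - energy) ** 2)))
```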
{"title":"Quantum extreme learning of molecular potential energy surfaces and force fields","authors":"Gabriele Lo Monaco, Marco Bertini, Salvatore Lorenzo and G Massimo Palma","doi":"10.1088/2632-2153/ad6120","DOIUrl":"https://doi.org/10.1088/2632-2153/ad6120","url":null,"abstract":"Quantum machine learning algorithms are expected to play a pivotal role in quantum chemistry simulations in the immediate future. One such key application is the training of a quantum neural network to learn the potential energy surface and force field of molecular systems. We address this task by using the quantum extreme learning machine paradigm. This particular supervised learning routine allows for resource-efficient training, consisting of a simple linear regression performed on a classical computer. We have tested a setup that can be used to study molecules of any dimension and is optimized for immediate use on NISQ devices with a limited number of native gates. We have applied this setup to three case studies: lithium hydride, water, and formamide, carrying out both noiseless simulations and actual implementation on IBM quantum hardware. Compared to other supervised learning routines, the proposed setup requires minimal quantum resources, making it feasible for direct implementation on quantum platforms, while still achieving a high level of predictive accuracy compared to simulations. Our encouraging results pave the way towards the future application to more complex molecules, being the proposed setup scalable.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"19 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Solving deep-learning density functional theory via variational autoencoders
Pub Date: 2024-07-17 | DOI: 10.1088/2632-2153/ad611f
Emanuele Costa, Giuseppe Scriva and Sebastiano Pilati
In recent years, machine learning models, chiefly deep neural networks, have proven well suited to learning accurate energy-density functionals from data. However, problematic instabilities have been shown to occur in the search for ground-state density profiles via energy minimization. Indeed, any small noise can lead the search astray from realistic profiles, causing the failure of the learned functional and, hence, strong violations of the variational property. In this article, we employ variational autoencoders (VAEs) to build a compressed, flexible, and regular representation of the ground-state density profiles of various quantum models. Performing energy minimization in this compressed space allows us to avoid both numerical instabilities and variational biases due to excessive constraints. Our tests are performed on one-dimensional single-particle models from the literature in the field and, notably, on a three-dimensional disordered potential. In all cases, the ground-state energies are estimated with errors below the chemical accuracy and the density profiles are accurately reproduced without numerical artifacts. Furthermore, we show that it is possible to perform transfer learning, applying pre-trained VAEs to different potentials.
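A minimal sketch of the latent-space minimization idea, under loudly stated assumptions (an untrained stand-in decoder, a toy energy functional, and illustrative dimensions): the decoder maps a low-dimensional latent vector to a density profile, and the ground state is sought by gradient descent on the energy with respect to the latent variable, which keeps the search on the decoder's manifold of regular profiles.

```python
import torch
import torch.nn as nn

latent_dim, grid = 8, 64
decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.Tanh(), nn.Linear(128, grid))

def energy_functional(density):
    # toy stand-in: a gradient (kinetic-like) term plus a harmonic external potential
    x = torch.linspace(-1.0, 1.0, grid)
    grad = density[1:] - density[:-1]
    return (grad ** 2).sum() + (0.5 * x ** 2 * density).sum()

z = torch.zeros(latent_dim, requires_grad=True)      # minimize over the latent variable only
opt = torch.optim.Adam([z], lr=1e-2)
for _ in range(300):
    opt.zero_grad()
    density = torch.softmax(decoder(z), dim=0)       # positive, normalized profile
    loss = energy_functional(density)
    loss.backward()
    opt.step()

print("minimized energy:", float(energy_functional(torch.softmax(decoder(z), dim=0))))
```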
{"title":"Solving deep-learning density functional theory via variational autoencoders","authors":"Emanuele Costa, Giuseppe Scriva and Sebastiano Pilati","doi":"10.1088/2632-2153/ad611f","DOIUrl":"https://doi.org/10.1088/2632-2153/ad611f","url":null,"abstract":"In recent years, machine learning models, chiefly deep neural networks, have revealed suited to learn accurate energy-density functionals from data. However, problematic instabilities have been shown to occur in the search of ground-state density profiles via energy minimization. Indeed, any small noise can lead astray from realistic profiles, causing the failure of the learned functional and, hence, strong violations of the variational property. In this article, we employ variational autoencoders (VAEs) to build a compressed, flexible, and regular representation of the ground-state density profiles of various quantum models. Performing energy minimization in this compressed space allows us to avoid both numerical instabilities and variational biases due to excessive constraints. Our tests are performed on one-dimensional single-particle models from the literature in the field and, notably, on a three-dimensional disordered potential. In all cases, the ground-state energies are estimated with errors below the chemical accuracy and the density profiles are accurately reproduced without numerical artifacts. Furthermore, we show that it is possible to perform transfer learning, applying pre-trained VAEs to different potentials.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"286 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Self-supervised representations and node embedding graph neural networks for accurate and multi-scale analysis of materials
Pub Date: 2024-07-17 | DOI: 10.1088/2632-2153/ad612b
Jian-Gang Kong, Ke-Lin Zhao, Jian Li, Qing-Xu Li, Yu Liu, Rui Zhang, Jia-Ji Zhu and Kai Chang
Supervised machine learning algorithms, such as graph neural networks (GNNs), have successfully predicted material properties. However, the superior performance of GNNs usually relies on end-to-end learning on large material datasets, which may lose the physical insight offered by multi-scale information about materials. Moreover, the process of labeling data consumes many resources and inevitably introduces errors, which constrains the accuracy of prediction. We propose to train the GNN model by self-supervised learning on the node and edge information of the crystal graph. Compared with popular manually constructed material descriptors, the self-supervised atomic representation achieves better prediction performance on material properties. Furthermore, it may provide physical insights by tuning the range information. Applying the self-supervised atomic representation to magnetic moment datasets, we show how it can extract rules and information from magnetic materials. To incorporate rich physical information into the GNN model, we develop the node embedding graph neural network (NEGNN) framework and show significant improvements in prediction performance. The self-supervised material representation and the NEGNN framework can extract in-depth information from materials and can be applied to small datasets with increased prediction accuracy.
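A minimal sketch of the self-supervised pretraining idea using plain tensors rather than a graph library (graph, feature sizes, and masking scheme are illustrative assumptions, not the paper's NEGNN): a one-layer message-passing network is trained to reconstruct masked node attributes from neighboring nodes, and the resulting atomic embeddings can then be reused for property prediction.

```python
import torch
import torch.nn as nn

n_nodes, n_feat, hidden = 6, 4, 16
adj = (torch.rand(n_nodes, n_nodes) < 0.4).float()
adj = ((adj + adj.T) > 0).float().fill_diagonal_(0)   # symmetric adjacency, no self-loops
x = torch.randn(n_nodes, n_feat)                      # node (atom) attributes

msg = nn.Linear(n_feat, hidden)
dec = nn.Linear(hidden, n_feat)
opt = torch.optim.Adam(list(msg.parameters()) + list(dec.parameters()), lr=1e-2)
deg = adj.sum(1, keepdim=True).clamp(min=1)           # node degrees for mean aggregation

for _ in range(200):
    idx = torch.randperm(n_nodes)[:2]                 # mask two random nodes per step
    x_in = x.clone()
    x_in[idx] = 0.0
    h = torch.relu((adj @ msg(x_in)) / deg)           # mean message passing from neighbors
    loss = ((dec(h)[idx] - x[idx]) ** 2).mean()       # reconstruct only the masked attributes
    opt.zero_grad()
    loss.backward()
    opt.step()

node_embedding = torch.relu((adj @ msg(x)) / deg)     # reusable self-supervised representation
print(node_embedding.shape)                            # torch.Size([6, 16])
```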
{"title":"Self-supervised representations and node embedding graph neural networks for accurate and multi-scale analysis of materials","authors":"Jian-Gang Kong, Ke-Lin Zhao, Jian Li, Qing-Xu Li, Yu Liu, Rui Zhang, Jia-Ji Zhu and Kai Chang","doi":"10.1088/2632-2153/ad612b","DOIUrl":"https://doi.org/10.1088/2632-2153/ad612b","url":null,"abstract":"Supervised machine learning algorithms, such as graph neural networks (GNN), have successfully predicted material properties. However, the superior performance of GNN usually relies on end-to-end learning on large material datasets, which may lose the physical insight of multi-scale information about materials. And the process of labeling data consumes many resources and inevitably introduces errors, which constrains the accuracy of prediction. We propose to train the GNN model by self-supervised learning on the node and edge information of the crystal graph. Compared with the popular manually constructed material descriptors, the self-supervised atomic representation can reach better prediction performance on material properties. Furthermore, it may provide physical insights by tuning the range information. Applying the self-supervised atomic representation on the magnetic moment datasets, we show how they can extract rules and information from the magnetic materials. To incorporate rich physical information into the GNN model, we develop the node embedding graph neural networks (NEGNN) framework and show significant improvements in the prediction performance. The self-supervised material representation and the NEGNN framework may investigate in-depth information from materials and can be applied to small datasets with increased prediction accuracy.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"64 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Benchmarking of machine learning interatomic potentials for reactive hydrogen dynamics at metal surfaces
Pub Date: 2024-07-14 | DOI: 10.1088/2632-2153/ad5f11
Wojciech G Stark, Cas van der Oord, Ilyes Batatia, Yaolong Zhang, Bin Jiang, Gábor Csányi and Reinhard J Maurer
Simulations of chemical reaction probabilities in gas surface dynamics require the calculation of ensemble averages over many tens of thousands of reaction events to predict dynamical observables that can be compared to experiments. At the same time, the energy landscapes need to be accurately mapped, as small errors in barriers can lead to large deviations in reaction probabilities. This brings a particularly interesting challenge for machine learning interatomic potentials, which are becoming well-established tools to accelerate molecular dynamics simulations. We compare state-of-the-art machine learning interatomic potentials with a particular focus on their inference performance on CPUs and suitability for high throughput simulation of reactive chemistry at surfaces. The considered models include polarizable atom interaction neural networks (PaiNN), recursively embedded atom neural networks (REANN), the MACE equivariant graph neural network, and atomic cluster expansion potentials (ACE). The models are applied to a dataset on reactive molecular hydrogen scattering on low-index surface facets of copper. All models are assessed for their accuracy, time-to-solution, and ability to simulate reactive sticking probabilities as a function of the rovibrational initial state and kinetic incidence energy of the molecule. REANN and MACE models provide the best balance between accuracy and time-to-solution and can be considered the current state-of-the-art in gas-surface dynamics. PaiNN models require many features for the best accuracy, which causes significant losses in computational efficiency. ACE models provide the fastest time-to-solution, however, models trained on the existing dataset were not able to achieve sufficiently accurate predictions in all cases.
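The ensemble-averaging requirement mentioned above can be made concrete with a small counting argument: a sticking probability is the fraction of trajectories that react, so its binomial standard error only becomes small relative to a few-percent probability once tens of thousands of events are simulated. The sketch below uses random stand-in outcomes, not molecular dynamics results.

```python
import numpy as np

rng = np.random.default_rng(4)
true_sticking = 0.05                                  # assumed small reaction probability

for n_traj in (1_000, 10_000, 100_000):
    outcomes = rng.random(n_traj) < true_sticking     # 1 = dissociative sticking, 0 = scattering
    p = outcomes.mean()
    stderr = np.sqrt(p * (1 - p) / n_traj)            # binomial standard error of the estimate
    print(f"{n_traj:>7d} trajectories: P_stick = {p:.4f} +/- {stderr:.4f}")
```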
{"title":"Benchmarking of machine learning interatomic potentials for reactive hydrogen dynamics at metal surfaces","authors":"Wojciech G Stark, Cas van der Oord, Ilyes Batatia, Yaolong Zhang, Bin Jiang, Gábor Csányi and Reinhard J Maurer","doi":"10.1088/2632-2153/ad5f11","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5f11","url":null,"abstract":"Simulations of chemical reaction probabilities in gas surface dynamics require the calculation of ensemble averages over many tens of thousands of reaction events to predict dynamical observables that can be compared to experiments. At the same time, the energy landscapes need to be accurately mapped, as small errors in barriers can lead to large deviations in reaction probabilities. This brings a particularly interesting challenge for machine learning interatomic potentials, which are becoming well-established tools to accelerate molecular dynamics simulations. We compare state-of-the-art machine learning interatomic potentials with a particular focus on their inference performance on CPUs and suitability for high throughput simulation of reactive chemistry at surfaces. The considered models include polarizable atom interaction neural networks (PaiNN), recursively embedded atom neural networks (REANN), the MACE equivariant graph neural network, and atomic cluster expansion potentials (ACE). The models are applied to a dataset on reactive molecular hydrogen scattering on low-index surface facets of copper. All models are assessed for their accuracy, time-to-solution, and ability to simulate reactive sticking probabilities as a function of the rovibrational initial state and kinetic incidence energy of the molecule. REANN and MACE models provide the best balance between accuracy and time-to-solution and can be considered the current state-of-the-art in gas-surface dynamics. PaiNN models require many features for the best accuracy, which causes significant losses in computational efficiency. ACE models provide the fastest time-to-solution, however, models trained on the existing dataset were not able to achieve sufficiently accurate predictions in all cases.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"19 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141720016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep probabilistic direction prediction in 3D with applications to directional dark matter detectors
Pub Date: 2024-07-10 | DOI: 10.1088/2632-2153/ad5f13
Majd Ghrear, Peter Sadowski and Sven E Vahsen
We present the first method to probabilistically predict 3D direction in a deep neural network model. The probabilistic predictions are modeled as a heteroscedastic von Mises-Fisher distribution on the sphere, giving a simple way to quantify aleatoric uncertainty. This approach generalizes the cosine distance loss, which is a special case of our loss function when the uncertainty is assumed to be uniform across samples. We develop approximations required to make the likelihood function and gradient calculations stable. The method is applied to the task of predicting the 3D directions of electrons, the most complex signal in a class of experimental particle physics detectors designed to demonstrate the particle nature of dark matter and study solar neutrinos. Using simulated Monte Carlo data, the initial direction of recoiling electrons is inferred from their tortuous trajectories, as captured by the 3D detectors. For keV electrons in a 70% He / 30% CO2 gas mixture at STP, the new approach achieves a mean cosine distance of 0.104 (26°) compared to 0.556 (64°) achieved by a non-machine-learning algorithm. We show that the model is well-calibrated and accuracy can be increased further by removing samples with high predicted uncertainty. This advancement in probabilistic 3D directional learning could increase the sensitivity of directional dark matter detectors.
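A sketch of the heteroscedastic von Mises-Fisher negative log-likelihood for 3D directions, assuming (as is common for such losses, though not stated here) that the network's unnormalized output vector encodes the mean direction through its orientation and the concentration κ through its norm; the closed-form p = 3 normalizer κ/(4π sinh κ) is evaluated in a numerically stable way. The network and data below are stand-ins; only the loss follows the vMF form.

```python
import math
import torch

def vmf3_nll(pred, target_dir):
    # pred: (batch, 3) unnormalized prediction; target_dir: (batch, 3) unit vectors
    kappa = pred.norm(dim=1).clamp(min=1e-6)          # per-sample concentration (heteroscedastic)
    mu = pred / kappa.unsqueeze(1)                    # unit mean direction
    # log C_3(kappa) = log kappa - log(4*pi) - log(sinh kappa), with
    # log(sinh k) = k + log(1 - exp(-2k)) - log 2 for stability at large kappa
    log_norm = torch.log(kappa) - math.log(4 * math.pi) \
               - (kappa + torch.log1p(-torch.exp(-2 * kappa)) - math.log(2.0))
    return -(log_norm + kappa * (mu * target_dir).sum(dim=1)).mean()

net = torch.nn.Sequential(torch.nn.Linear(10, 64), torch.nn.ReLU(), torch.nn.Linear(64, 3))
features = torch.randn(32, 10)                        # stand-in detector features
target = torch.nn.functional.normalize(torch.randn(32, 3), dim=1)
loss = vmf3_nll(net(features), target)
loss.backward()
print(float(loss))
```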
{"title":"Deep probabilistic direction prediction in 3D with applications to directional dark matter detectors","authors":"Majd Ghrear, Peter Sadowski and Sven E Vahsen","doi":"10.1088/2632-2153/ad5f13","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5f13","url":null,"abstract":"We present the first method to probabilistically predict 3D direction in a deep neural network model. The probabilistic predictions are modeled as a heteroscedastic von Mises-Fisher distribution on the sphere , giving a simple way to quantify aleatoric uncertainty. This approach generalizes the cosine distance loss which is a special case of our loss function when the uncertainty is assumed to be uniform across samples. We develop approximations required to make the likelihood function and gradient calculations stable. The method is applied to the task of predicting the 3D directions of electrons, the most complex signal in a class of experimental particle physics detectors designed to demonstrate the particle nature of dark matter and study solar neutrinos. Using simulated Monte Carlo data, the initial direction of recoiling electrons is inferred from their tortuous trajectories, as captured by the 3D detectors. For keV electrons in a 70% He 30% CO2 gas mixture at STP, the new approach achieves a mean cosine distance of 0.104 (26∘) compared to 0.556 (64∘) achieved by a non-machine learning algorithm. We show that the model is well-calibrated and accuracy can be increased further by removing samples with high predicted uncertainty. This advancement in probabilistic 3D directional learning could increase the sensitivity of directional dark matter detectors.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"24 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141586996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}