Pub Date : 2024-08-04DOI: 10.1088/2632-2153/ad64a6
Ryan Humble, Zhe Zhang, Finn O’Shea, Eric Darve and Daniel Ratner
Anomaly detection is an important task for complex scientific experiments and other complex systems (e.g. industrial facilities, manufacturing), where failures in a sub-system can lead to lost data, poor performance, or even damage to components. While scientific facilities generate a wealth of data, labeled anomalies may be rare (or even nonexistent), and expensive to acquire. Unsupervised approaches are therefore common and typically search for anomalies either by distance or density of examples in the input feature space (or some associated low-dimensional representation). This paper presents a novel approach called coincident learning for anomaly detection (CoAD), which is specifically designed for multi-modal tasks and identifies anomalies based on coincident behavior across two different slices of the feature space. We define an unsupervised metric, , out of analogy to the supervised classification Fβ statistic. CoAD uses to train an anomaly detection algorithm on unlabeled data, based on the expectation that anomalous behavior in one feature slice is coincident with anomalous behavior in the other. The method is illustrated using a synthetic outlier data set and a MNIST-based image data set, and is compared to prior state-of-the-art on two real-world tasks: a metal milling data set and our motivating task of identifying RF station anomalies in a particle accelerator.
{"title":"Coincident learning for unsupervised anomaly detection of scientific instruments","authors":"Ryan Humble, Zhe Zhang, Finn O’Shea, Eric Darve and Daniel Ratner","doi":"10.1088/2632-2153/ad64a6","DOIUrl":"https://doi.org/10.1088/2632-2153/ad64a6","url":null,"abstract":"Anomaly detection is an important task for complex scientific experiments and other complex systems (e.g. industrial facilities, manufacturing), where failures in a sub-system can lead to lost data, poor performance, or even damage to components. While scientific facilities generate a wealth of data, labeled anomalies may be rare (or even nonexistent), and expensive to acquire. Unsupervised approaches are therefore common and typically search for anomalies either by distance or density of examples in the input feature space (or some associated low-dimensional representation). This paper presents a novel approach called coincident learning for anomaly detection (CoAD), which is specifically designed for multi-modal tasks and identifies anomalies based on coincident behavior across two different slices of the feature space. We define an unsupervised metric, , out of analogy to the supervised classification Fβ statistic. CoAD uses to train an anomaly detection algorithm on unlabeled data, based on the expectation that anomalous behavior in one feature slice is coincident with anomalous behavior in the other. The method is illustrated using a synthetic outlier data set and a MNIST-based image data set, and is compared to prior state-of-the-art on two real-world tasks: a metal milling data set and our motivating task of identifying RF station anomalies in a particle accelerator.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"76 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141931255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-01DOI: 10.1088/2632-2153/ad66ad
Joschka Birk, Anna Hallin and Gregor Kasieczka
Foundation models are multi-dataset and multi-task machine learning methods that once pre-trained can be fine-tuned for a large variety of downstream applications. The successful development of such general-purpose models for physics data would be a major breakthrough as they could improve the achievable physics performance while at the same time drastically reduce the required amount of training time and data. We report significant progress on this challenge on several fronts. First, a comprehensive set of evaluation methods is introduced to judge the quality of an encoding from physics data into a representation suitable for the autoregressive generation of particle jets with transformer architectures (the common backbone of foundation models). These measures motivate the choice of a higher-fidelity tokenization compared to previous works. Finally, we demonstrate transfer learning between an unsupervised problem (jet generation) and a classic supervised task (jet tagging) with our new OmniJet-α model. This is the first successful transfer between two different and actively studied classes of tasks and constitutes a major step in the building of foundation models for particle physics.
{"title":"OmniJet-α: the first cross-task foundation model for particle physics","authors":"Joschka Birk, Anna Hallin and Gregor Kasieczka","doi":"10.1088/2632-2153/ad66ad","DOIUrl":"https://doi.org/10.1088/2632-2153/ad66ad","url":null,"abstract":"Foundation models are multi-dataset and multi-task machine learning methods that once pre-trained can be fine-tuned for a large variety of downstream applications. The successful development of such general-purpose models for physics data would be a major breakthrough as they could improve the achievable physics performance while at the same time drastically reduce the required amount of training time and data. We report significant progress on this challenge on several fronts. First, a comprehensive set of evaluation methods is introduced to judge the quality of an encoding from physics data into a representation suitable for the autoregressive generation of particle jets with transformer architectures (the common backbone of foundation models). These measures motivate the choice of a higher-fidelity tokenization compared to previous works. Finally, we demonstrate transfer learning between an unsupervised problem (jet generation) and a classic supervised task (jet tagging) with our new OmniJet-α model. This is the first successful transfer between two different and actively studied classes of tasks and constitutes a major step in the building of foundation models for particle physics.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"81 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141885723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-22DOI: 10.1088/2632-2153/ad5f74
Xiaofei Guan, Xintong Wang, Hao Wu, Zihao Yang and Peng Yu
This paper presents an innovative approach to tackle Bayesian inverse problems using physics-informed invertible neural networks (PI-INN). Serving as a neural operator model, PI-INN employs an invertible neural network (INN) to elucidate the relationship between the parameter field and the solution function in latent variable spaces. Specifically, the INN decomposes the latent variable of the parameter field into two distinct components: the expansion coefficients that represent the solution to the forward problem, and the noise that captures the inherent uncertainty associated with the inverse problem. Through precise estimation of the forward mapping and preservation of statistical independence between expansion coefficients and latent noise, PI-INN offers an accurate and efficient generative model for resolving Bayesian inverse problems, even in the absence of labeled data. For a given solution function, PI-INN can provide tractable and accurate estimates of the posterior distribution of the underlying parameter field. Moreover, capitalizing on the INN’s characteristics, we propose a novel independent loss function to effectively ensure the independence of the INN’s decomposition results. The efficacy and precision of the proposed PI-INN are demonstrated through a series of numerical experiments.
本文提出了一种利用物理信息可逆神经网络(PI-INN)解决贝叶斯逆问题的创新方法。作为一种神经算子模型,PI-INN 利用可逆神经网络(INN)来阐明潜变量空间中参数场与解函数之间的关系。具体来说,INN 将参数场的潜变量分解为两个不同的部分:代表正向问题解决方案的扩展系数,以及捕捉与逆向问题相关的固有不确定性的噪声。通过精确估计前向映射以及保持扩展系数和潜在噪声之间的统计独立性,PI-INN 为解决贝叶斯逆问题提供了一个精确高效的生成模型,即使在没有标记数据的情况下也是如此。对于给定的求解函数,PI-INN 可以对底层参数场的后验分布提供简便而准确的估计。此外,利用 INN 的特点,我们提出了一种新的独立损失函数,以有效确保 INN 分解结果的独立性。我们通过一系列数值实验证明了所提出的 PI-INN 的有效性和精确性。
{"title":"Efficient Bayesian inference using physics-informed invertible neural networks for inverse problems","authors":"Xiaofei Guan, Xintong Wang, Hao Wu, Zihao Yang and Peng Yu","doi":"10.1088/2632-2153/ad5f74","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5f74","url":null,"abstract":"This paper presents an innovative approach to tackle Bayesian inverse problems using physics-informed invertible neural networks (PI-INN). Serving as a neural operator model, PI-INN employs an invertible neural network (INN) to elucidate the relationship between the parameter field and the solution function in latent variable spaces. Specifically, the INN decomposes the latent variable of the parameter field into two distinct components: the expansion coefficients that represent the solution to the forward problem, and the noise that captures the inherent uncertainty associated with the inverse problem. Through precise estimation of the forward mapping and preservation of statistical independence between expansion coefficients and latent noise, PI-INN offers an accurate and efficient generative model for resolving Bayesian inverse problems, even in the absence of labeled data. For a given solution function, PI-INN can provide tractable and accurate estimates of the posterior distribution of the underlying parameter field. Moreover, capitalizing on the INN’s characteristics, we propose a novel independent loss function to effectively ensure the independence of the INN’s decomposition results. The efficacy and precision of the proposed PI-INN are demonstrated through a series of numerical experiments.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"214 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141753973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-21DOI: 10.1088/2632-2153/ad622f
Alessandro Bombini, Fernando García-Avello Bofías, Caterina Bracci, Michele Ginolfi and Chiara Ruberto
Extended vision techniques are ubiquitous in physics. However, the data cubes steaming from such analysis often pose a challenge in their interpretation, due to the intrinsic difficulty in discerning the relevant information from the spectra composing the data cube. Furthermore, the huge dimensionality of data cube spectra poses a complex task in its statistical interpretation; nevertheless, this complexity contains a massive amount of statistical information that can be exploited in an unsupervised manner to outline some essential properties of the case study at hand, e.g. it is possible to obtain an image segmentation via (deep) clustering of data-cube’s spectra, performed in a suitably defined low-dimensional embedding space. To tackle this topic, we explore the possibility of applying unsupervised clustering methods in encoded space, i.e. perform deep clustering on the spectral properties of datacube pixels. A statistical dimensional reduction is performed by an ad hoc trained (variational) AutoEncoder, in charge of mapping spectra into lower dimensional metric spaces, while the clustering process is performed by a (learnable) iterative K-means clustering algorithm. We apply this technique to two different use cases, of different physical origins: a set of macro mapping x-ray fluorescence (MA-XRF) synthetic data on pictorial artworks, and a dataset of simulated astrophysical observations.
扩展视觉技术在物理学中无处不在。然而,由于从组成数据立方体的光谱中辨别相关信息的内在困难,从此类分析中产生的数据立方体往往对其解释构成挑战。此外,数据立方体光谱的巨大维度也给统计解释带来了复杂的任务;然而,这种复杂性包含了大量的统计信息,可以在无监督的情况下利用这些信息来概述手头案例研究的一些基本属性,例如,可以通过在适当定义的低维嵌入空间中对数据立方体光谱进行(深度)聚类来获得图像分割。为了解决这个问题,我们探索了在编码空间中应用无监督聚类方法的可能性,即对数据立方体像素的光谱属性进行深度聚类。统计降维是通过一个经过特别训练的(变异)自动编码器来完成的,它负责将光谱映射到低维的度量空间中,而聚类过程则是通过一个(可学习的)迭代 K-means 聚类算法来完成的。我们将这一技术应用于两个不同的使用案例,它们的物理来源各不相同:一组关于绘画艺术品的宏观映射 X 射线荧光(MA-XRF)合成数据,以及一个模拟天体物理观测数据集。
{"title":"Datacube segmentation via deep spectral clustering","authors":"Alessandro Bombini, Fernando García-Avello Bofías, Caterina Bracci, Michele Ginolfi and Chiara Ruberto","doi":"10.1088/2632-2153/ad622f","DOIUrl":"https://doi.org/10.1088/2632-2153/ad622f","url":null,"abstract":"Extended vision techniques are ubiquitous in physics. However, the data cubes steaming from such analysis often pose a challenge in their interpretation, due to the intrinsic difficulty in discerning the relevant information from the spectra composing the data cube. Furthermore, the huge dimensionality of data cube spectra poses a complex task in its statistical interpretation; nevertheless, this complexity contains a massive amount of statistical information that can be exploited in an unsupervised manner to outline some essential properties of the case study at hand, e.g. it is possible to obtain an image segmentation via (deep) clustering of data-cube’s spectra, performed in a suitably defined low-dimensional embedding space. To tackle this topic, we explore the possibility of applying unsupervised clustering methods in encoded space, i.e. perform deep clustering on the spectral properties of datacube pixels. A statistical dimensional reduction is performed by an ad hoc trained (variational) AutoEncoder, in charge of mapping spectra into lower dimensional metric spaces, while the clustering process is performed by a (learnable) iterative K-means clustering algorithm. We apply this technique to two different use cases, of different physical origins: a set of macro mapping x-ray fluorescence (MA-XRF) synthetic data on pictorial artworks, and a dataset of simulated astrophysical observations.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"32 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-18DOI: 10.1088/2632-2153/ad5a60
Kai-Hendrik Cohrs, Gherardo Varando, Nuno Carvalhais, Markus Reichstein and Gustau Camps-Valls
Hybrid modeling integrates machine learning with scientific knowledge to enhance interpretability, generalization, and adherence to natural laws. Nevertheless, equifinality and regularization biases pose challenges in hybrid modeling to achieve these purposes. This paper introduces a novel approach to estimating hybrid models via a causal inference framework, specifically employing double machine learning (DML) to estimate causal effects. We showcase its use for the Earth sciences on two problems related to carbon dioxide fluxes. In the Q10 model, we demonstrate that DML-based hybrid modeling is superior in estimating causal parameters over end-to-end deep neural network approaches, proving efficiency, robustness to bias from regularization methods, and circumventing equifinality. Our approach, applied to carbon flux partitioning, exhibits flexibility in accommodating heterogeneous causal effects. The study emphasizes the necessity of explicitly defining causal graphs and relationships, advocating for this as a general best practice. We encourage the continued exploration of causality in hybrid models for more interpretable and trustworthy results in knowledge-guided machine learning.
{"title":"Causal hybrid modeling with double machine learning—applications in carbon flux modeling","authors":"Kai-Hendrik Cohrs, Gherardo Varando, Nuno Carvalhais, Markus Reichstein and Gustau Camps-Valls","doi":"10.1088/2632-2153/ad5a60","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5a60","url":null,"abstract":"Hybrid modeling integrates machine learning with scientific knowledge to enhance interpretability, generalization, and adherence to natural laws. Nevertheless, equifinality and regularization biases pose challenges in hybrid modeling to achieve these purposes. This paper introduces a novel approach to estimating hybrid models via a causal inference framework, specifically employing double machine learning (DML) to estimate causal effects. We showcase its use for the Earth sciences on two problems related to carbon dioxide fluxes. In the Q10 model, we demonstrate that DML-based hybrid modeling is superior in estimating causal parameters over end-to-end deep neural network approaches, proving efficiency, robustness to bias from regularization methods, and circumventing equifinality. Our approach, applied to carbon flux partitioning, exhibits flexibility in accommodating heterogeneous causal effects. The study emphasizes the necessity of explicitly defining causal graphs and relationships, advocating for this as a general best practice. We encourage the continued exploration of causality in hybrid models for more interpretable and trustworthy results in knowledge-guided machine learning.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"18 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-18DOI: 10.1088/2632-2153/ad5f12
Johannes Nokkala, Gian Luca Giorgi and Roberta Zambrini
Machine learning techniques have achieved impressive results in recent years and the possibility of harnessing the power of quantum physics opens new promising avenues to speed up classical learning methods. Rather than viewing classical and quantum approaches as exclusive alternatives, their integration into hybrid designs has gathered increasing interest, as seen in variational quantum algorithms, quantum circuit learning, and kernel methods. Here we introduce deep hybrid classical-quantum reservoir computing for temporal processing of quantum states where information about, for instance, the entanglement or the purity of past input states can be extracted via a single-step measurement. We find that the hybrid setup cascading two reservoirs not only inherits the strengths of both of its constituents but is even more than just the sum of its parts, outperforming comparable non-hybrid alternatives. The quantum layer is within reach of state-of-the-art multimode quantum optical platforms while the classical layer can be implemented in silico.
{"title":"Retrieving past quantum features with deep hybrid classical-quantum reservoir computing","authors":"Johannes Nokkala, Gian Luca Giorgi and Roberta Zambrini","doi":"10.1088/2632-2153/ad5f12","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5f12","url":null,"abstract":"Machine learning techniques have achieved impressive results in recent years and the possibility of harnessing the power of quantum physics opens new promising avenues to speed up classical learning methods. Rather than viewing classical and quantum approaches as exclusive alternatives, their integration into hybrid designs has gathered increasing interest, as seen in variational quantum algorithms, quantum circuit learning, and kernel methods. Here we introduce deep hybrid classical-quantum reservoir computing for temporal processing of quantum states where information about, for instance, the entanglement or the purity of past input states can be extracted via a single-step measurement. We find that the hybrid setup cascading two reservoirs not only inherits the strengths of both of its constituents but is even more than just the sum of its parts, outperforming comparable non-hybrid alternatives. The quantum layer is within reach of state-of-the-art multimode quantum optical platforms while the classical layer can be implemented in silico.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"22 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-17DOI: 10.1088/2632-2153/ad5f10
Patrick Odagiu, Zhiqiang Que, Javier Duarte, Johannes Haller, Gregor Kasieczka, Artur Lobanov, Vladimir Loncar, Wayne Luk, Jennifer Ngadiuba, Maurizio Pierini, Philipp Rincke, Arpita Seksaria, Sioni Summers, Andre Sznajder, Alexander Tapper and Thea K Årrestad
Three machine learning models are used to perform jet origin classification. These models are optimized for deployment on a field-programmable gate array device. In this context, we demonstrate how latency and resource consumption scale with the input size and choice of algorithm. Moreover, the models proposed here are designed to work on the type of data and under the foreseen conditions at the CERN large hadron collider during its high-luminosity phase. Through quantization-aware training and efficient synthetization for a specific field programmable gate array, we show that ns inference of complex architectures such as Deep Sets and Interaction Networks is feasible at a relatively low computational resource cost.
{"title":"Ultrafast jet classification at the HL-LHC","authors":"Patrick Odagiu, Zhiqiang Que, Javier Duarte, Johannes Haller, Gregor Kasieczka, Artur Lobanov, Vladimir Loncar, Wayne Luk, Jennifer Ngadiuba, Maurizio Pierini, Philipp Rincke, Arpita Seksaria, Sioni Summers, Andre Sznajder, Alexander Tapper and Thea K Årrestad","doi":"10.1088/2632-2153/ad5f10","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5f10","url":null,"abstract":"Three machine learning models are used to perform jet origin classification. These models are optimized for deployment on a field-programmable gate array device. In this context, we demonstrate how latency and resource consumption scale with the input size and choice of algorithm. Moreover, the models proposed here are designed to work on the type of data and under the foreseen conditions at the CERN large hadron collider during its high-luminosity phase. Through quantization-aware training and efficient synthetization for a specific field programmable gate array, we show that ns inference of complex architectures such as Deep Sets and Interaction Networks is feasible at a relatively low computational resource cost.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"50 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-17DOI: 10.1088/2632-2153/ad6120
Gabriele Lo Monaco, Marco Bertini, Salvatore Lorenzo and G Massimo Palma
Quantum machine learning algorithms are expected to play a pivotal role in quantum chemistry simulations in the immediate future. One such key application is the training of a quantum neural network to learn the potential energy surface and force field of molecular systems. We address this task by using the quantum extreme learning machine paradigm. This particular supervised learning routine allows for resource-efficient training, consisting of a simple linear regression performed on a classical computer. We have tested a setup that can be used to study molecules of any dimension and is optimized for immediate use on NISQ devices with a limited number of native gates. We have applied this setup to three case studies: lithium hydride, water, and formamide, carrying out both noiseless simulations and actual implementation on IBM quantum hardware. Compared to other supervised learning routines, the proposed setup requires minimal quantum resources, making it feasible for direct implementation on quantum platforms, while still achieving a high level of predictive accuracy compared to simulations. Our encouraging results pave the way towards the future application to more complex molecules, being the proposed setup scalable.
在不久的将来,量子机器学习算法有望在量子化学模拟中发挥关键作用。其中一个关键应用是训练量子神经网络,以学习分子系统的势能面和力场。我们利用量子极端学习机范式来完成这项任务。这种特殊的监督学习程序允许进行资源节约型训练,包括在经典计算机上执行简单的线性回归。我们测试了一种可用于研究任何维度分子的设置,并对其进行了优化,以便在原生门数量有限的 NISQ 设备上立即使用。我们将这种设置应用于三个案例研究:氢化锂、水和甲酰胺,在 IBM 量子硬件上进行了无噪声模拟和实际实施。与其他监督学习程序相比,所提出的设置只需要极少的量子资源,因此可以在量子平台上直接实施,同时与模拟相比仍能达到很高的预测精度。我们取得的令人鼓舞的成果为未来应用于更复杂的分子铺平了道路,使我们提出的设置具有可扩展性。
{"title":"Quantum extreme learning of molecular potential energy surfaces and force fields","authors":"Gabriele Lo Monaco, Marco Bertini, Salvatore Lorenzo and G Massimo Palma","doi":"10.1088/2632-2153/ad6120","DOIUrl":"https://doi.org/10.1088/2632-2153/ad6120","url":null,"abstract":"Quantum machine learning algorithms are expected to play a pivotal role in quantum chemistry simulations in the immediate future. One such key application is the training of a quantum neural network to learn the potential energy surface and force field of molecular systems. We address this task by using the quantum extreme learning machine paradigm. This particular supervised learning routine allows for resource-efficient training, consisting of a simple linear regression performed on a classical computer. We have tested a setup that can be used to study molecules of any dimension and is optimized for immediate use on NISQ devices with a limited number of native gates. We have applied this setup to three case studies: lithium hydride, water, and formamide, carrying out both noiseless simulations and actual implementation on IBM quantum hardware. Compared to other supervised learning routines, the proposed setup requires minimal quantum resources, making it feasible for direct implementation on quantum platforms, while still achieving a high level of predictive accuracy compared to simulations. Our encouraging results pave the way towards the future application to more complex molecules, being the proposed setup scalable.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"19 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-17DOI: 10.1088/2632-2153/ad611f
Emanuele Costa, Giuseppe Scriva and Sebastiano Pilati
In recent years, machine learning models, chiefly deep neural networks, have revealed suited to learn accurate energy-density functionals from data. However, problematic instabilities have been shown to occur in the search of ground-state density profiles via energy minimization. Indeed, any small noise can lead astray from realistic profiles, causing the failure of the learned functional and, hence, strong violations of the variational property. In this article, we employ variational autoencoders (VAEs) to build a compressed, flexible, and regular representation of the ground-state density profiles of various quantum models. Performing energy minimization in this compressed space allows us to avoid both numerical instabilities and variational biases due to excessive constraints. Our tests are performed on one-dimensional single-particle models from the literature in the field and, notably, on a three-dimensional disordered potential. In all cases, the ground-state energies are estimated with errors below the chemical accuracy and the density profiles are accurately reproduced without numerical artifacts. Furthermore, we show that it is possible to perform transfer learning, applying pre-trained VAEs to different potentials.
{"title":"Solving deep-learning density functional theory via variational autoencoders","authors":"Emanuele Costa, Giuseppe Scriva and Sebastiano Pilati","doi":"10.1088/2632-2153/ad611f","DOIUrl":"https://doi.org/10.1088/2632-2153/ad611f","url":null,"abstract":"In recent years, machine learning models, chiefly deep neural networks, have revealed suited to learn accurate energy-density functionals from data. However, problematic instabilities have been shown to occur in the search of ground-state density profiles via energy minimization. Indeed, any small noise can lead astray from realistic profiles, causing the failure of the learned functional and, hence, strong violations of the variational property. In this article, we employ variational autoencoders (VAEs) to build a compressed, flexible, and regular representation of the ground-state density profiles of various quantum models. Performing energy minimization in this compressed space allows us to avoid both numerical instabilities and variational biases due to excessive constraints. Our tests are performed on one-dimensional single-particle models from the literature in the field and, notably, on a three-dimensional disordered potential. In all cases, the ground-state energies are estimated with errors below the chemical accuracy and the density profiles are accurately reproduced without numerical artifacts. Furthermore, we show that it is possible to perform transfer learning, applying pre-trained VAEs to different potentials.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"286 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-17DOI: 10.1088/2632-2153/ad612b
Jian-Gang Kong, Ke-Lin Zhao, Jian Li, Qing-Xu Li, Yu Liu, Rui Zhang, Jia-Ji Zhu and Kai Chang
Supervised machine learning algorithms, such as graph neural networks (GNN), have successfully predicted material properties. However, the superior performance of GNN usually relies on end-to-end learning on large material datasets, which may lose the physical insight of multi-scale information about materials. And the process of labeling data consumes many resources and inevitably introduces errors, which constrains the accuracy of prediction. We propose to train the GNN model by self-supervised learning on the node and edge information of the crystal graph. Compared with the popular manually constructed material descriptors, the self-supervised atomic representation can reach better prediction performance on material properties. Furthermore, it may provide physical insights by tuning the range information. Applying the self-supervised atomic representation on the magnetic moment datasets, we show how they can extract rules and information from the magnetic materials. To incorporate rich physical information into the GNN model, we develop the node embedding graph neural networks (NEGNN) framework and show significant improvements in the prediction performance. The self-supervised material representation and the NEGNN framework may investigate in-depth information from materials and can be applied to small datasets with increased prediction accuracy.
{"title":"Self-supervised representations and node embedding graph neural networks for accurate and multi-scale analysis of materials","authors":"Jian-Gang Kong, Ke-Lin Zhao, Jian Li, Qing-Xu Li, Yu Liu, Rui Zhang, Jia-Ji Zhu and Kai Chang","doi":"10.1088/2632-2153/ad612b","DOIUrl":"https://doi.org/10.1088/2632-2153/ad612b","url":null,"abstract":"Supervised machine learning algorithms, such as graph neural networks (GNN), have successfully predicted material properties. However, the superior performance of GNN usually relies on end-to-end learning on large material datasets, which may lose the physical insight of multi-scale information about materials. And the process of labeling data consumes many resources and inevitably introduces errors, which constrains the accuracy of prediction. We propose to train the GNN model by self-supervised learning on the node and edge information of the crystal graph. Compared with the popular manually constructed material descriptors, the self-supervised atomic representation can reach better prediction performance on material properties. Furthermore, it may provide physical insights by tuning the range information. Applying the self-supervised atomic representation on the magnetic moment datasets, we show how they can extract rules and information from the magnetic materials. To incorporate rich physical information into the GNN model, we develop the node embedding graph neural networks (NEGNN) framework and show significant improvements in the prediction performance. The self-supervised material representation and the NEGNN framework may investigate in-depth information from materials and can be applied to small datasets with increased prediction accuracy.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"64 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}