Intelligent processing of electromagnetic data using detrended and identification
Pub Date : 2023-11-13 DOI: 10.1088/2632-2153/ad0c40
Xian Zhang, Diquan Li, Bei Liu, Yanfang Hu, Yao Mo
Abstract The application of the electromagnetic method has accelerated due to the demand for mineral resource development; however, strong electromagnetic interference seriously degrades data quality, resolution, and detection performance. To suppress this interference, this paper proposes an intelligent processing method based on detrending and identification and applies it to wide field electromagnetic method (WFEM) data. First, we combined the improved intrinsic time-scale decomposition (IITD) and detrended fluctuation analysis (DFA) algorithms to remove the trend noise. Then, we extracted the time-domain characteristics of the WFEM data after trend-noise removal. Next, the arithmetic optimization algorithm (AOA) was used to search for the optimal smoothing factor of the probabilistic neural network (PNN), enabling intelligent separation of noise data from effective WFEM data. Finally, the Fourier transform was applied to extract the spectrum amplitudes of the effective frequency points from the reconstructed WFEM data, and the electric field curve was obtained. In these studies and applications, the fuzzy c-means (FCM) and PNN algorithms are contrasted. The results show that the trend noise can be adaptively extracted and eliminated, abnormal waveforms and noise interference can be intelligently identified, the reconstructed WFEM data effectively recover the pseudo-random signal waveform, and the shape of the electric field curves is more stable. Simulation experiments and field applications verify that the proposed method can provide technical support for deep underground exploration.
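For readers unfamiliar with the detrending step, the sketch below shows plain detrended fluctuation analysis in NumPy: the DFA scaling exponent separates trend-dominated segments from noise-like ones. It is only an illustration of DFA itself, not the authors' IITD-DFA combination; the synthetic segment and window sizes are hypothetical.

```python
import numpy as np

def dfa_fluctuation(signal, window_sizes):
    """Standard detrended fluctuation analysis: returns F(n) for each window size n."""
    profile = np.cumsum(signal - np.mean(signal))   # integrated, mean-removed series
    fluctuations = []
    for n in window_sizes:
        n_windows = len(profile) // n
        rms = []
        for w in range(n_windows):
            seg = profile[w * n:(w + 1) * n]
            x = np.arange(n)
            trend = np.polyval(np.polyfit(x, seg, 1), x)  # local linear trend per window
            rms.append(np.sqrt(np.mean((seg - trend) ** 2)))
        fluctuations.append(np.mean(rms))
    return np.array(fluctuations)

# The scaling exponent (log-log slope) flags trend-dominated data (alpha well above 1)
# versus noise-like data; here the segment is a hypothetical WFEM-like toy signal.
t = np.arange(4096)
segment = np.sin(2 * np.pi * t / 512) + 0.5 * np.random.randn(len(t))
sizes = np.array([16, 32, 64, 128, 256])
F = dfa_fluctuation(segment, sizes)
alpha = np.polyfit(np.log(sizes), np.log(F), 1)[0]
print(f"DFA scaling exponent alpha = {alpha:.2f}")
```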
{"title":"Intelligent processing of electromagnetic data using detrended and identification","authors":"Xian Zhang, Diquan Li, Bei Liu, Yanfang Hu, Yao Mo","doi":"10.1088/2632-2153/ad0c40","DOIUrl":"https://doi.org/10.1088/2632-2153/ad0c40","url":null,"abstract":"Abstract The application of the electromagnetic method has accelerated due to the demand for the development of mineral resource, however the strong electromagnetic interference seriously lowers the data quality, resolution and detect effect. To suppress the electromagnetic interference, this paper proposes an intelligent processing method based on detrended and identification, and applies for wide field electromagnetic method (WFEM) data. First, we combined the improved intrinsic time scale decomposition (IITD) and detrended fluctuation analysis (DFA) algorithm for removing the trend noise. Then, we extracted the time domain characteristics of the WFEM data after removing the trend noise. Next, the arithmetic optimization algorithm (AOA) was utilized to search for the optimal smoothing factor of the probabilistic neural network (PNN) algorithm, which realized to intelligently identify the noise data and WFEM effective data. Finally, The Fourier transform was performed to extract the spectrum amplitude of the effective frequency points from the reconstructed WFEM data, and the electric field curve was obtained. In these studies and applications, the fuzzy c-mean (FCM) and PNN algorithm are contrasted. The proposed method indicated that the trend noise can be adaptively extracted and eliminated, the abnormal waveform or noise interference can be intelligently identified, the reconstructed WFEM data can effectively recover the pseudo-random signal waveform, and the shape of electric field curves were more stable. Simulation experiments and measured applications has verified that the proposed method can provide technical support for deep underground exploration.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"59 16","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136281910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural network Gaussian processes as efficient models of potential energy surfaces for polyatomic molecules
Pub Date : 2023-11-10 DOI: 10.1088/2632-2153/ad0652
Jun Dai, Roman V Krems
Abstract Kernel models of potential energy surfaces (PESs) for polyatomic molecules are often restricted by a specific choice of the kernel function. This can be avoided by optimizing the complexity of the kernel function. For regression problems with very expensive data, the functional form of the model kernels can be optimized in the Gaussian process (GP) setting through compositional function search guided by the Bayesian information criterion. However, the compositional kernel search is computationally demanding and relies on greedy strategies, which may yield sub-optimal kernels. An alternative strategy of increasing complexity of GP kernels treats a GP as a Bayesian neural network (NN) with a variable number of hidden layers, which yields NNGP models. Here, we present a direct comparison of GP models with composite kernels and NNGP models for applications aiming at the construction of global PES for polyatomic molecules. We show that NNGP models of PES can be trained much more efficiently and yield better generalization accuracy without relying on any specific form of the kernel function. We illustrate that NNGP models trained by distributions of energy points at low energies produce accurate predictions of PES at high energies. We also illustrate that NNGP models can extrapolate in the input variable space by building the free energy surface of the Heisenberg model trained in the paramagnetic phase and validated in the ferromagnetic phase. By construction, composite kernels yield more accurate models than kernels with a fixed functional form. Therefore, by illustrating that NNGP models outperform GP models with composite kernels, our work suggests that NNGP models should be a preferred choice of kernel models for PES.
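As a rough illustration of what an NNGP model is, the sketch below implements the ReLU NNGP kernel recursion (the covariance of an infinitely wide fully-connected network) and uses it for GP regression on toy data. The descriptors, targets, and hyperparameters (sigma_w2, sigma_b2, depth) are illustrative, not those used in the paper.

```python
import numpy as np

def nngp_kernel(X1, X2, depth=3, sigma_w2=1.6, sigma_b2=0.1):
    """ReLU NNGP kernel: recursion over the hidden layers of an infinitely wide network."""
    d = X1.shape[1]
    K = sigma_b2 + sigma_w2 * X1 @ X2.T / d                  # layer-0 cross-covariance
    K11 = sigma_b2 + sigma_w2 * np.sum(X1 * X1, 1) / d        # layer-0 variances
    K22 = sigma_b2 + sigma_w2 * np.sum(X2 * X2, 1) / d
    for _ in range(depth):
        norm = np.sqrt(np.outer(K11, K22))
        cos_t = np.clip(K / norm, -1.0, 1.0)
        theta = np.arccos(cos_t)
        K = sigma_b2 + sigma_w2 / (2 * np.pi) * norm * (np.sin(theta) + (np.pi - theta) * cos_t)
        K11 = sigma_b2 + sigma_w2 / 2 * K11                   # diagonal recursion (theta = 0)
        K22 = sigma_b2 + sigma_w2 / 2 * K22
    return K

# GP regression with the NNGP kernel on toy "energy point" data (hypothetical descriptors).
rng = np.random.default_rng(0)
X_train = rng.uniform(-1, 1, size=(50, 3))
y_train = np.sin(X_train.sum(axis=1))        # stand-in for PES energies
X_test = rng.uniform(-1, 1, size=(5, 3))
K_tt = nngp_kernel(X_train, X_train) + 1e-6 * np.eye(50)
K_st = nngp_kernel(X_test, X_train)
y_pred = K_st @ np.linalg.solve(K_tt, y_train)
print(y_pred)
```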
{"title":"Neural network Gaussian processes as efficient models of potential energy surfaces for polyatomic molecules","authors":"Jun Dai, Roman V Krems","doi":"10.1088/2632-2153/ad0652","DOIUrl":"https://doi.org/10.1088/2632-2153/ad0652","url":null,"abstract":"Abstract Kernel models of potential energy surfaces (PESs) for polyatomic molecules are often restricted by a specific choice of the kernel function. This can be avoided by optimizing the complexity of the kernel function. For regression problems with very expensive data, the functional form of the model kernels can be optimized in the Gaussian process (GP) setting through compositional function search guided by the Bayesian information criterion. However, the compositional kernel search is computationally demanding and relies on greedy strategies, which may yield sub-optimal kernels. An alternative strategy of increasing complexity of GP kernels treats a GP as a Bayesian neural network (NN) with a variable number of hidden layers, which yields NNGP models. Here, we present a direct comparison of GP models with composite kernels and NNGP models for applications aiming at the construction of global PES for polyatomic molecules. We show that NNGP models of PES can be trained much more efficiently and yield better generalization accuracy without relying on any specific form of the kernel function. We illustrate that NNGP models trained by distributions of energy points at low energies produce accurate predictions of PES at high energies. We also illustrate that NNGP models can extrapolate in the input variable space by building the free energy surface of the Heisenberg model trained in the paramagnetic phase and validated in the ferromagnetic phase. By construction, composite kernels yield more accurate models than kernels with a fixed functional form. Therefore, by illustrating that NNGP models outperform GP models with composite kernels, our work suggests that NNGP models should be a preferred choice of kernel models for PES.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"71 26","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135088047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
How deep convolutional neural networks lose spatial information with training
Pub Date : 2023-11-09 DOI: 10.1088/2632-2153/ad092c
Umberto Maria Tomasini, Leonardo Petrini, Francesco Cagnetta, Matthieu Wyart
Abstract A central question of machine learning is how deep nets manage to learn tasks in high dimensions. An appealing hypothesis is that they achieve this feat by building a representation of the data where information irrelevant to the task is lost. For image datasets, this view is supported by the observation that after (and not before) training, the neural representation becomes less and less sensitive to diffeomorphisms acting on images as the signal propagates through the network. This loss of sensitivity correlates with performance and, surprisingly, correlates with a gain of sensitivity to white noise acquired during training. Which mechanisms learned by convolutional neural networks (CNNs) are responsible for these phenomena? In particular, why is the sensitivity to noise heightened with training? Our approach consists of two steps. (1) Analyzing the layer-wise representations of trained CNNs, we disentangle the roles of spatial pooling and channel pooling in decreasing their sensitivity to image diffeomorphisms while increasing their sensitivity to noise. (2) We introduce model scale-detection tasks, which qualitatively reproduce the phenomena reported in our empirical analysis. In these models we can assess quantitatively how spatial pooling affects these sensitivities. We find that the increased sensitivity to noise observed in deep ReLU networks is a mechanistic consequence of the perturbing noise piling up during spatial pooling, after being rectified by ReLU units. Using odd activation functions such as tanh drastically reduces the CNNs' sensitivity to noise.
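A minimal way to probe the two sensitivities discussed above is to compare how much a network's representation moves under a smooth deformation versus under white noise of equal pixel-space norm. The PyTorch sketch below does this for a small stand-in CNN; the deformation model, the network, and the simple ratio it reports are simplified illustrations, not the paper's exact measures.

```python
import torch, torch.nn as nn, torch.nn.functional as F

def relative_sensitivity(model, x, eps=0.05):
    """Ratio of representation change under a smooth deformation vs equal-norm white noise."""
    B, C, H, W = x.shape
    # Identity sampling grid perturbed by a low-frequency displacement field (smooth deformation).
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).repeat(B, 1, 1, 1)
    low_freq = F.interpolate(torch.randn(B, 2, 4, 4), size=(H, W), mode="bilinear", align_corners=True)
    x_def = F.grid_sample(x, grid + eps * low_freq.permute(0, 2, 3, 1), align_corners=True)
    # White noise rescaled to the same pixel-space norm as the deformation.
    delta = x_def - x
    noise = torch.randn_like(x)
    noise = noise / noise.flatten(1).norm(dim=1).view(-1, 1, 1, 1) \
            * delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
    with torch.no_grad():
        f0, f_def, f_noise = model(x), model(x_def), model(x + noise)
    d_diff = (f_def - f0).flatten(1).norm(dim=1).mean()
    d_noise = (f_noise - f0).flatten(1).norm(dim=1).mean()
    return (d_diff / d_noise).item()   # < 1: more stable to diffeomorphisms than to noise

# Hypothetical small CNN standing in for a trained network.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AvgPool2d(2),
                      nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1), nn.Flatten())
x = torch.rand(8, 3, 32, 32)
print("relative sensitivity:", relative_sensitivity(model, x))
```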
{"title":"How deep convolutional neural networks lose spatial information with training","authors":"Umberto Maria Tomasini, Leonardo Petrini, Francesco Cagnetta, Matthieu Wyart","doi":"10.1088/2632-2153/ad092c","DOIUrl":"https://doi.org/10.1088/2632-2153/ad092c","url":null,"abstract":"Abstract A central question of machine learning is how deep nets manage to learn tasks in high dimensions. An appealing hypothesis is that they achieve this feat by building a representation of the data where information irrelevant to the task is lost. For image datasets, this view is supported by the observation that after (and not before) training, the neural representation becomes less and less sensitive to diffeomorphisms acting on images as the signal propagates through the network. This loss of sensitivity correlates with performance and surprisingly correlates with a gain of sensitivity to white noise acquired during training. Which are the mechanisms learned by convolutional neural networks (CNNs) responsible for the these phenomena? In particular, why is the sensitivity to noise heightened with training? Our approach consists of two steps. (1) Analyzing the layer-wise representations of trained CNNs, we disentangle the role of spatial pooling in contrast to channel pooling in decreasing their sensitivity to image diffeomorphisms while increasing their sensitivity to noise. (2) We introduce model scale-detection tasks, which qualitatively reproduce the phenomena reported in our empirical analysis. In these models we can assess quantitatively how spatial pooling affects these sensitivities. We find that the increased sensitivity to noise observed in deep ReLU networks is a mechanistic consequence of the perturbing noise piling up during spatial pooling, after being rectified by ReLU units. Using odd activation functions like tanh drastically reduces the CNNs’ sensitivity to noise.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135191273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Looking at the posterior: accuracy and uncertainty of neural-network predictions
Pub Date : 2023-11-08 DOI: 10.1088/2632-2153/ad0ab4
Hampus Linander, Oleksandr Balabanov, Henry Yang, Bernhard Mehlig
Abstract Bayesian inference can quantify uncertainty in the predictions of neural networks using posterior distributions for model parameters and network output. By looking at these posterior distributions, one can separate the origin of uncertainty into aleatoric and epistemic contributions. One goal of uncertainty quantification is to inform on prediction accuracy.
Here we show that prediction accuracy depends on both epistemic and aleatoric uncertainty in an intricate fashion that cannot be understood in terms of marginalized uncertainty distributions alone. How the accuracy relates to epistemic and aleatoric uncertainties depends not only on the model architecture, but also on the properties of the dataset.
We discuss the significance of these results for active learning and introduce a novel acquisition function that outperforms common uncertainty-based methods. 
To arrive at our results, we approximated the posteriors using deep ensembles, for fully-connected, convolutional and attention-based neural networks.
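For context, a common way to separate the two contributions from a deep ensemble of classifiers is the entropy decomposition sketched below: total predictive entropy splits into an expected-entropy (aleatoric) term and a mutual-information (epistemic) term. This is a generic recipe, not necessarily the exact estimator used in the paper; the ensemble outputs are synthetic.

```python
import numpy as np

def ensemble_uncertainties(member_probs):
    """Decompose predictive uncertainty of an ensemble (members x samples x classes)
    into aleatoric (expected entropy) and epistemic (mutual information) parts."""
    eps = 1e-12
    mean_p = member_probs.mean(axis=0)                                    # posterior-averaged prediction
    total = -np.sum(mean_p * np.log(mean_p + eps), axis=-1)               # predictive entropy
    aleatoric = -np.sum(member_probs * np.log(member_probs + eps), axis=-1).mean(axis=0)
    epistemic = total - aleatoric                                         # mutual information
    return aleatoric, epistemic

# Toy example: 5 ensemble members, 3 samples, 4 classes (softmax outputs).
rng = np.random.default_rng(1)
logits = rng.normal(size=(5, 3, 4))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
alea, epis = ensemble_uncertainties(probs)
print("aleatoric:", alea, "epistemic:", epis)
```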
{"title":"Looking at the posterior: accuracy and uncertainty of neural-network predictions","authors":"Hampus Linander, Oleksandr Balabanov, Henry Yang, Bernhard Mehlig","doi":"10.1088/2632-2153/ad0ab4","DOIUrl":"https://doi.org/10.1088/2632-2153/ad0ab4","url":null,"abstract":"Abstract Bayesian inference can quantify uncertainty in the predictions of neural networks using posterior distributions for model parameters and network output. By looking at these posterior distributions, one can separate the origin of uncertainty into aleatoric and epistemic contributions. One goal of uncertainty quantification is to inform on prediction accuracy.
Here we show that prediction accuracy depends on both epistemic and aleatoric uncertainty in an intricate fashion that cannot be understood in terms of marginalized uncertainty distributions alone. How the accuracy relates to epistemic and aleatoric uncertainties depends not only on the model architecture, but also on the properties of the dataset.
We discuss the significance of these results for active learning and introduce a novel acquisition function that outperforms common uncertainty-based methods. 
To arrive at our results, we approximated the posteriors using deep ensembles, for fully-connected, convolutional and attention-based neural networks.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135340664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Calibration of uncertainty in the active learning of machine learning force fields
Pub Date : 2023-11-08 DOI: 10.1088/2632-2153/ad0ab5
Adam Thomas-Mitchell, Glenn Ivan Hawe, Paul Popelier
Abstract FFLUX is a Machine Learning Force Field that uses the Maximum Expected Prediction Error (MEPE) active learning algorithm to improve the efficiency of model training. MEPE uses the predictive uncertainty of a Gaussian Process to balance exploration and exploitation when selecting the next training sample. However, the predictive uncertainty of a Gaussian Process is unlikely to be accurate or precise immediately after training. We hypothesize that calibrating the uncertainty quantification within MEPE will improve active learning performance. We develop and test two methods to improve uncertainty estimates: post-hoc calibration of predictive uncertainty using the CRUDE algorithm, and replacing the Gaussian Process with a Student-t Process. We investigate the impact of these methods on MEPE for single sample and batch sample active learning. Our findings suggest that post-hoc calibration does not improve the performance of active learning using the MEPE method. However, we do find that the Student-t Process can outperform active learning strategies and random sampling using a Gaussian Process if the training set is sufficiently large.
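The sketch below gives a simplified, MEPE-flavoured acquisition with scikit-learn: a leave-one-out error term (exploitation) is balanced against the GP predictive variance (exploration) by a factor alpha. It is a hypothetical simplification for illustration, not the exact MEPE formula or the FFLUX implementation; the data and kernel are toy choices.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def mepe_like_acquisition(gp, X_train, y_train, X_pool, alpha=0.5):
    """Expected-prediction-error style score: LOO error (exploitation) + predictive variance (exploration)."""
    # Leave-one-out residuals on the training set (refit per point; fine for small sets).
    loo = np.empty(len(X_train))
    for i in range(len(X_train)):
        mask = np.arange(len(X_train)) != i
        g = GaussianProcessRegressor(kernel=gp.kernel_, optimizer=None).fit(X_train[mask], y_train[mask])
        loo[i] = y_train[i] - g.predict(X_train[i:i + 1])[0]
    # Each candidate inherits the squared LOO error of its nearest training point as a bias proxy.
    nearest = np.argmin(np.linalg.norm(X_pool[:, None, :] - X_train[None, :, :], axis=-1), axis=1)
    bias2 = loo[nearest] ** 2
    _, std = gp.predict(X_pool, return_std=True)
    return alpha * bias2 + (1 - alpha) * std ** 2

rng = np.random.default_rng(0)
X_train = rng.uniform(-3, 3, size=(15, 1))
y_train = np.sin(X_train[:, 0]) + 0.1 * rng.normal(size=15)
X_pool = np.linspace(-3, 3, 200)[:, None]
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X_train, y_train)
scores = mepe_like_acquisition(gp, X_train, y_train, X_pool)
print("next sample to label:", X_pool[np.argmax(scores)])
```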
{"title":"Calibration of uncertainty in the active learning of machine learning force fields","authors":"Adam Thomas-Mitchell, Glenn Ivan Hawe, Paul Popelier","doi":"10.1088/2632-2153/ad0ab5","DOIUrl":"https://doi.org/10.1088/2632-2153/ad0ab5","url":null,"abstract":"Abstract FFLUX is a Machine Learning Force Field that uses the Maximum Expected Prediction Error (MEPE) active learning algorithm to improve the efficiency of model training. MEPE uses the predictive uncertainty of a Gaussian Process to balance exploration and exploitation when selecting the next training sample. However, the predictive uncertainty of a Gaussian Process is unlikely to be accurate or precise immediately after training. We hypothesize that calibrating the uncertainty quantification within MEPE will improve active learning performance. We develop and test two methods to improve uncertainty estimates: post-hoc calibration of predictive uncertainty using the CRUDE algorithm, and replacing the Gaussian Process with a Student-t Process. We investigate the impact of these methods on MEPE for single sample and batch sample active learning. Our findings suggest that post-hoc calibration does not improve the performance of active learning using the MEPE method. However, we do find that the Student-t Process can outperform active learning strategies and random sampling using a Gaussian Process if the training set is sufficiently large.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" 31","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135340811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Machine-learning approach to setting optimal thresholds and its application in rolling bearing fault diagnosis
Pub Date : 2023-11-08 DOI: 10.1088/2632-2153/ad0ab3
Yaochi Tang, Kuohao Li
Abstract Bearings are among the critical components of mechanical equipment. They induce most equipment faults, and their health status directly impacts the overall performance of the equipment. Therefore, effective bearing fault diagnosis is essential, as it helps maintain equipment stability and increases economic benefits through timely maintenance. Currently, most studies focus on extracting fault features, with limited attention to establishing fault thresholds. As a result, these thresholds are difficult to use in the automatic monitoring and diagnosis of intelligent devices. This study employed generalized fractal dimensions (GFDs) to effectively extract features of the time-domain vibration signals of bearings. The optimal fault threshold model was developed using the receiver operating characteristic (ROC) curve, which served as the baseline for anomaly judgment. The extracted fault threshold model was verified in two bearing operation experiments. The experimental results revealed different damaged positions and components in the two experiments. The same fault threshold model was obtained using the method proposed in this study, and it effectively diagnosed the abnormal states within the signals. This finding confirms the effectiveness of the proposed diagnostic method.
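A minimal version of the threshold-setting idea, assuming a scalar health indicator and labelled healthy/damaged records, is to pick the point on the ROC curve that maximizes Youden's J statistic. The indicator values below are synthetic, and this criterion is one common choice rather than necessarily the paper's.

```python
import numpy as np
from sklearn.metrics import roc_curve

def optimal_threshold(feature_values, labels):
    """Pick the fault-alarm threshold maximizing Youden's J = TPR - FPR on the ROC curve."""
    fpr, tpr, thresholds = roc_curve(labels, feature_values)
    best = np.argmax(tpr - fpr)
    return thresholds[best]

# Toy example: a hypothetical fractal-dimension-based health indicator,
# higher for damaged bearings (label 1) than for healthy ones (label 0).
rng = np.random.default_rng(0)
healthy = rng.normal(1.2, 0.10, 300)
damaged = rng.normal(1.6, 0.15, 100)
values = np.concatenate([healthy, damaged])
labels = np.concatenate([np.zeros(300), np.ones(100)])
thr = optimal_threshold(values, labels)
print(f"alarm threshold on the indicator: {thr:.3f}")
```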
{"title":"A Machine-learning approach to setting optimal thresholds and its application in rolling bearing fault diagnosis","authors":"Yaochi Tang, Kuohao Li","doi":"10.1088/2632-2153/ad0ab3","DOIUrl":"https://doi.org/10.1088/2632-2153/ad0ab3","url":null,"abstract":"Abstract Bearings are one of the critical components of any mechanical equipment. They induce most equipment faults, and their health status directly impacts the overall performance of equipment. Therefore, effective bearing fault diagnosis is essential, as it helps maintain the equipment stability, increasing economic benefits through timely maintenance. Currently, most studies focus on extracting fault features, with limited attention to establishing fault thresholds. As a result, these thresholds are challenging to utilize in the automatic monitoring diagnosis of intelligent devices. This study employed the generalized fractal dimensions (GFDs) to effectively extract the feature of time-domain vibration signals of bearings. The optimal fault threshold model was developed using the receiver operating characteristic curve (ROC curve), which served as the baseline of exception judgment. The extracted fault threshold model was verified using two bearing operation experiments. The experimental results revealed different damaged positions and components observed in the two experiments. The same fault threshold model was obtained using the method proposed in this study, and it effectively diagnosed the abnormal states within the signals. This finding confirms the effectiveness of the diagnostic method proposed in this study.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135340967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Artificial neural networks exploiting point cloud data for fragmented solid objects classification
Pub Date : 2023-11-06 DOI: 10.1088/2632-2153/ad035e
Alessandro Baiocchi, Stefano Giagu, Christian Napoli, Marco SERRA, Pietro Nardelli, Martina Valleriani
Abstract This paper presents a novel approach for fragmented solid object classification exploiting neural networks based on point clouds. This work is the initial step of a project in collaboration with the Institution of ‘Ente Parco Archeologico del Colosseo’ in Rome, which aims to reconstruct ancient artifacts from their fragments. We built from scratch a synthetic dataset (DS) of fragments of different 3D objects, including aging effects. We used this DS to train deep learning models for the task of classifying internal and external fragments. As model architectures, we adopted PointNet and the dynamic graph convolutional neural network, which take as input a point cloud representing the spatial geometry of a fragment, and we optimized model performance by adding features sensitive to local geometry characteristics. We tested the approach by performing several experiments to check the robustness and generalization capabilities of the models. Finally, we tested the models on a real case using 3D scans of artifacts preserved in different museums, artificially fragmented, obtaining good performance.
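To make the architecture choice concrete, the PyTorch sketch below is a stripped-down PointNet-style classifier: a shared per-point MLP followed by max pooling gives an order-invariant fragment descriptor that feeds a two-class head (internal vs external). Layer sizes and the toy batch are illustrative, and the extra local-geometry features mentioned above are omitted.

```python
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    """PointNet-style classifier: shared per-point MLP, max pooling, classification head."""
    def __init__(self, in_features=3, n_classes=2):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Conv1d(in_features, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 256, 1), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, n_classes))

    def forward(self, pts):                        # pts: (batch, n_points, features)
        x = self.point_mlp(pts.transpose(1, 2))    # shared MLP applied to every point
        x = x.max(dim=2).values                    # order-invariant global feature
        return self.head(x)

# Toy batch: 4 fragments of 1024 points each, xyz coordinates only; extra per-point
# descriptors (e.g. local curvature) would be appended as additional channels.
model = TinyPointNet(in_features=3, n_classes=2)
points = torch.rand(4, 1024, 3)
logits = model(points)
print(logits.shape)   # (4, 2) -> internal vs external fragment scores
```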
{"title":"Artificial neural networks exploiting point cloud data for fragmented solid objects classification","authors":"Alessandro Baiocchi, Stefano Giagu, Christian Napoli, Marco SERRA, Pietro Nardelli, Martina Valleriani","doi":"10.1088/2632-2153/ad035e","DOIUrl":"https://doi.org/10.1088/2632-2153/ad035e","url":null,"abstract":"Abstract This paper presents a novel approach for fragmented solid object classification exploiting neural networks based on point clouds. This work is the initial step of a project in collaboration with the Institution of ‘Ente Parco Archeologico del Colosseo’ in Rome, which aims to reconstruct ancient artifacts from their fragments. We built from scratch a synthetic dataset (DS) of fragments of different 3D objects including aging effects. We used this DS to train deep learning models for the task of classifying internal and external fragments. As model architectures, we adopted PointNet and dynamical graph convolutional neural network, which take as input a point cloud representing the spatial geometry of a fragment, and we optimized model performance by adding additional features sensitive to local geometry characteristics. We tested the approach by performing several experiments to check the robustness and generalization capabilities of the models. Finally, we test the models on a real case using a 3D scan of artifacts preserved in different museums, artificially fragmented, obtaining good performance.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"17 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135585228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Graph machine learning framework for depicting wavefunction on interface
Pub Date : 2023-11-02 DOI: 10.1088/2632-2153/ad0937
Sheng Chang, Ao Wu, Li Liu, Zifeng Wang, Shurong Pan, Jiangxue Huang, Qijun Huang, Jin He, Hao Wang
Abstract The wavefunction, as the basic hypothesis of quantum mechanics, describes the motion of particles and plays a pivotal role in determining physical properties at the atomic scale. However, conventional ways of obtaining it, such as density functional theory (DFT), require a considerable amount of calculation, which hinders wide application. Here, we propose an algorithmic framework based on graph neural networks (GNNs) to machine-learn the wavefunction of electrons. This framework first generates atomic features containing information about the chemical environment and geometric structure and subsequently constructs a scalable distribution map. For the first time, the visualization of the wavefunction at an interface is realized by machine learning (ML) methods, bypassing complex calculation and obscure comprehension. In this way, we vividly illustrate quantum mechanics, which can inspire theoretical exploration. As an intriguing case verifying the ability of our method, a novel quantum confinement phenomenon at interfaces based on graphene nanoribbons (GNRs) is uncovered. We believe that the versatility of this framework paves the way for swiftly linking quantum physics and atom-level structures.
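As a generic illustration of the kind of graph model involved (not the authors' framework), the sketch below is a minimal message-passing network in plain PyTorch that maps per-atom features on a molecular graph to one scalar per atom, a stand-in for a local wavefunction-related quantity. The atom features, graph, and layer sizes are hypothetical.

```python
import torch
import torch.nn as nn

class SimpleMPNN(nn.Module):
    """Minimal message-passing network: per-atom features are updated from neighbours,
    then mapped to a per-atom scalar (a stand-in for a local wavefunction amplitude)."""
    def __init__(self, n_feat=8, hidden=32, n_layers=3):
        super().__init__()
        self.embed = nn.Linear(n_feat, hidden)
        self.msg = nn.ModuleList([nn.Linear(2 * hidden, hidden) for _ in range(n_layers)])
        self.readout = nn.Linear(hidden, 1)

    def forward(self, x, edge_index):          # x: (n_atoms, n_feat), edge_index: (2, n_edges)
        h = torch.relu(self.embed(x))
        src, dst = edge_index
        for lin in self.msg:
            messages = torch.relu(lin(torch.cat([h[src], h[dst]], dim=1)))
            agg = torch.zeros_like(h).index_add_(0, dst, messages)   # sum messages per atom
            h = h + agg                         # residual update
        return self.readout(h).squeeze(-1)      # one value per atom

# Toy graph: 6 atoms in a ring with hypothetical chemical/geometric descriptors as inputs.
edges = torch.tensor([[i, (i + 1) % 6] for i in range(6)] + [[(i + 1) % 6, i] for i in range(6)]).T
atom_features = torch.rand(6, 8)
model = SimpleMPNN()
print(model(atom_features, edges))
```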
{"title":"Graph machine learning framework for depicting wavefunction on interface","authors":"Sheng Chang, Ao Wu, Li Liu, Zifeng Wang, Shurong Pan, Jiangxue Huang, Qijun Huang, Jin He, Hao Wang","doi":"10.1088/2632-2153/ad0937","DOIUrl":"https://doi.org/10.1088/2632-2153/ad0937","url":null,"abstract":"Abstract The wavefunction, as the basic hypothesis of quantum mechanics, describes the motion of particles and plays a pivotal role in determining physical properties at the atomic scale. However, its conventional acquisition method, such as density functional theory (DFT), requires a considerable amount of calculation, which brings numerous problems to wide application. Here, we propose an algorithmic framework based on graph neural network (GNN) to machine-learn the wavefunction of electrons. This framework primarily generates atomic features containing information about chemical environment and geometric structure and subsequently constructs a scalable distribution map. For the first time, the visualization of wavefunction of interface is realized by machine learning (ML) methods, bypassing complex calculation and obscure comprehension. In this way, we vividly illustrate quantum mechanics, which can inspire theoretical exploration. As an intriguing case to verify the ability of our method, a novel quantum confinement phenomenon on interfaces based on graphene nanoribbon (GNR) is uncovered. We believe that the versatility of this framework paves the way for swiftly linking quantum physics and atom-level structures.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"25 15","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135875561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fast Neural Network Inference on FPGAs for Triggering on Long-Lived Particles at Colliders
Pub Date : 2023-10-31 DOI: 10.1088/2632-2153/ad087a
Andrea Coccaro, Francesco Armando Di Bello, Stefano Giagu, Lucrezia Rambelli, Nicola Stocchetti
Abstract Experimental particle physics demands a sophisticated trigger and acquisition system capable of efficiently retaining the collisions of interest for further investigation. Heterogeneous computing with the employment of FPGA cards may emerge as a trending technology for the triggering strategy of the upcoming high-luminosity program of the Large Hadron Collider at CERN. In this context, we present two machine-learning algorithms for selecting events in which neutral long-lived particles decay within the detector volume, studying their accuracy and inference time when accelerated on commercially available Xilinx FPGA accelerator cards. The inference time is also compared with that of a CPU- and GPU-based hardware setup. The proposed algorithms are proven efficient for the considered benchmark physics scenario, and their accuracy is found not to degrade when accelerated on the FPGA cards. The results indicate that all tested architectures fit within the latency requirements of a second-level trigger farm and that exploiting accelerator technologies for real-time processing of particle-physics collisions is a promising research field that deserves additional investigations, in particular with machine-learning models with a large number of trainable parameters.
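The FPGA flow itself is beyond a short snippet, but the per-event latency comparison against CPU/GPU baselines can be sketched as below for a small dense trigger-like classifier in PyTorch; the layer widths, input size, and batch size are illustrative rather than the models studied in the paper.

```python
import time
import torch
import torch.nn as nn

# Small dense classifier of the kind used in trigger-level studies
# (layer widths here are illustrative, not the architectures from the paper).
model = nn.Sequential(nn.Linear(56, 64), nn.ReLU(),
                      nn.Linear(64, 32), nn.ReLU(),
                      nn.Linear(32, 1), nn.Sigmoid()).eval()

def mean_latency_us(model, batch_size=1, n_events=2000, device="cpu"):
    """Average per-call inference latency in microseconds on the given device."""
    model = model.to(device)
    x = torch.rand(batch_size, 56, device=device)
    with torch.no_grad():
        for _ in range(100):                 # warm-up iterations
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
        t0 = time.perf_counter()
        for _ in range(n_events):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
    return (time.perf_counter() - t0) / n_events * 1e6

print(f"CPU latency: {mean_latency_us(model):.1f} us/event")
if torch.cuda.is_available():
    print(f"GPU latency: {mean_latency_us(model, device='cuda'):.1f} us/event")
```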
{"title":"Fast Neural Network Inference on FPGAs for Triggering on Long-Lived Particles at Colliders","authors":"Andrea Coccaro, Francesco Armando Di Bello, Stefano Giagu, Lucrezia Rambelli, Nicola Stocchetti","doi":"10.1088/2632-2153/ad087a","DOIUrl":"https://doi.org/10.1088/2632-2153/ad087a","url":null,"abstract":"Abstract Experimental particle physics demands a sophisticated trigger and acquisition system capable to efficiently retain the collisions of interest for further investigation. Heterogeneous computing with the employment of FPGA cards may emerge as a trending technology for the triggering strategy of the upcoming high-luminosity program of the Large Hadron Collider at CERN. In this context, we present two machine-learning algorithms for selecting events where neutral long-lived particles decay within the detector volume studying their accuracy and inference time when accelerated on commercially available Xilinx FPGA accelerator cards. The inference time is also confronted with a CPU- and GPU-based hardware setup. The proposed new algorithms are proven efficient for the considered benchmark physics scenario and their accuracy is found to not degrade when accelerated on the FPGA cards. The results indicate that all tested architectures fit within the latency requirements of a second-level trigger farm and that exploiting accelerator technologies for realtime processing of particle-physics collisions is a promising research field that deserves additional investigations, in particular with machine-learning models with a large number of trainable parameters.
","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135808988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimating truncation effects of quantum bosonic systems using sampling algorithms
Pub Date : 2023-10-30 DOI: 10.1088/2632-2153/ad035c
Masanori Hanada, Junyu Liu, Enrico Rinaldi, Masaki Tezuka
Abstract To simulate bosons on a qubit- or qudit-based quantum computer, one has to regularize the theory by truncating infinite-dimensional local Hilbert spaces to finite dimensions. In the search for practical quantum applications, it is important to know how big the truncation errors can be. In general, it is not easy to estimate errors unless we have a good quantum computer. In this paper, we show that traditional sampling methods on classical devices, specifically Markov Chain Monte Carlo, can address this issue for a rather generic class of bosonic systems with a reasonable amount of computational resources available today. As a demonstration, we apply this idea to the scalar field theory on a two-dimensional lattice, with a size that goes beyond what is achievable using exact diagonalization methods. This method can be used to estimate the resources needed for realistic quantum simulations of bosonic theories, and also, to check the validity of the results of the corresponding quantum simulations.
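A minimal version of the idea, under simplified assumptions, is sketched below: Metropolis sampling of a 2D lattice scalar field, followed by measuring how often the field exceeds a candidate truncation bound. The lattice size, couplings, and cutoffs are illustrative, and the estimator is a stand-in for the paper's analysis rather than its actual procedure.

```python
import numpy as np

def metropolis_phi4(L=12, m2=1.0, lam=0.5, n_sweeps=1200, step=0.5, seed=0):
    """Metropolis sampling of a 2D lattice scalar field with action
    S = sum_x [ 0.5*sum_mu (phi(x+mu)-phi(x))^2 + 0.5*m2*phi(x)^2 + lam*phi(x)^4 ]."""
    rng = np.random.default_rng(seed)
    phi = np.zeros((L, L))
    samples = []
    for sweep in range(n_sweeps):
        for i in range(L):
            for j in range(L):
                old = phi[i, j]
                new = old + step * rng.normal()
                neigh = phi[(i + 1) % L, j] + phi[(i - 1) % L, j] + phi[i, (j + 1) % L] + phi[i, (j - 1) % L]
                # Local change of the action when phi(x) -> new.
                dS = (2.0 + 0.5 * m2) * (new ** 2 - old ** 2) - neigh * (new - old) \
                     + lam * (new ** 4 - old ** 4)
                if dS < 0 or rng.random() < np.exp(-dS):
                    phi[i, j] = new
        if sweep > 400 and sweep % 10 == 0:       # thermalization cut and thinning
            samples.append(phi.copy())
    return np.array(samples)

# Fraction of field values that would fall outside a finite truncation |phi| <= phi_max:
# a proxy for how severe the truncation error of a digitized simulation would be.
samples = metropolis_phi4()
for phi_max in (1.0, 2.0, 3.0):
    frac = np.mean(np.abs(samples) > phi_max)
    print(f"P(|phi| > {phi_max}) = {frac:.2e}")
```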
{"title":"Estimating truncation effects of quantum bosonic systems using sampling algorithms","authors":"Masanori Hanada, Junyu Liu, Enrico Rinaldi, Masaki Tezuka","doi":"10.1088/2632-2153/ad035c","DOIUrl":"https://doi.org/10.1088/2632-2153/ad035c","url":null,"abstract":"Abstract To simulate bosons on a qubit- or qudit-based quantum computer, one has to regularize the theory by truncating infinite-dimensional local Hilbert spaces to finite dimensions. In the search for practical quantum applications, it is important to know how big the truncation errors can be. In general, it is not easy to estimate errors unless we have a good quantum computer. In this paper, we show that traditional sampling methods on classical devices, specifically Markov Chain Monte Carlo, can address this issue for a rather generic class of bosonic systems with a reasonable amount of computational resources available today. As a demonstration, we apply this idea to the scalar field theory on a two-dimensional lattice, with a size that goes beyond what is achievable using exact diagonalization methods. This method can be used to estimate the resources needed for realistic quantum simulations of bosonic theories, and also, to check the validity of the results of the corresponding quantum simulations.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"143 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136018146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}