Pub Date: 2024-07-14, DOI: 10.1088/2632-2153/ad5f11
Wojciech G Stark, Cas van der Oord, Ilyes Batatia, Yaolong Zhang, Bin Jiang, Gábor Csányi and Reinhard J Maurer
Simulations of chemical reaction probabilities in gas-surface dynamics require the calculation of ensemble averages over many tens of thousands of reaction events to predict dynamical observables that can be compared to experiments. At the same time, the energy landscapes need to be accurately mapped, as small errors in barriers can lead to large deviations in reaction probabilities. This poses a particularly interesting challenge for machine learning interatomic potentials, which are becoming well-established tools to accelerate molecular dynamics simulations. We compare state-of-the-art machine learning interatomic potentials with a particular focus on their inference performance on CPUs and suitability for high-throughput simulation of reactive chemistry at surfaces. The considered models include polarizable atom interaction neural networks (PaiNN), recursively embedded atom neural networks (REANN), the MACE equivariant graph neural network, and atomic cluster expansion potentials (ACE). The models are applied to a dataset on reactive molecular hydrogen scattering on low-index surface facets of copper. All models are assessed for their accuracy, time-to-solution, and ability to simulate reactive sticking probabilities as a function of the rovibrational initial state and kinetic incidence energy of the molecule. REANN and MACE models provide the best balance between accuracy and time-to-solution and can be considered the current state of the art in gas-surface dynamics. PaiNN models require many features for the best accuracy, which causes significant losses in computational efficiency. ACE models provide the fastest time-to-solution; however, models trained on the existing dataset were not able to achieve sufficiently accurate predictions in all cases.
Title: Benchmarking of machine learning interatomic potentials for reactive hydrogen dynamics at metal surfaces
Journal: Machine Learning Science and Technology
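The need for tens of thousands of reaction events follows from simple counting statistics. As a minimal sketch, assuming each trajectory is recorded as a boolean "reacted or not" outcome (the function names here are hypothetical and not taken from the paper), a sticking probability and its binomial standard error can be estimated like this:

```python
import math


def sticking_probability(outcomes):
    """Estimate a sticking probability from per-trajectory outcomes.

    outcomes: iterable of booleans, True if the trajectory was reactive.
    Returns (estimate, binomial standard error of the estimate).
    """
    outcomes = list(outcomes)
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one trajectory")
    p = sum(outcomes) / n
    stderr = math.sqrt(p * (1.0 - p) / n)
    return p, stderr


def trajectories_needed(p, target_stderr):
    """Number of trajectories so that the binomial standard error
    sqrt(p * (1 - p) / N) does not exceed target_stderr."""
    return math.ceil(p * (1.0 - p) / target_stderr ** 2)
```

Under these assumptions, resolving a sticking probability near 0.01 to a standard error of 0.001 already takes on the order of 10^4 trajectories per initial state and incidence energy, which is why inference cost per force evaluation matters so much in this setting.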
Pub Date: 2024-07-10, DOI: 10.1088/2632-2153/ad5f13
Majd Ghrear, Peter Sadowski and Sven E Vahsen
We present the first method to probabilistically predict 3D direction in a deep neural network model. The probabilistic predictions are modeled as a heteroscedastic von Mises-Fisher distribution on the sphere, giving a simple way to quantify aleatoric uncertainty. This approach generalizes the cosine distance loss, which is a special case of our loss function when the uncertainty is assumed to be uniform across samples. We develop approximations required to make the likelihood function and gradient calculations stable. The method is applied to the task of predicting the 3D directions of electrons, the most complex signal in a class of experimental particle physics detectors designed to demonstrate the particle nature of dark matter and study solar neutrinos. Using simulated Monte Carlo data, the initial direction of recoiling electrons is inferred from their tortuous trajectories, as captured by the 3D detectors. For keV electrons in a 70% He 30% CO2 gas mixture at STP, the new approach achieves a mean cosine distance of 0.104 (26°) compared to 0.556 (64°) achieved by a non-machine learning algorithm. We show that the model is well-calibrated and accuracy can be increased further by removing samples with high predicted uncertainty. This advancement in probabilistic 3D directional learning could increase the sensitivity of directional dark matter detectors.
Title: Deep probabilistic direction prediction in 3D with applications to directional dark matter detectors
Journal: Machine Learning Science and Technology
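To make the link between the von Mises-Fisher likelihood and the cosine distance loss concrete, here is a minimal sketch of the negative log-likelihood on the unit sphere in 3D. It uses the standard density κ / (4π sinh κ) · exp(κ μ·x); the rewriting of log sinh κ to avoid overflow is one common choice and is an assumption here, not necessarily the approximation developed in the paper. The function name is hypothetical.

```python
import math


def vmf_nll(mu, x, kappa):
    """Negative log-likelihood of unit vector x under a 3D von Mises-Fisher
    distribution with unit mean direction mu and concentration kappa > 0."""
    dot = sum(m * xi for m, xi in zip(mu, x))
    # log(4*pi*sinh(kappa)/kappa), using
    # log(sinh(k)) = k + log(1 - exp(-2k)) - log(2)
    # so that large kappa does not overflow.
    log_norm = (
        math.log(2.0 * math.pi)
        + kappa
        + math.log(-math.expm1(-2.0 * kappa))
        - math.log(kappa)
    )
    return log_norm - kappa * dot
```

For a fixed κ shared by all samples, the normalisation term is a constant, so minimising this quantity is equivalent to minimising the cosine distance 1 − μ·x. Letting the network predict κ per sample is what makes the model heteroscedastic.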
Pub Date: 2024-07-10, DOI: 10.1088/2632-2153/ad5f17
Thang M Pham, Nam Do, Ha T T Pham, Hanh T Bui, Thang T Do and Manh V Hoang
Landslides, which can occur due to earthquakes and heavy rainfall, pose significant challenges across large areas. To effectively manage these disasters, it is crucial to have fast and reliable automatic detection methods for mapping landslides. In recent years, deep learning methods, particularly convolutional neural and fully convolutional networks, have been successfully applied to various fields, including landslide detection, with remarkable accuracy and high reliability. However, most of these models achieved high detection performance based on high-resolution satellite images. In this research, we introduce a modified Residual U-Net combined with the Convolutional Block Attention Module, a deep learning method, for automatic landslide mapping. The proposed method is trained and assessed using freely available data sets acquired from Sentinel-2 sensors, digital elevation models, and slope data from ALOS PALSAR with a spatial resolution of 10 m. Compared to the original ResU-Net model, the proposed architecture achieved higher accuracy, with the F1-score improving by 9.1% for the landslide class. Additionally, it offers a lower computational cost, with 1.38 giga multiply-accumulate operations per second (GMACS) needed to execute the model compared to 2.68 GMACS in the original model. The source code is available at https://github.com/manhhv87/LandSlideMapping.git.
Title: CResU-Net: a method for landslide mapping using deep learning
Journal: Machine Learning Science and Technology
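For readers less familiar with the metric quoted above, the F1-score of a single class is the harmonic mean of precision and recall. The following is a minimal sketch, assuming the counts are true positives, false positives, and false negatives for the landslide class (how the paper aggregates them, for example per pixel, is not stated in the abstract):

```python
def f1_score(tp, fp, fn):
    """F1-score from true positives, false positives and false negatives.

    Returns 0.0 when precision and recall are both undefined or zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

Because true negatives do not enter the formula, the score is not inflated by the large non-landslide background that dominates most scenes.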
Pub Date: 2024-07-09, DOI: 10.1088/2632-2153/ad563c
F Vaselli, F Cattafesta, P Asenov, A Rizzi
The simulation of high-energy physics collision events is a key element for data analysis at present and future particle accelerators. The comparison of simulation predictions to data allows looking for rare deviations that can be due to new phenomena not previously observed. We show that novel machine learning algorithms, specifically Normalizing Flows and Flow Matching, can be used to replicate accurate simulations from traditional approaches with several orders of magnitude of speed-up. The classical simulation chain starts from a physics process of interest, computes energy deposits of particles and electronics response, and finally employs the same reconstruction algorithms used for data. Eventually, the data are reduced to some high-level analysis format. Instead, we propose an end-to-end approach, simulating the final data format directly from physical generator inputs, skipping any intermediate steps. We use particle jets simulation as a benchmark for comparing both discrete and continuous Normalizing Flows models. The models are validated across a variety of metrics to identify the most accurate. We discuss the scaling of performance with the increase in training data, as well as the generalization power of these models on physical processes different from the training one. We investigate sampling multiple times from the same physical generator inputs, a procedure we name oversampling, and we show that it can effectively reduce the statistical uncertainties of a dataset. This class of ML algorithms is found to be capable of learning the expected detector response independently of the physical input process. The speed and accuracy of the models, coupled with the stability of the training procedure, make them a compelling tool for the needs of current and future experiments.
Title: End-to-end simulation of particle physics events with flow matching and generator oversampling
Journal: Machine Learning Science and Technology
Pub Date: 2024-07-04, DOI: 10.1088/2632-2153/ad594a
Matthias Kellner and Michele Ceriotti
Statistical learning algorithms provide a generally-applicable framework to sidestep time-consuming experiments, or accurate physics-based modeling, but they introduce a further source of error on top of the intrinsic limitations of the experimental or theoretical setup. Uncertainty estimation is essential to quantify this error, and to make application of data-centric approaches more trustworthy. To ensure that uncertainty quantification is used widely, one should aim for algorithms that are accurate, but also easy to implement and apply. In particular, including uncertainty quantification on top of an existing architecture should be straightforward, and add minimal computational overhead. Furthermore, it should be easy to manipulate or combine multiple machine-learning predictions, propagating uncertainty over further modeling steps. We compare several well-established uncertainty quantification frameworks against these requirements, and propose a practical approach, which we dub direct propagation of shallow ensembles, that provides a good compromise between ease of use and accuracy. We present benchmarks for generic datasets, and an in-depth study of applications to the field of atomistic machine learning for chemistry and materials. These examples underscore the importance of using a formulation that allows propagating errors without making strong assumptions on the correlations between different predictions of the model.
Title: Uncertainty quantification by direct propagation of shallow ensembles
Journal: Machine Learning Science and Technology
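The point about correlations can be illustrated with a toy calculation. This is a sketch under stated assumptions, not the authors' implementation: each ensemble member is assumed to supply one prediction per quantity, and the function names are hypothetical. Propagating every member through the derived quantity keeps the correlations between predictions, whereas combining per-quantity variances assumes independence.

```python
import math
import statistics


def propagate_members(members_a, members_b, fn):
    """Apply fn to each ensemble member's pair of predictions and
    return (mean, sample standard deviation) of the derived quantity."""
    derived = [fn(a, b) for a, b in zip(members_a, members_b)]
    return statistics.fmean(derived), statistics.stdev(derived)


def independent_difference_std(members_a, members_b):
    """Standard deviation of a - b if the two predictions are (wrongly
    or rightly) treated as uncorrelated: variances simply add."""
    return math.sqrt(
        statistics.variance(members_a) + statistics.variance(members_b)
    )


# Toy ensemble: member-to-member shifts are identical for both quantities.
energies_a = [1.0, 1.1, 0.9, 1.2]
energies_b = [2.0, 2.1, 1.9, 2.2]
mean_diff, std_diff = propagate_members(
    energies_a, energies_b, lambda a, b: a - b
)
naive_std = independent_difference_std(energies_a, energies_b)
```

In this toy case the member-wise difference is essentially constant, so `std_diff` is close to zero, while `naive_std` stays well above it. Energy differences between similar structures are a typical situation where such correlated errors cancel.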
Pub Date: 2024-07-04, DOI: 10.1088/2632-2153/ad5bbf
Carlo Abate, Sergio Decherchi and Andrea Cavalli
Drug design is both a time-consuming and expensive endeavour. Computational strategies offer viable options to address this task; deep learning approaches in particular are gaining traction for their capability of dealing with chemical structures. A straightforward way to represent such structures is via their molecular graph, which in turn can be naturally processed by graph neural networks. This paper introduces AMCG, a dual atomic-molecular, conditional, latent-space, generative model built around graph processing layers able to support both unconditional and conditional molecular graph generation. Among other features, AMCG is a one-shot model allowing for fast sampling, explicit atomic type histogram assignation and property optimization via gradient ascent. The model was trained on the Quantum Machines 9 (QM9) and ZINC datasets, achieving state-of-the-art performance. Together with classic benchmarks, AMCG was also tested by generating large-scale sampled sets, showing robustness in terms of sustainable throughput of valid, novel and unique molecules.
Title: AMCG: a graph dual atomic-molecular conditional molecular generator
Journal: Machine Learning Science and Technology
Pub Date: 2024-07-03, DOI: 10.1088/2632-2153/ad56f9
Shailesh Lal, Suvajit Majumder and Evgeny Sobko
We provide a novel neural network architecture that can: i) output the R-matrix for a given quantum integrable spin chain, ii) search for an integrable Hamiltonian and the corresponding R-matrix under assumptions of certain symmetries or other restrictions, iii) explore the space of Hamiltonians around already learned models and reconstruct the family of integrable spin chains to which they belong. The neural network training is done by minimizing loss functions encoding the Yang–Baxter equation, regularity and other model-specific restrictions such as hermiticity. Holomorphy is implemented via the choice of activation functions. We demonstrate the operation of our neural network on spin chains of difference form with two-dimensional local space. In particular, we reconstruct the R-matrices for all 14 classes. We also demonstrate its utility as an Explorer, scanning a certain subspace of Hamiltonians and identifying integrable classes after clusterisation. The last strategy can be used in future to carve out the map of integrable spin chains with higher dimensional local space and in more general settings where no analytical methods are available.
Title: The R-mAtrIx Net
Journal: Machine Learning Science and Technology
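For reference, the constraints that the loss functions encode can be written out for R-matrices of difference form. The Yang–Baxter equation and the regularity condition below are the standard ones; the particular residual norm and the way spectral parameters are sampled are assumptions made for illustration and are not taken from the abstract.

```latex
% Yang--Baxter equation for an R-matrix of difference form,
% acting on the triple tensor product V \otimes V \otimes V:
R_{12}(u - v)\, R_{13}(u)\, R_{23}(v)
  = R_{23}(v)\, R_{13}(u)\, R_{12}(u - v)

% Regularity: at zero spectral parameter R reduces to the permutation operator P
R(0) = P

% One possible residual to minimise over sampled pairs (u, v)  (assumed form):
\mathcal{L}_{\mathrm{YB}}
  = \bigl\lVert
      R_{12}(u - v)\, R_{13}(u)\, R_{23}(v)
      - R_{23}(v)\, R_{13}(u)\, R_{12}(u - v)
    \bigr\rVert
```

Here the subscripts indicate on which two of the three tensor factors R acts non-trivially. For a two-dimensional local space each term is an 8 × 8 matrix, so the residual is cheap to evaluate at many sampled (u, v).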
Pub Date: 2024-07-02, DOI: 10.1088/2632-2153/ad52e7
Giles C Strong, Maxime Lagrange, Aitor Orio, Anna Bordignon, Florian Bury, Tommaso Dorigo, Andrea Giammanco, Mariam Heikal, Jan Kieseler, Max Lamparth, Pablo Martínez Ruíz del Árbol, Federico Nardi, Pietro Vischia and Haitham Zaraket
We describe a software package, TomOpt, developed to optimise the geometrical layout and specifications of detectors designed for tomography by scattering of cosmic-ray muons. The software exploits differentiable programming for the modeling of muon interactions with detectors and scanned volumes, the inference of volume properties, and the optimisation cycle performing the loss minimisation. In doing so, we provide the first demonstration of end-to-end-differentiable and inference-aware optimisation of particle physics instruments. We study the performance of the software on a relevant benchmark scenario and discuss its potential applications. Our code is available on GitHub (Strong et al 2024 available at: https://github.com/GilesStrong/tomopt).
Title: TomOpt: differential optimisation for task- and constraint-aware design of particle detectors in the context of muon tomography
Journal: Machine Learning Science and Technology
Pub Date : 2024-07-01 DOI: 10.1088/2632-2153/ad5926
Paul Hagemann, Johannes Hertrich, Maren Casfor, Sebastian Heidenreich and Gabriele Steidl
We develop an algorithm for jointly estimating the posterior and the noise parameters in Bayesian inverse problems, which is motivated by indirect measurements and applications from nanometrology with a mixed noise model. We propose to solve the problem by an expectation maximization (EM) algorithm. Based on the current noise parameters, we learn in the E-step a conditional normalizing flow that approximates the posterior. In the M-step, we propose to find the noise parameter updates again by an EM algorithm, which has analytical formulas. We compare the training of the conditional normalizing flow with the forward and reverse Kullback–Leibler divergence, and show that our model is able to incorporate information from many measurements, unlike previous approaches.
Title: Mixed noise and posterior estimation with conditional deepGEM. Journal: Machine Learning Science and Technology (impact factor 6.8). Published: 2024-07-01.
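For readers unfamiliar with the two training objectives compared in the conditional deepGEM abstract above, the following is a minimal sketch of the standard forward and reverse Kullback–Leibler losses for a conditional density model. The notation is an assumption made here for illustration and is not taken from the paper: p(x | y) is the true posterior, q_θ(x | y) is the conditional normalizing flow, and p(x, y) = p(y | x) p(x) is the joint of prior and likelihood.

```latex
% Assumed notation: p(x|y) posterior, q_\theta(x|y) conditional flow,
% p(y|x) likelihood (depends on the noise parameters), p(x) prior.
\begin{align}
\text{forward:}\quad
\mathbb{E}_{y}\,\mathrm{KL}\bigl(p(\cdot\mid y)\,\|\,q_\theta(\cdot\mid y)\bigr)
  &= -\,\mathbb{E}_{(x,y)\sim p(x,y)}\bigl[\log q_\theta(x\mid y)\bigr] + \text{const},\\
\text{reverse:}\quad
\mathbb{E}_{y}\,\mathrm{KL}\bigl(q_\theta(\cdot\mid y)\,\|\,p(\cdot\mid y)\bigr)
  &= \mathbb{E}_{y}\,\mathbb{E}_{x\sim q_\theta(\cdot\mid y)}
     \bigl[\log q_\theta(x\mid y) - \log p(y\mid x) - \log p(x)\bigr] + \text{const}.
\end{align}
```

In both lines the constant does not depend on θ. The forward objective only needs samples (x, y) from the joint, while the reverse objective needs samples from the flow and pointwise evaluation of the likelihood and prior, which is where the current noise parameters would enter.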
Pub Date : 2024-06-27 DOI: 10.1088/2632-2153/ad5784
Atsarina Larasati Anindya, Torbjörn Nur Olsson, Maja Jensen, Maria-Jose Garcia-Bonete, Sally P Wheatley, Maria I Bokarewa, Stefano A Mezzasalma and Gergely Katona
In the realm of atomic physics and chemistry, composition emerges as the most powerful means of describing matter. Mendeleev’s periodic table and chemical formulas, while not entirely free from ambiguities, provide robust approximations for comprehending the properties of atoms, chemicals, and their collective behaviours, which stem from the dynamic interplay of their constituents. Our study illustrates that protein-protein interactions follow a similar paradigm, wherein the composition of peptides plays a pivotal role in predicting their interactions with the protein survivin, using an elegantly simple model. An analysis of these predictions within the context of the human proteome not only confirms the known cellular locations of survivin and its interaction partners, but also introduces novel insights into biological functionality. It becomes evident that electrostatic- and primary structure-based descriptions fall short in predictive power, leading us to speculate that protein interactions are orchestrated by the collective dynamics of functional groups.
Title: Deciphering peptide-protein interactions via composition-based prediction: a case study with survivin/BIRC5. Journal: Machine Learning Science and Technology (impact factor 6.8). Published: 2024-06-27.
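To make the idea of a composition-based description in the survivin/BIRC5 abstract above concrete, here is a minimal sketch of one way a peptide could be turned into composition features. It is an illustration only, not the authors' model: the function name `composition`, the choice of residue fractions as features, and the two example peptides are all assumptions made here.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(peptide: str) -> dict[str, float]:
    """Return the fraction of each standard amino acid in a peptide.

    Residue order is deliberately ignored: only the counts matter.
    (Hypothetical helper, not part of any published code.)
    """
    seq = peptide.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("peptide contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Two hypothetical peptides with the same residues in a different order.
a = composition("ACDKK")
b = composition("KDKCA")
```

Because the order of residues is discarded, the two example peptides map to the same feature vector, which is the distinction the abstract draws between composition-based and primary structure-based descriptions.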