
Machine Learning Science and Technology: Latest Publications

Incorporating background knowledge in symbolic regression using a computer algebra system
IF 6.8 | Tier 2 (Physics & Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-02 | DOI: 10.1088/2632-2153/ad4a1e
Charles Fox, Neil D Tran, F Nikki Nacion, Samiha Sharlin and Tyler R Josephson
Symbolic regression (SR) can generate interpretable, concise expressions that fit a given dataset, allowing for more human understanding of the structure than black-box approaches. The addition of background knowledge (in the form of symbolic mathematical constraints) allows for the generation of expressions that are meaningful with respect to theory while also being consistent with data. We specifically examine the addition of constraints to traditional genetic algorithm (GA) based SR (PySR) as well as a Markov-chain Monte Carlo (MCMC) based Bayesian SR architecture (Bayesian Machine Scientist), and apply these to rediscovering adsorption equations from experimental, historical datasets. We find that, while hard constraints prevent GA and MCMC SR from searching, soft constraints can lead to improved performance both in terms of search effectiveness and model meaningfulness, with computational costs increasing by about an order of magnitude. If the constraints do not correlate well with the dataset or expected models, they can hinder the search for expressions. We find that incorporating these constraints in Bayesian SR (as the Bayesian prior) works better than modifying the fitness function in the GA.
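To make the "soft constraint" idea concrete, the following is a minimal sketch, assuming a single pressure variable and a monotonicity constraint on an adsorption isotherm, of how a computer algebra system (sympy) can turn a symbolic background-knowledge constraint into a penalty added to a GA fitness score. The functions `constraint_penalty` and `soft_fitness` are hypothetical names for illustration; this is not the PySR interface or the authors' code.

```python
# Illustrative sketch (not the authors' code): a symbolic constraint
# d(expr)/dP >= 0 is checked with sympy and added to a fitness score as a
# soft penalty, the way a GA-based SR loop could use it.
import numpy as np
import sympy as sp

P = sp.Symbol("P", positive=True)  # pressure

def constraint_penalty(expr, symbol=P, samples=None, weight=10.0):
    """Penalize violations of d(expr)/dP >= 0 (monotonic adsorption isotherm)."""
    if samples is None:
        samples = np.linspace(0.01, 10.0, 50)
    deriv = sp.diff(expr, symbol)
    f = sp.lambdify(symbol, deriv, "numpy")
    vals = np.asarray(f(samples), dtype=float)
    violation = np.clip(-vals, 0.0, None).mean()  # how negative the slope gets
    return weight * violation

def soft_fitness(expr, x_data, y_data):
    """Data error plus constraint penalty -- the 'soft constraint' idea."""
    f = sp.lambdify(P, expr, "numpy")
    mse = float(np.mean((f(x_data) - y_data) ** 2))
    return mse + constraint_penalty(expr)

# Example: a Langmuir-like candidate q = a*P/(1 + b*P) satisfies the constraint.
a, b = 2.0, 0.5
candidate = a * P / (1 + b * P)
x = np.linspace(0.1, 5.0, 40)
y = 2.0 * x / (1 + 0.5 * x)
print(soft_fitness(candidate, x, y))
```

A hard constraint would instead reject any candidate with a nonzero penalty outright, which is the behavior the abstract reports as preventing the GA and MCMC searches from progressing.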
Citations: 0
GPU optimization techniques to accelerate optiGAN-a particle simulation GAN.
IF 6.8 | Tier 2 (Physics & Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-06-01 | Epub Date: 2024-06-13 | DOI: 10.1088/2632-2153/ad51c9
Anirudh Srikanth, Carlotta Trigila, Emilie Roncali

The demand for specialized hardware to train AI models has increased in tandem with model complexity in recent years. The graphics processing unit (GPU) is one such device, capable of parallelizing operations performed on large chunks of data. Companies such as Nvidia, AMD, and Google have been scaling up hardware performance as fast as they can. Nevertheless, a gap remains between the required processing power and the processing capacity of the hardware. To increase hardware utilization, the software has to be optimized as well. In this paper, we present general GPU optimization techniques we used to efficiently train the optiGAN model, a generative adversarial network capable of generating multidimensional probability distributions of optical photons at the photodetector face in radiation detectors, on an 8 GB Nvidia Quadro RTX 4000 GPU. We analyze and compare the performance of all the optimizations in terms of execution time and memory consumption using the Nvidia Nsight Systems profiler tool. The optimizations gave approximately a 4.5x improvement in runtime performance compared to naive training on the GPU, without compromising model performance. Finally, we discuss optiGAN's future work and how we plan to scale the model on GPUs.
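The sketch below shows GPU-side training optimizations of the general kind discussed here (mixed-precision training, pinned-memory data loading, cuDNN autotuning, non-blocking host-to-device copies) applied to a stand-in model. It is not the optiGAN codebase, and the specific optimizations the authors profiled may differ.

```python
# Common GPU training optimizations on a toy model: cuDNN autotuning, pinned
# memory + non-blocking copies, and automatic mixed precision (AMP).
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.backends.cudnn.benchmark = True  # let cuDNN pick fast kernels for fixed shapes
device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 1)).to(device)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

data = TensorDataset(torch.randn(4096, 64), torch.randn(4096, 1))
loader = DataLoader(data, batch_size=512, shuffle=True,
                    pin_memory=(device == "cuda"), num_workers=2)

for x, y in loader:
    # non_blocking=True overlaps host-to-device copies with compute
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    opt.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast(enabled=(device == "cuda")):
        loss = nn.functional.mse_loss(model(x), y)
    scaler.scale(loss).backward()   # scaled loss keeps fp16 gradients from underflowing
    scaler.step(opt)
    scaler.update()
```

Profiling each change with Nvidia Nsight Systems, as the abstract describes, is what separates optimizations that actually move the bottleneck from ones that only look promising.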

Citations: 0
Transformer-powered surrogates close the ICF simulation-experiment gap with extremely limited data
IF 6.8 | Tier 2 (Physics & Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-05-30 | DOI: 10.1088/2632-2153/ad4e03
Matthew L Olson, Shusen Liu, Jayaraman J Thiagarajan, Bogdan Kustowski, Weng-Keen Wong and Rushil Anirudh
Recent advances in machine learning, specifically transformer architecture, have led to significant advancements in commercial domains. These powerful models have demonstrated superior capability to learn complex relationships and often generalize better to new data and problems. This paper presents a novel transformer-powered approach for enhancing prediction accuracy in multi-modal output scenarios, where sparse experimental data is supplemented with simulation data. The proposed approach integrates transformer-based architecture with a novel graph-based hyper-parameter optimization technique. The resulting system not only effectively reduces simulation bias, but also achieves superior prediction accuracy compared to the prior method. We demonstrate the efficacy of our approach on inertial confinement fusion experiments, where only 10 shots of real-world data are available, as well as synthetic versions of these experiments.
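As a rough illustration of what "transformer-powered surrogate" means in practice, here is a bare-bones encoder that treats each scalar design/simulation parameter as a token and regresses a multi-output prediction. The layer sizes and the `TransformerSurrogate` name are illustrative assumptions; the authors' architecture and their graph-based hyper-parameter optimization are not reproduced.

```python
# Minimal transformer-encoder surrogate: parameters in, multi-output prediction out.
import torch
from torch import nn

class TransformerSurrogate(nn.Module):
    def __init__(self, n_inputs=9, n_outputs=3, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(1, d_model)              # each scalar input -> one token
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model * n_inputs, n_outputs)

    def forward(self, x):                               # x: (batch, n_inputs)
        tokens = self.embed(x.unsqueeze(-1))            # (batch, n_inputs, d_model)
        h = self.encoder(tokens)
        return self.head(h.flatten(1))

model = TransformerSurrogate()
params = torch.randn(16, 9)          # e.g. simulation or design parameters
print(model(params).shape)           # torch.Size([16, 3])
```

In the paper's setting, such a surrogate would be pre-trained on abundant simulation data and then corrected with the handful of experimental shots, which is where the bias-reduction claim comes from.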
Citations: 0
Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems
IF 6.8 | Tier 2 (Physics & Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-05-29 | DOI: 10.1088/2632-2153/ad4ba5
Kevin Zeng, Carlos E Pérez De Jesús, Andrew J Fox and Michael D Graham
While many phenomena in physics and engineering are formally high-dimensional, their long-time dynamics often live on a lower-dimensional manifold. The present work introduces an autoencoder framework that combines implicit regularization with internal linear layers and L2 regularization (weight decay) to automatically estimate the underlying dimensionality of a data set, produce an orthogonal manifold coordinate system, and provide the mapping functions between the ambient space and manifold space, allowing for out-of-sample projections. We validate our framework’s ability to estimate the manifold dimension for a series of datasets from dynamical systems of varying complexities and compare to other state-of-the-art estimators. We analyze the training dynamics of the network to glean insight into the mechanism of low-rank learning and find that collectively each of the implicit regularizing layers compound the low-rank representation and even self-correct during training. Analysis of gradient descent dynamics for this architecture in the linear case reveals the role of the internal linear layers in leading to faster decay of a ‘collective weight variable’ incorporating all layers, and the role of weight decay in breaking degeneracies and thus driving convergence along directions in which no decay would occur in its absence. We show that this framework can be naturally extended for applications of state-space modeling and forecasting by generating a data-driven dynamic model of a spatiotemporally chaotic partial differential equation using only the manifold coordinates. Finally, we demonstrate that our framework is robust to hyperparameter choices.
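The sketch below mirrors the structural idea in the abstract: an autoencoder whose bottleneck is followed by extra internal linear layers and trained with weight decay (L2), after which the manifold dimension is read off from the singular-value spectrum of the latent codes. The toy data, layer sizes, and training schedule are assumptions for illustration, not the authors' implementation.

```python
# Autoencoder with internal linear layers + weight decay; the estimated manifold
# dimension shows up as a sharp drop in the singular values of the latent codes.
import torch
from torch import nn

latent = 10  # deliberately larger than the true manifold dimension

encoder = nn.Sequential(nn.Linear(3, 64), nn.GELU(), nn.Linear(64, latent),
                        nn.Linear(latent, latent), nn.Linear(latent, latent))  # internal linear layers
decoder = nn.Sequential(nn.Linear(latent, 64), nn.GELU(), nn.Linear(64, 3))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),
                       lr=1e-3, weight_decay=1e-4)   # weight decay = L2 regularization

# Toy data on a one-dimensional manifold (a circle) embedded in 3D.
t = torch.rand(2048, 1) * 2 * torch.pi
x = torch.cat([torch.cos(t), torch.sin(t), 0 * t], dim=1)

for _ in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(x)), x)
    loss.backward()
    opt.step()

z = encoder(x).detach()
s = torch.linalg.svdvals(z - z.mean(0))
print(s / s.max())   # a sharp drop in this spectrum marks the estimated dimension
```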
Citations: 0
Unsupervised learning of quantum many-body scars using intrinsic dimension
IF 6.8 | Tier 2 (Physics & Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-05-28 | DOI: 10.1088/2632-2153/ad4d3f
Harvey Cao, Dimitris G Angelakis and Daniel Leykam
Quantum many-body scarred systems contain both thermal and non-thermal scar eigenstates in their spectra. When these systems are quenched from special initial states which share high overlap with scar eigenstates, the system undergoes dynamics with atypically slow relaxation and periodic revival. This scarring phenomenon poses a potential avenue for circumventing decoherence in various quantum engineering applications. Given access to an unknown scar system, current approaches for identification of special states leading to non-thermal dynamics rely on costly measures such as entanglement entropy. In this work, we show how two dimensionality reduction techniques, multidimensional scaling and intrinsic dimension estimation, can be used to learn structural properties of dynamics in the PXP model and distinguish between thermal and scar initial states. The latter method is shown to be robust against limited sample sizes and experimental measurement errors.
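The two tools named in the abstract are shown below on toy data: classical multidimensional scaling via scikit-learn and a simple two-nearest-neighbour (TwoNN) intrinsic-dimension estimate. This is a generic illustration of the estimators, with made-up data rather than PXP-model trajectories.

```python
# Multidimensional scaling (sklearn) and a TwoNN intrinsic-dimension estimate
# on toy data lying on a 2D linear subspace of a 10D space.
import numpy as np
from sklearn.manifold import MDS
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 10))   # 2D structure in 10D

# Multidimensional scaling down to two components.
embedding = MDS(n_components=2).fit_transform(X)
print(embedding.shape)                    # (500, 2)

# TwoNN intrinsic dimension: based on the ratio mu = r2/r1 of neighbour distances.
dists, _ = NearestNeighbors(n_neighbors=3).fit(X).kneighbors(X)
mu = dists[:, 2] / dists[:, 1]            # 2nd- to 1st-neighbour distance ratio
d_est = len(mu) / np.sum(np.log(mu))      # maximum-likelihood TwoNN estimate
print(round(d_est, 2))                    # close to 2 for this toy data
```

The appeal for scar detection is that both quantities are computed directly from sampled state vectors, avoiding costlier diagnostics such as entanglement entropy.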
Citations: 0
Learning the dynamics of a one-dimensional plasma model with graph neural networks
IF 6.8 | Tier 2 (Physics & Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-05-27 | DOI: 10.1088/2632-2153/ad4ba6
Diogo D Carvalho, Diogo R Ferreira and Luís O Silva
We explore the possibility of fully replacing a plasma physics kinetic simulator with a graph neural network-based simulator. We focus on this class of surrogate models given the similarity between their message-passing update mechanism and the traditional physics solver update, and the possibility of enforcing known physical priors into the graph construction and update. We show that our model learns the kinetic plasma dynamics of the one-dimensional plasma model, a predecessor of contemporary kinetic plasma simulation codes, and recovers a wide range of well-known kinetic plasma processes, including plasma thermalization, electrostatic fluctuations about thermal equilibrium, and the drag on a fast sheet and Landau damping. We compare the performance against the original plasma model in terms of run-time, conservation laws, and temporal evolution of key physical quantities. The limitations of the model are presented and possible directions for higher-dimensional surrogate models for kinetic plasmas are discussed.
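The message-passing update mechanism the abstract refers to is sketched below in plain PyTorch: edge messages are computed from sender/receiver states, summed onto each receiver, and used for a residual node-state update. The node features and ring connectivity are illustrative assumptions; this is not the authors' plasma surrogate.

```python
# Generic message-passing step: the mechanism GNN-based simulators build on.
import torch
from torch import nn

class MessagePassingStep(nn.Module):
    def __init__(self, node_dim=4, msg_dim=16):
        super().__init__()
        self.edge_mlp = nn.Sequential(nn.Linear(2 * node_dim, msg_dim), nn.ReLU(),
                                      nn.Linear(msg_dim, msg_dim))
        self.node_mlp = nn.Sequential(nn.Linear(node_dim + msg_dim, msg_dim), nn.ReLU(),
                                      nn.Linear(msg_dim, node_dim))

    def forward(self, nodes, senders, receivers):
        # messages computed per edge from sender and receiver states ...
        msgs = self.edge_mlp(torch.cat([nodes[senders], nodes[receivers]], dim=-1))
        # ... summed onto each receiving node ...
        agg = torch.zeros(nodes.size(0), msgs.size(-1)).index_add(0, receivers, msgs)
        # ... and used for a residual update of the node state
        return nodes + self.node_mlp(torch.cat([nodes, agg], dim=-1))

# 8 plasma "sheets" on a periodic 1D grid, each connected to its two neighbours.
idx = torch.arange(8)
senders = idx.repeat_interleave(2)
receivers = torch.stack([(idx - 1) % 8, (idx + 1) % 8], dim=1).flatten()
nodes = torch.randn(8, 4)                     # e.g. position, velocity, field values
print(MessagePassingStep()(nodes, senders, receivers).shape)   # torch.Size([8, 4])
```

Rolling such a step forward in time, with physical priors baked into the graph construction, is what lets the learned update play the role of the traditional solver update.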
Citations: 0
Hybrid quantum physics-informed neural networks for simulating computational fluid dynamics in complex shapes
IF 6.8 | Tier 2 (Physics & Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-05-21 | DOI: 10.1088/2632-2153/ad43b2
Alexandr Sedykh, Maninadh Podapaka, Asel Sagingalieva, Karan Pinto, Markus Pflitsch and Alexey Melnikov
Finding the distribution of the velocities and pressures of a fluid by solving the Navier–Stokes equations is a principal task in the chemical, energy, and pharmaceutical industries, as well as in mechanical engineering and in design of pipeline systems. With existing solvers, such as OpenFOAM and Ansys, simulations of fluid dynamics in intricate geometries are computationally expensive and require re-simulation whenever the geometric parameters or the initial and boundary conditions are altered. Physics-informed neural networks (PINNs) are a promising tool for simulating fluid flows in complex geometries, as they can adapt to changes in the geometry and mesh definitions, allowing for generalization across fluid parameters and transfer learning across different shapes. We present a hybrid quantum PINN (HQPINN) that simulates laminar fluid flow in 3D Y-shaped mixers. Our approach combines the expressive power of a quantum model with the flexibility of a PINN, resulting in a 21% higher accuracy compared to a purely classical neural network. Our findings highlight the potential of machine learning approaches, and in particular HQPINN, for complex shape optimization tasks in computational fluid dynamics. By improving the accuracy of fluid simulations in complex geometries, our research using hybrid quantum models contributes to the development of more efficient and reliable fluid dynamics solvers.
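For readers new to PINNs, the skeleton below shows how PDE residuals enter the loss through automatic differentiation, on a deliberately simple 1D Poisson problem (u'' = -sin(x), u(0) = u(pi) = 0, exact solution sin(x)). The hybrid quantum part of the paper (a parameterized quantum circuit inside the network) and the 3D Navier-Stokes setup are not reproduced here; this is only the classical PINN pattern they build on.

```python
# Classical PINN skeleton: residual + boundary terms computed with autograd.
import torch
from torch import nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 32), nn.Tanh(),
                    nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x = (torch.rand(256, 1) * torch.pi).requires_grad_(True)   # collocation points
xb = torch.tensor([[0.0], [torch.pi]])                      # boundary points

for _ in range(2000):
    opt.zero_grad()
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    residual = d2u + torch.sin(x)                 # enforce u'' + sin(x) = 0
    loss = (residual ** 2).mean() + (net(xb) ** 2).mean()   # PDE + boundary loss
    loss.backward()
    opt.step()

print(net(torch.tensor([[torch.pi / 2]])))        # should approach sin(pi/2) = 1
```

In the hybrid architecture, part of this network is replaced by a parameterized quantum circuit, which is where the reported 21% accuracy gain over the purely classical baseline comes from.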
Citations: 0
Unifying O(3) equivariant neural networks design with tensor-network formalism
IF 6.8 | Tier 2 (Physics & Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-05-20 | DOI: 10.1088/2632-2153/ad4a04
Zimu Li, Zihan Pengmei, Han Zheng, Erik Thiede, Junyu Liu and Risi Kondor
Many learning tasks, including learning potential energy surfaces from ab initio calculations, involve global spatial symmetries and permutational symmetry between atoms or general particles. Equivariant graph neural networks are a standard approach to such problems, with one of the most successful methods employing tensor products between various tensors that transform under the spatial group. However, as the number of different tensors and the complexity of relationships between them increase, maintaining parsimony and equivariance becomes increasingly challenging. In this paper, we propose using fusion diagrams, a technique widely employed in simulating SU(2)-symmetric quantum many-body problems, to design new spatial equivariant components for neural networks. This results in a diagrammatic approach to constructing novel neural network architectures. When applied to particles within a given local neighborhood, the resulting components, which we term ‘fusion blocks,’ serve as universal approximators of any continuous equivariant function defined on the neighborhood. We incorporate a fusion block into pre-existing equivariant architectures (Cormorant and MACE), leading to improved performance with fewer parameters on a range of challenging chemical problems. Furthermore, we apply group-equivariant neural networks to study non-adiabatic molecular dynamics of stilbene cis-trans isomerization. Our approach, which combines tensor networks with equivariant neural networks, suggests a potentially fruitful direction for designing more expressive equivariant neural networks.
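The elementary operation that fusion diagrams compose is the coupling of two spherical tensors into an irreducible output channel via Clebsch-Gordan coefficients; the snippet below shows that operation alone, using sympy's CG coefficients, with two l=1 vectors coupled to L=0 (a rotation invariant) and L=2. It is only a numerical illustration of the building block, not the Cormorant/MACE fusion blocks or the authors' code.

```python
# Coupling two spherical tensors with Clebsch-Gordan coefficients -- the basic
# contraction that fusion diagrams and equivariant tensor products are made of.
import numpy as np
from sympy.physics.quantum.cg import CG

def couple(u, v, l1, l2, L):
    """Couple spherical-tensor components u (2*l1+1,) and v (2*l2+1,) to rank L."""
    out = np.zeros(2 * L + 1)
    for M in range(-L, L + 1):
        for m1 in range(-l1, l1 + 1):
            m2 = M - m1
            if abs(m2) <= l2:
                c = float(CG(l1, m1, l2, m2, L, M).doit())
                out[M + L] += c * u[m1 + l1] * v[m2 + l2]
    return out

# Random l=1 tensors (components ordered m = -1, 0, +1).
rng = np.random.default_rng(1)
u, v = rng.normal(size=3), rng.normal(size=3)
print(couple(u, v, 1, 1, 0))   # single L=0 component: a rotation invariant
print(couple(u, v, 1, 1, 2))   # five L=2 components: a rank-2 spherical tensor
```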
Citations: 0
Feature selection for high-dimensional neural network potentials with the adaptive group lasso
IF 6.8 | Tier 2 (Physics & Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-05-16 | DOI: 10.1088/2632-2153/ad450e
Johannes Sandberg, Thomas Voigtmann, Emilie Devijver and Noel Jakse
Neural network potentials are a powerful tool for atomistic simulations, allowing ab initio potential energy surfaces to be reproduced accurately with computational performance approaching that of classical force fields. A central component of such potentials is the transformation of atomic positions into a set of atomic features in a most efficient and informative way. In this work, a feature selection method is introduced for high-dimensional neural network potentials, based on the adaptive group lasso (AGL) approach. It is shown that the use of an embedded method, taking into account the interplay between features and their action in the estimator, is necessary to optimize the number of features. The method's efficiency is tested on three different monoatomic systems, including Lennard–Jones as a simple test case, aluminium as a system characterized by predominantly radial interactions, and boron as representative of a system with strongly directional components in the interactions. The AGL is compared with unsupervised filter methods and found to perform consistently better in reducing the number of features needed to reproduce the reference simulation data at a similar level of accuracy to the starting feature set. In particular, our results show the importance of taking model predictions into account in feature selection for interatomic potentials.
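The core of the adaptive group lasso idea can be sketched on a small regression network: each input feature's first-layer weight column forms one group, groups are penalized by their L2 norm, and the penalties are reweighted by an unpenalized pilot fit so that unimportant features are pushed toward zero. The toy data and the `fit` helper are illustrative assumptions, not the authors' NN-potential code.

```python
# Adaptive group lasso on the first-layer weight columns of a small network.
import torch
from torch import nn

def fit(x, y, lam=0.0, adaptive_w=None, steps=3000):
    net = nn.Sequential(nn.Linear(x.size(1), 32), nn.Tanh(), nn.Linear(32, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), y)
        if lam > 0:
            group_norms = net[0].weight.norm(dim=0)          # one norm per input feature
            w = adaptive_w if adaptive_w is not None else torch.ones_like(group_norms)
            loss = loss + lam * (w * group_norms).sum()      # group-lasso penalty
        loss.backward()
        opt.step()
    return net

# Toy data: only 2 of the 6 candidate features matter.
x = torch.randn(1024, 6)
y = torch.sin(x[:, :1]) + 0.5 * x[:, 1:2] ** 2

pilot = fit(x, y)                                            # unpenalized pilot fit
adaptive_w = (1.0 / (pilot[0].weight.norm(dim=0) + 1e-6)).detach()  # adaptive weights
net = fit(x, y, lam=1e-2, adaptive_w=adaptive_w)
print(net[0].weight.norm(dim=0))   # small for the four irrelevant features
```

The "embedded" character the abstract emphasizes is visible here: the penalty acts inside the same optimization as the fit, so feature relevance is judged by its effect on the estimator rather than by a separate filter statistic.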
Citations: 0
A multifidelity approach to continual learning for physical systems
IF 6.8 | Tier 2 (Physics & Astronomy) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-05-15 | DOI: 10.1088/2632-2153/ad45b2
Amanda Howard, Yucheng Fu and Panos Stinis
We introduce a novel continual learning method based on multifidelity deep neural networks. This method learns the correlation between the output of previously trained models and the desired output of the model on the current training dataset, limiting catastrophic forgetting. On its own the multifidelity continual learning method shows robust results that limit forgetting across several datasets. Additionally, we show that the multifidelity method can be combined with existing continual learning methods, including replay and memory aware synapses, to further limit catastrophic forgetting. The proposed continual learning method is especially suited for physical problems where the data satisfy the same physical laws on each domain, or for physics-informed neural networks, because in these cases we expect there to be a strong correlation between the output of the previous model and the model on the current training domain.
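A minimal sketch of the multifidelity idea follows: the previously trained model is frozen (so nothing it learned can be overwritten) and a small correction network learns the map from its output to the new domain's targets. The specific networks, data, and training loop are assumptions for illustration; the authors' exact formulation and its combinations with replay or memory-aware synapses are not reproduced.

```python
# Multifidelity continual learning, minimally: frozen previous model + trained correction.
import torch
from torch import nn

# "Previous" model, trained on an earlier domain (here just a fixed stand-in).
prev_model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
for p in prev_model.parameters():
    p.requires_grad_(False)                      # frozen: no catastrophic forgetting

# Correction network for the new domain, conditioned on the previous output.
corr = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(corr.parameters(), lr=1e-3)

x_new = torch.rand(512, 1) * 2 - 1
y_new = torch.sin(3 * x_new)                     # the new domain's ground truth

for _ in range(2000):
    opt.zero_grad()
    low_fid = prev_model(x_new)                  # low-fidelity prediction
    pred = low_fid + corr(torch.cat([x_new, low_fid], dim=1))
    loss = nn.functional.mse_loss(pred, y_new)
    loss.backward()
    opt.step()

print(loss.item())                               # error on the new domain only
```

Because the correction is learned on top of the frozen model's output, shared physics between domains is reused, which is why the approach suits problems governed by the same physical laws across tasks.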
Citations: 0