Pub Date : 2024-06-02DOI: 10.1088/2632-2153/ad4a1e
Charles Fox, Neil D Tran, F Nikki Nacion, Samiha Sharlin and Tyler R Josephson
Symbolic regression (SR) can generate interpretable, concise expressions that fit a given dataset, allowing for more human understanding of the structure than black-box approaches. The addition of background knowledge (in the form of symbolic mathematical constraints) allows for the generation of expressions that are meaningful with respect to theory while also being consistent with data. We specifically examine the addition of constraints to traditional genetic algorithm (GA) based SR (PySR) as well as a Markov-chain Monte Carlo (MCMC) based Bayesian SR architecture (Bayesian Machine Scientist), and apply these to rediscovering adsorption equations from experimental, historical datasets. We find that, while hard constraints prevent GA and MCMC SR from searching, soft constraints can lead to improved performance both in terms of search effectiveness and model meaningfulness, with computational costs increasing by about an order of magnitude. If the constraints do not correlate well with the dataset or expected models, they can hinder the search of expressions. We find incorporating these constraints in Bayesian SR (as the Bayesian prior) is better than by modifying the fitness function in the GA.
符号回归(SR)可以生成符合给定数据集的可解释的简洁表达式,与黑箱方法相比,它能让人类更好地理解数据结构。增加背景知识(以符号数学约束的形式)可以生成既符合理论又有意义的表达式。我们特别研究了在基于遗传算法(GA)的传统 SR(PySR)和基于马尔可夫链蒙特卡罗(MCMC)的贝叶斯 SR 架构(贝叶斯机器科学家)中添加约束的问题,并将其应用于从实验、历史数据集中重新发现吸附方程。我们发现,虽然硬约束阻碍了 GA 和 MCMC SR 的搜索,但软约束可以在搜索效果和模型意义方面提高性能,计算成本大约增加一个数量级。如果约束条件与数据集或预期模型关联度不高,就会阻碍表达式的搜索。我们发现,将这些约束条件纳入贝叶斯 SR(作为贝叶斯先验)比在 GA 中修改拟合函数效果更好。
{"title":"Incorporating background knowledge in symbolic regression using a computer algebra system","authors":"Charles Fox, Neil D Tran, F Nikki Nacion, Samiha Sharlin and Tyler R Josephson","doi":"10.1088/2632-2153/ad4a1e","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4a1e","url":null,"abstract":"Symbolic regression (SR) can generate interpretable, concise expressions that fit a given dataset, allowing for more human understanding of the structure than black-box approaches. The addition of background knowledge (in the form of symbolic mathematical constraints) allows for the generation of expressions that are meaningful with respect to theory while also being consistent with data. We specifically examine the addition of constraints to traditional genetic algorithm (GA) based SR (PySR) as well as a Markov-chain Monte Carlo (MCMC) based Bayesian SR architecture (Bayesian Machine Scientist), and apply these to rediscovering adsorption equations from experimental, historical datasets. We find that, while hard constraints prevent GA and MCMC SR from searching, soft constraints can lead to improved performance both in terms of search effectiveness and model meaningfulness, with computational costs increasing by about an order of magnitude. If the constraints do not correlate well with the dataset or expected models, they can hinder the search of expressions. We find incorporating these constraints in Bayesian SR (as the Bayesian prior) is better than by modifying the fitness function in the GA.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"26 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141258806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The demand for specialized hardware to train AI models has increased in tandem with the increase in the model complexity over the recent years. Graphics processing unit (GPU) is one such hardware that is capable of parallelizing operations performed on a large chunk of data. Companies like Nvidia, AMD, and Google have been constantly scaling-up the hardware performance as fast as they can. Nevertheless, there is still a gap between the required processing power and processing capacity of the hardware. To increase the hardware utilization, the software has to be optimized too. In this paper, we present some general GPU optimization techniques we used to efficiently train the optiGAN model, a Generative Adversarial Network that is capable of generating multidimensional probability distributions of optical photons at the photodetector face in radiation detectors, on an 8GB Nvidia Quadro RTX 4000 GPU. We analyze and compare the performances of all the optimizations based on the execution time and the memory consumed using the Nvidia Nsight Systems profiler tool. The optimizations gave approximately a 4.5x increase in the runtime performance when compared to a naive training on the GPU, without compromising the model performance. Finally we discuss optiGANs future work and how we are planning to scale the model on GPUs.
{"title":"GPU optimization techniques to accelerate optiGAN-a particle simulation GAN.","authors":"Anirudh Srikanth, Carlotta Trigila, Emilie Roncali","doi":"10.1088/2632-2153/ad51c9","DOIUrl":"10.1088/2632-2153/ad51c9","url":null,"abstract":"<p><p>The demand for specialized hardware to train AI models has increased in tandem with the increase in the model complexity over the recent years. Graphics processing unit (GPU) is one such hardware that is capable of parallelizing operations performed on a large chunk of data. Companies like Nvidia, AMD, and Google have been constantly scaling-up the hardware performance as fast as they can. Nevertheless, there is still a gap between the required processing power and processing capacity of the hardware. To increase the hardware utilization, the software has to be optimized too. In this paper, we present some general GPU optimization techniques we used to efficiently train the optiGAN model, a Generative Adversarial Network that is capable of generating multidimensional probability distributions of optical photons at the photodetector face in radiation detectors, on an 8GB Nvidia Quadro RTX 4000 GPU. We analyze and compare the performances of all the optimizations based on the execution time and the memory consumed using the Nvidia Nsight Systems profiler tool. The optimizations gave approximately a 4.5x increase in the runtime performance when compared to a naive training on the GPU, without compromising the model performance. Finally we discuss optiGANs future work and how we are planning to scale the model on GPUs.</p>","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"5 2","pages":"027001"},"PeriodicalIF":6.8,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11170465/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141331906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-30DOI: 10.1088/2632-2153/ad4e03
Matthew L Olson, Shusen Liu, Jayaraman J Thiagarajan, Bogdan Kustowski, Weng-Keen Wong and Rushil Anirudh
Recent advances in machine learning, specifically transformer architecture, have led to significant advancements in commercial domains. These powerful models have demonstrated superior capability to learn complex relationships and often generalize better to new data and problems. This paper presents a novel transformer-powered approach for enhancing prediction accuracy in multi-modal output scenarios, where sparse experimental data is supplemented with simulation data. The proposed approach integrates transformer-based architecture with a novel graph-based hyper-parameter optimization technique. The resulting system not only effectively reduces simulation bias, but also achieves superior prediction accuracy compared to the prior method. We demonstrate the efficacy of our approach on inertial confinement fusion experiments, where only 10 shots of real-world data are available, as well as synthetic versions of these experiments.
{"title":"Transformer-powered surrogates close the ICF simulation-experiment gap with extremely limited data","authors":"Matthew L Olson, Shusen Liu, Jayaraman J Thiagarajan, Bogdan Kustowski, Weng-Keen Wong and Rushil Anirudh","doi":"10.1088/2632-2153/ad4e03","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4e03","url":null,"abstract":"Recent advances in machine learning, specifically transformer architecture, have led to significant advancements in commercial domains. These powerful models have demonstrated superior capability to learn complex relationships and often generalize better to new data and problems. This paper presents a novel transformer-powered approach for enhancing prediction accuracy in multi-modal output scenarios, where sparse experimental data is supplemented with simulation data. The proposed approach integrates transformer-based architecture with a novel graph-based hyper-parameter optimization technique. The resulting system not only effectively reduces simulation bias, but also achieves superior prediction accuracy compared to the prior method. We demonstrate the efficacy of our approach on inertial confinement fusion experiments, where only 10 shots of real-world data are available, as well as synthetic versions of these experiments.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"31 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141195642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-29DOI: 10.1088/2632-2153/ad4ba5
Kevin Zeng, Carlos E Pérez De Jesús, Andrew J Fox and Michael D Graham
While many phenomena in physics and engineering are formally high-dimensional, their long-time dynamics often live on a lower-dimensional manifold. The present work introduces an autoencoder framework that combines implicit regularization with internal linear layers and L2 regularization (weight decay) to automatically estimate the underlying dimensionality of a data set, produce an orthogonal manifold coordinate system, and provide the mapping functions between the ambient space and manifold space, allowing for out-of-sample projections. We validate our framework’s ability to estimate the manifold dimension for a series of datasets from dynamical systems of varying complexities and compare to other state-of-the-art estimators. We analyze the training dynamics of the network to glean insight into the mechanism of low-rank learning and find that collectively each of the implicit regularizing layers compound the low-rank representation and even self-correct during training. Analysis of gradient descent dynamics for this architecture in the linear case reveals the role of the internal linear layers in leading to faster decay of a ‘collective weight variable’ incorporating all layers, and the role of weight decay in breaking degeneracies and thus driving convergence along directions in which no decay would occur in its absence. We show that this framework can be naturally extended for applications of state-space modeling and forecasting by generating a data-driven dynamic model of a spatiotemporally chaotic partial differential equation using only the manifold coordinates. Finally, we demonstrate that our framework is robust to hyperparameter choices.
{"title":"Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems","authors":"Kevin Zeng, Carlos E Pérez De Jesús, Andrew J Fox and Michael D Graham","doi":"10.1088/2632-2153/ad4ba5","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4ba5","url":null,"abstract":"While many phenomena in physics and engineering are formally high-dimensional, their long-time dynamics often live on a lower-dimensional manifold. The present work introduces an autoencoder framework that combines implicit regularization with internal linear layers and L2 regularization (weight decay) to automatically estimate the underlying dimensionality of a data set, produce an orthogonal manifold coordinate system, and provide the mapping functions between the ambient space and manifold space, allowing for out-of-sample projections. We validate our framework’s ability to estimate the manifold dimension for a series of datasets from dynamical systems of varying complexities and compare to other state-of-the-art estimators. We analyze the training dynamics of the network to glean insight into the mechanism of low-rank learning and find that collectively each of the implicit regularizing layers compound the low-rank representation and even self-correct during training. Analysis of gradient descent dynamics for this architecture in the linear case reveals the role of the internal linear layers in leading to faster decay of a ‘collective weight variable’ incorporating all layers, and the role of weight decay in breaking degeneracies and thus driving convergence along directions in which no decay would occur in its absence. We show that this framework can be naturally extended for applications of state-space modeling and forecasting by generating a data-driven dynamic model of a spatiotemporally chaotic partial differential equation using only the manifold coordinates. Finally, we demonstrate that our framework is robust to hyperparameter choices.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"9 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141195813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-28DOI: 10.1088/2632-2153/ad4d3f
Harvey Cao, Dimitris G Angelakis and Daniel Leykam
Quantum many-body scarred systems contain both thermal and non-thermal scar eigenstates in their spectra. When these systems are quenched from special initial states which share high overlap with scar eigenstates, the system undergoes dynamics with atypically slow relaxation and periodic revival. This scarring phenomenon poses a potential avenue for circumventing decoherence in various quantum engineering applications. Given access to an unknown scar system, current approaches for identification of special states leading to non-thermal dynamics rely on costly measures such as entanglement entropy. In this work, we show how two dimensionality reduction techniques, multidimensional scaling and intrinsic dimension estimation, can be used to learn structural properties of dynamics in the PXP model and distinguish between thermal and scar initial states. The latter method is shown to be robust against limited sample sizes and experimental measurement errors.
{"title":"Unsupervised learning of quantum many-body scars using intrinsic dimension","authors":"Harvey Cao, Dimitris G Angelakis and Daniel Leykam","doi":"10.1088/2632-2153/ad4d3f","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4d3f","url":null,"abstract":"Quantum many-body scarred systems contain both thermal and non-thermal scar eigenstates in their spectra. When these systems are quenched from special initial states which share high overlap with scar eigenstates, the system undergoes dynamics with atypically slow relaxation and periodic revival. This scarring phenomenon poses a potential avenue for circumventing decoherence in various quantum engineering applications. Given access to an unknown scar system, current approaches for identification of special states leading to non-thermal dynamics rely on costly measures such as entanglement entropy. In this work, we show how two dimensionality reduction techniques, multidimensional scaling and intrinsic dimension estimation, can be used to learn structural properties of dynamics in the PXP model and distinguish between thermal and scar initial states. The latter method is shown to be robust against limited sample sizes and experimental measurement errors.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"24 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141168597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-27DOI: 10.1088/2632-2153/ad4ba6
Diogo D Carvalho, Diogo R Ferreira and Luís O Silva
We explore the possibility of fully replacing a plasma physics kinetic simulator with a graph neural network-based simulator. We focus on this class of surrogate models given the similarity between their message-passing update mechanism and the traditional physics solver update, and the possibility of enforcing known physical priors into the graph construction and update. We show that our model learns the kinetic plasma dynamics of the one-dimensional plasma model, a predecessor of contemporary kinetic plasma simulation codes, and recovers a wide range of well-known kinetic plasma processes, including plasma thermalization, electrostatic fluctuations about thermal equilibrium, and the drag on a fast sheet and Landau damping. We compare the performance against the original plasma model in terms of run-time, conservation laws, and temporal evolution of key physical quantities. The limitations of the model are presented and possible directions for higher-dimensional surrogate models for kinetic plasmas are discussed.
{"title":"Learning the dynamics of a one-dimensional plasma model with graph neural networks","authors":"Diogo D Carvalho, Diogo R Ferreira and Luís O Silva","doi":"10.1088/2632-2153/ad4ba6","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4ba6","url":null,"abstract":"We explore the possibility of fully replacing a plasma physics kinetic simulator with a graph neural network-based simulator. We focus on this class of surrogate models given the similarity between their message-passing update mechanism and the traditional physics solver update, and the possibility of enforcing known physical priors into the graph construction and update. We show that our model learns the kinetic plasma dynamics of the one-dimensional plasma model, a predecessor of contemporary kinetic plasma simulation codes, and recovers a wide range of well-known kinetic plasma processes, including plasma thermalization, electrostatic fluctuations about thermal equilibrium, and the drag on a fast sheet and Landau damping. We compare the performance against the original plasma model in terms of run-time, conservation laws, and temporal evolution of key physical quantities. The limitations of the model are presented and possible directions for higher-dimensional surrogate models for kinetic plasmas are discussed.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"9 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141168450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-21DOI: 10.1088/2632-2153/ad43b2
Alexandr Sedykh, Maninadh Podapaka, Asel Sagingalieva, Karan Pinto, Markus Pflitsch and Alexey Melnikov
Finding the distribution of the velocities and pressures of a fluid by solving the Navier–Stokes equations is a principal task in the chemical, energy, and pharmaceutical industries, as well as in mechanical engineering and in design of pipeline systems. With existing solvers, such as OpenFOAM and Ansys, simulations of fluid dynamics in intricate geometries are computationally expensive and require re-simulation whenever the geometric parameters or the initial and boundary conditions are altered. Physics-informed neural networks (PINNs) are a promising tool for simulating fluid flows in complex geometries, as they can adapt to changes in the geometry and mesh definitions, allowing for generalization across fluid parameters and transfer learning across different shapes. We present a hybrid quantum PINN (HQPINN) that simulates laminar fluid flow in 3D Y-shaped mixers. Our approach combines the expressive power of a quantum model with the flexibility of a PINN, resulting in a 21% higher accuracy compared to a purely classical neural network. Our findings highlight the potential of machine learning approaches, and in particular HQPINN, for complex shape optimization tasks in computational fluid dynamics. By improving the accuracy of fluid simulations in complex geometries, our research using hybrid quantum models contributes to the development of more efficient and reliable fluid dynamics solvers.
{"title":"Hybrid quantum physics-informed neural networks for simulating computational fluid dynamics in complex shapes","authors":"Alexandr Sedykh, Maninadh Podapaka, Asel Sagingalieva, Karan Pinto, Markus Pflitsch and Alexey Melnikov","doi":"10.1088/2632-2153/ad43b2","DOIUrl":"https://doi.org/10.1088/2632-2153/ad43b2","url":null,"abstract":"Finding the distribution of the velocities and pressures of a fluid by solving the Navier–Stokes equations is a principal task in the chemical, energy, and pharmaceutical industries, as well as in mechanical engineering and in design of pipeline systems. With existing solvers, such as OpenFOAM and Ansys, simulations of fluid dynamics in intricate geometries are computationally expensive and require re-simulation whenever the geometric parameters or the initial and boundary conditions are altered. Physics-informed neural networks (PINNs) are a promising tool for simulating fluid flows in complex geometries, as they can adapt to changes in the geometry and mesh definitions, allowing for generalization across fluid parameters and transfer learning across different shapes. We present a hybrid quantum PINN (HQPINN) that simulates laminar fluid flow in 3D Y-shaped mixers. Our approach combines the expressive power of a quantum model with the flexibility of a PINN, resulting in a 21% higher accuracy compared to a purely classical neural network. Our findings highlight the potential of machine learning approaches, and in particular HQPINN, for complex shape optimization tasks in computational fluid dynamics. By improving the accuracy of fluid simulations in complex geometries, our research using hybrid quantum models contributes to the development of more efficient and reliable fluid dynamics solvers.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"56 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141146223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-20DOI: 10.1088/2632-2153/ad4a04
Zimu Li, Zihan Pengmei, Han Zheng, Erik Thiede, Junyu Liu and Risi Kondor
Many learning tasks, including learning potential energy surfaces from ab initio calculations, involve global spatial symmetries and permutational symmetry between atoms or general particles. Equivariant graph neural networks are a standard approach to such problems, with one of the most successful methods employing tensor products between various tensors that transform under the spatial group. However, as the number of different tensors and the complexity of relationships between them increase, maintaining parsimony and equivariance becomes increasingly challenging. In this paper, we propose using fusion diagrams, a technique widely employed in simulating SU(2)-symmetric quantum many-body problems, to design new spatial equivariant components for neural networks. This results in a diagrammatic approach to constructing novel neural network architectures. When applied to particles within a given local neighborhood, the resulting components, which we term ‘fusion blocks,’ serve as universal approximators of any continuous equivariant function defined on the neighborhood. We incorporate a fusion block into pre-existing equivariant architectures (Cormorant and MACE), leading to improved performance with fewer parameters on a range of challenging chemical problems. Furthermore, we apply group-equivariant neural networks to study non-adiabatic molecular dynamics of stilbene cis-trans isomerization. Our approach, which combines tensor networks with equivariant neural networks, suggests a potentially fruitful direction for designing more expressive equivariant neural networks.
{"title":"Unifying O(3) equivariant neural networks design with tensor-network formalism","authors":"Zimu Li, Zihan Pengmei, Han Zheng, Erik Thiede, Junyu Liu and Risi Kondor","doi":"10.1088/2632-2153/ad4a04","DOIUrl":"https://doi.org/10.1088/2632-2153/ad4a04","url":null,"abstract":"Many learning tasks, including learning potential energy surfaces from ab initio calculations, involve global spatial symmetries and permutational symmetry between atoms or general particles. Equivariant graph neural networks are a standard approach to such problems, with one of the most successful methods employing tensor products between various tensors that transform under the spatial group. However, as the number of different tensors and the complexity of relationships between them increase, maintaining parsimony and equivariance becomes increasingly challenging. In this paper, we propose using fusion diagrams, a technique widely employed in simulating SU(2)-symmetric quantum many-body problems, to design new spatial equivariant components for neural networks. This results in a diagrammatic approach to constructing novel neural network architectures. When applied to particles within a given local neighborhood, the resulting components, which we term ‘fusion blocks,’ serve as universal approximators of any continuous equivariant function defined on the neighborhood. We incorporate a fusion block into pre-existing equivariant architectures (Cormorant and MACE), leading to improved performance with fewer parameters on a range of challenging chemical problems. Furthermore, we apply group-equivariant neural networks to study non-adiabatic molecular dynamics of stilbene cis-trans isomerization. Our approach, which combines tensor networks with equivariant neural networks, suggests a potentially fruitful direction for designing more expressive equivariant neural networks.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"46 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141146255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-16DOI: 10.1088/2632-2153/ad450e
Johannes Sandberg, Thomas Voigtmann, Emilie Devijver and Noel Jakse
Neural network potentials are a powerful tool for atomistic simulations, allowing to accurately reproduce ab initio potential energy surfaces with computational performance approaching classical force fields. A central component of such potentials is the transformation of atomic positions into a set of atomic features in a most efficient and informative way. In this work, a feature selection method is introduced for high dimensional neural network potentials, based on the adaptive group lasso (AGL) approach. It is shown that the use of an embedded method, taking into account the interplay between features and their action in the estimator, is necessary to optimize the number of features. The method’s efficiency is tested on three different monoatomic systems, including Lennard–Jones as a simple test case, Aluminium as a system characterized by predominantly radial interactions, and Boron as representative of a system with strongly directional components in the interactions. The AGL is compared with unsupervised filter methods and found to perform consistently better in reducing the number of features needed to reproduce the reference simulation data at a similar level of accuracy as the starting feature set. In particular, our results show the importance of taking into account model predictions in feature selection for interatomic potentials.
{"title":"Feature selection for high-dimensional neural network potentials with the adaptive group lasso","authors":"Johannes Sandberg, Thomas Voigtmann, Emilie Devijver and Noel Jakse","doi":"10.1088/2632-2153/ad450e","DOIUrl":"https://doi.org/10.1088/2632-2153/ad450e","url":null,"abstract":"Neural network potentials are a powerful tool for atomistic simulations, allowing to accurately reproduce ab initio potential energy surfaces with computational performance approaching classical force fields. A central component of such potentials is the transformation of atomic positions into a set of atomic features in a most efficient and informative way. In this work, a feature selection method is introduced for high dimensional neural network potentials, based on the adaptive group lasso (AGL) approach. It is shown that the use of an embedded method, taking into account the interplay between features and their action in the estimator, is necessary to optimize the number of features. The method’s efficiency is tested on three different monoatomic systems, including Lennard–Jones as a simple test case, Aluminium as a system characterized by predominantly radial interactions, and Boron as representative of a system with strongly directional components in the interactions. The AGL is compared with unsupervised filter methods and found to perform consistently better in reducing the number of features needed to reproduce the reference simulation data at a similar level of accuracy as the starting feature set. In particular, our results show the importance of taking into account model predictions in feature selection for interatomic potentials.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"51 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141060611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-05-15DOI: 10.1088/2632-2153/ad45b2
Amanda Howard, Yucheng Fu and Panos Stinis
We introduce a novel continual learning method based on multifidelity deep neural networks. This method learns the correlation between the output of previously trained models and the desired output of the model on the current training dataset, limiting catastrophic forgetting. On its own the multifidelity continual learning method shows robust results that limit forgetting across several datasets. Additionally, we show that the multifidelity method can be combined with existing continual learning methods, including replay and memory aware synapses, to further limit catastrophic forgetting. The proposed continual learning method is especially suited for physical problems where the data satisfy the same physical laws on each domain, or for physics-informed neural networks, because in these cases we expect there to be a strong correlation between the output of the previous model and the model on the current training domain.
{"title":"A multifidelity approach to continual learning for physical systems","authors":"Amanda Howard, Yucheng Fu and Panos Stinis","doi":"10.1088/2632-2153/ad45b2","DOIUrl":"https://doi.org/10.1088/2632-2153/ad45b2","url":null,"abstract":"We introduce a novel continual learning method based on multifidelity deep neural networks. This method learns the correlation between the output of previously trained models and the desired output of the model on the current training dataset, limiting catastrophic forgetting. On its own the multifidelity continual learning method shows robust results that limit forgetting across several datasets. Additionally, we show that the multifidelity method can be combined with existing continual learning methods, including replay and memory aware synapses, to further limit catastrophic forgetting. The proposed continual learning method is especially suited for physical problems where the data satisfy the same physical laws on each domain, or for physics-informed neural networks, because in these cases we expect there to be a strong correlation between the output of the previous model and the model on the current training domain.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"67 1","pages":""},"PeriodicalIF":6.8,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141060597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}