Pub Date: 2024-02-13 | DOI: 10.1088/2632-2153/ad2493
A. Milder, A. S. Joglekar, W. Rozmus, D. H. Froula
Parameter estimation using observables is a fundamental concept in the experimental sciences. Mathematical models that represent the physical processes can enable reconstructions of the experimental observables and greatly assist in parameter estimation by turning it into an optimization problem that can be solved by gradient-free or gradient-based methods. In this work, the recent rise in flexible frameworks for developing differentiable scientific computing programs is leveraged to dramatically accelerate data analysis of Thomson scattering, a common experimental diagnostic relevant to laser–plasma and inertial fusion experiments. A differentiable Thomson-scattering data analysis tool is developed that uses reverse-mode automatic differentiation (AD) to calculate gradients. By switching from finite differencing to reverse-mode AD, three distinct outcomes are achieved. First, gradient descent is accelerated dramatically, to the extent that it enables near real-time usage in laser–plasma experiments. Second, qualitatively novel quantities that require O(10^3) parameters can now be included in the analysis of data, which enables unprecedented measurements of small-scale laser–plasma phenomena. Third, uncertainty estimation approaches that leverage the value of the Hessian become accurate and efficient because reverse-mode AD can be used for calculating the Hessian.
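The cost argument behind the switch from finite differencing to reverse-mode AD can be illustrated with a small, self-contained sketch. It is not the authors' tool: the `Var` class, the toy `loss`, and the parameter values are all hypothetical, chosen only to show that forward finite differences need one loss evaluation per parameter plus one, while reverse-mode AD recovers the full gradient from a single forward pass and a single backward sweep.

```python
class Var:
    """Scalar node in a computation graph (toy reverse-mode AD, hypothetical)."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        self.parents = parents  # pairs of (parent node, d(self)/d(parent))
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__


def backward(out):
    """Accumulate d(out)/d(node) into node.grad with one reverse sweep."""
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)  # parents are appended before their children

    visit(out)
    out.grad = 1.0
    for node in reversed(order):  # children are processed before parents
        for parent, local in node.parents:
            parent.grad += local * node.grad


def loss(params, targets):
    """Toy non-separable misfit: sum_k (p_k * p_{k+1} - t_k)^2 (cyclic index)."""
    n = len(params)
    total = 0.0
    for k in range(n):
        residual = params[k] * params[(k + 1) % n] + (-targets[k])
        total = total + residual * residual
    return total


def fd_gradient(f, params, h=1e-6):
    """Forward finite differences: len(params) + 1 evaluations of f."""
    base, calls, grad = f(params), 1, []
    for k in range(len(params)):
        shifted = list(params)
        shifted[k] += h
        grad.append((f(shifted) - base) / h)
        calls += 1
    return grad, calls


def ad_gradient(f, params):
    """Reverse-mode AD: one forward evaluation of f, one backward sweep."""
    nodes = [Var(p) for p in params]
    backward(f(nodes))
    return [node.grad for node in nodes]


params = [1.0 + 0.1 * k for k in range(8)]
targets = [0.5 * k for k in range(8)]
grad_fd, n_calls = fd_gradient(lambda p: loss(p, targets), params)
grad_ad = ad_gradient(lambda p: loss(p, targets), params)
max_diff = max(abs(a - b) for a, b in zip(grad_fd, grad_ad))
print("finite-difference loss evaluations:", n_calls)
print("largest gradient discrepancy:", max_diff)
```

With O(10^3) parameters, the finite-difference route would need on the order of a thousand model evaluations per gradient, whereas the reverse sweep stays at a small constant multiple of one evaluation. In practice a differentiable-programming framework would take the place of the hand-written `Var` class.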
Title: Qualitative and quantitative enhancement of parameter estimation for model-based diagnostics using automatic differentiation with an application to inertial fusion
Journal: Machine Learning: Science and Technology
Pub Date: 2024-02-09 | DOI: 10.1088/2632-2153/ad27e1
Viki Kumar Prasad, Alberto Otero-de-la-Roza, G. Dilabio
Supervised deep learning (DL) models are becoming ubiquitous in computational chemistry because they can efficiently learn complex input-output relationships and predict chemical properties at a cost significantly lower than methods based on quantum mechanics. The central challenge in many deep learning applications is the need to invest considerable computational resources in generating large (N > 1e5) training sets such that the resulting DL model can be generalized reliably to unseen systems. The lack of better alternatives has encouraged the use of low-cost and relatively inaccurate density-functional theory (DFT) methods to generate training data, leading to DL models that lack accuracy and reliability. In this article, we describe a robust and easily implemented approach based on property-specific atom-centered potentials (ACPs) that resolves this central challenge in DL model development. ACPs are one-electron potentials that are applied in combination with a cheap but inaccurate quantum mechanical method (e.g. double-ζ DFT) and fitted against relatively few high-level data (N ≈ 1e3 to 1e4), possibly obtained from the literature. The resulting ACP-corrected methods retain the low cost of the double-ζ DFT approach, while generating high-level-quality data in unseen systems for the specific property for which they were designed. With this approach, we demonstrate that ACPs can be used as an intermediate method between high-level approaches and DL model development, enabling the calculation of large and accurate DL training sets for the chemical property of interest. We demonstrate the effectiveness of the proposed approach by predicting bond dissociation enthalpies, reaction barrier heights, and reaction energies with chemical accuracy at a computational cost lower than the DFT methods routinely used for DL training data set generation.
Title: Bridging the Gap Between High-Level Quantum Chemical Methods and Deep Learning Models
Journal: Machine Learning: Science and Technology
Pub Date: 2024-02-05 | DOI: 10.1088/2632-2153/ad2629
Alex Gabel, Rick Quax, E. Gavves
Symmetry detection, the task of discovering the underlying symmetries of a given dataset, has been gaining popularity in the machine learning community, particularly in science and engineering applications. Most previous works focus on detecting "canonical" symmetries such as translation, scaling, and rotation, and cast the task as a modeling problem involving complex inductive biases and architecture design of neural networks. We challenge these assumptions and propose that instead of constructing biases, we can learn to detect symmetries from raw data without prior knowledge. The approach presented in this paper provides a flexible way to scale up the detection procedure to non-canonical symmetries, and has the potential to detect both known and unknown symmetries alike. Concretely, we focus on predicting the generators of Lie point symmetries of PDEs, more specifically, evolutionary equations for ease of data generation. Our results demonstrate that well-established neural network architectures are capable of recognizing symmetry generators, even in unseen dynamical systems. These findings have the potential to make non-canonical symmetries more accessible to applications, including model selection, sparse identification, and data interpretability.
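As a concrete reminder of what a Lie point symmetry generator encodes, consider the one-dimensional heat equation u_t = u_xx, a standard evolutionary equation. Its scaling generator x∂_x + 2t∂_t integrates to the transformation (x, t) → (λx, λ²t), which maps solutions to solutions. The sketch below is not taken from the paper: the heat kernel, the finite-difference residual, the sample point, and the value of `lam` are illustrative assumptions used only to check this property numerically.

```python
import math


def heat_kernel(x, t):
    """Fundamental solution of u_t = u_xx for t > 0."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def pde_residual(u, x, t, h=1e-3):
    """Central-difference estimate of u_t - u_xx at the point (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - u_xx


lam = 1.7  # hypothetical scaling factor


def scaled(x, t):
    """Solution transformed by the scaling symmetry (x, t) -> (lam*x, lam^2*t)."""
    return heat_kernel(lam * x, lam ** 2 * t)


def mis_scaled(x, t):
    """Same idea with the wrong time exponent; this is not a symmetry."""
    return heat_kernel(lam * x, lam * t)


res_original = pde_residual(heat_kernel, 0.5, 1.0)
res_symmetry = pde_residual(scaled, 0.5, 1.0)
res_broken = pde_residual(mis_scaled, 0.5, 1.0)
print("original solution residual:   ", res_original)
print("scaling-symmetry residual:    ", res_symmetry)
print("non-symmetric scaling residual:", res_broken)
```

The first two residuals vanish up to discretization error, while the mis-scaled function leaves a clearly nonzero residual. A symmetry detector in the sense of the abstract would have to recover the generator itself from data rather than test a known candidate as done here.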
Title: Data-driven Lie Point Symmetry Detection for Continuous Dynamical Systems
Journal: Machine Learning: Science and Technology
Pub Date: 2024-02-05 | DOI: 10.1088/2632-2153/ad262a
Chad Bustard, John Wu
The coarse-grained propagation of Galactic cosmic rays (CRs) is traditionally constrained by phenomenological models of Milky Way CR propagation fit to a variety of direct and indirect observables; however, constraining the fine-grained transport of CRs along individual magnetic field lines (for instance, diffusive vs. streaming transport models) is an unsolved challenge. Leveraging a recent training set of magnetohydrodynamic turbulent box simulations, with CRs spanning a range of transport parameters, we use convolutional neural networks (CNNs) trained solely on gas density maps to classify CR transport regimes. We find that even relatively simple CNNs can quite effectively classify density slices to corresponding CR transport parameters, distinguishing between streaming and diffusive transport, as well as magnitude of diffusivity, with class accuracies between 92% and 99%. As we show, the transport-dependent imprints that CRs leave on the gas are not all tied to the resulting density power spectra: classification accuracies are still high even when image spectra are flattened (85% to 98% accuracy), highlighting CR transport-dependent changes to turbulent phase information. We interpret our results with saliency maps and image modifications, and we discuss physical insights and future applications.
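One way to read "flattened" image spectra is spectral whitening: every Fourier coefficient is rescaled to unit magnitude while its phase is kept, so that only phase information distinguishes one image from another. Whether this matches the paper's exact procedure is an assumption. The sketch below uses a naive pure-Python transform on a tiny synthetic grid; `dft2`, `flatten_spectrum`, and the test image are hypothetical names and data, and a real pipeline would use an FFT library instead.

```python
import cmath
import math


def dft2(grid, inverse=False):
    """Naive 2D discrete Fourier transform; suitable for tiny grids only."""
    n, m = len(grid), len(grid[0])
    sign = 1.0 if inverse else -1.0
    out = []
    for u in range(n):
        row = []
        for v in range(m):
            acc = 0j
            for x in range(n):
                for y in range(m):
                    angle = 2.0 * math.pi * (u * x / n + v * y / m)
                    acc += grid[x][y] * cmath.exp(sign * 1j * angle)
            row.append(acc / (n * m) if inverse else acc)
        out.append(row)
    return out


def flatten_spectrum(image, eps=1e-12):
    """Set every Fourier amplitude to 1 while keeping the phase."""
    spectrum = dft2(image)
    unit = [[c / abs(c) if abs(c) > eps else 0j for c in row]
            for row in spectrum]
    restored = dft2(unit, inverse=True)
    # A real input has a Hermitian spectrum, so the imaginary part is round-off.
    return [[z.real for z in row] for row in restored]


# Synthetic 6x6 stand-in for a density slice (not simulation data).
image = [[math.sin(0.7 * x + 1.3 * y) + 0.1 * x * y for y in range(6)]
         for x in range(6)]
whitened = flatten_spectrum(image)

original_spectrum = [c for row in dft2(image) for c in row]
whitened_spectrum = [c for row in dft2(whitened) for c in row]
magnitudes = [abs(c) for c in whitened_spectrum]
phase_error = max(abs(w - c / abs(c))
                  for w, c in zip(whitened_spectrum, original_spectrum))
print("amplitude range after flattening:", min(magnitudes), max(magnitudes))
print("largest phase mismatch:", phase_error)
```

After this operation two images with different power spectra become indistinguishable in amplitude, so any remaining classification signal has to come from phase structure, which is the point the abstract makes.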
Title: Deep Learning Cosmic Ray Transport from Density Maps of Simulated, Turbulent Gas
Journal: Machine Learning: Science and Technology
Pub Date: 2024-02-05 | DOI: 10.1088/2632-2153/ad2626
Dayou Yu, Deep Shankar Pandey, J. Hinz, D. Mihaylov, Valentin V. Karasiev, Suxing Hu, Qi Yu
In this paper, we aim to explore novel machine learning (ML) techniques to facilitate and accelerate the construction of universal Equation-Of-State (EOS) models with a high accuracy while ensuring important thermodynamic consistency. When applying ML to fit a universal EOS model, there are two key requirements: (1) a high prediction accuracy to ensure precise estimation of relevant physics properties and (2) physical interpretability to support important physics-related downstream applications. We first identify a set of fundamental challenges from the accuracy perspective, including an extremely wide range of input/output space and highly sparse training data. We demonstrate that while a neural network (NN) model may fit the EOS data well, the black-box nature makes it difficult to provide physically interpretable results, leading to weak accountability of prediction results outside the training range and lack of guarantee to meet important thermodynamic consistency constraints. To this end, we propose a principled deep regression model that can be trained following a meta-learning style to predict the desired quantities with a high accuracy using scarce training data. We further introduce a uniquely designed kernel-based regularizer for accurate uncertainty quantification. An ensemble technique is leveraged to battle model overfitting with improved prediction stability. Auto-differentiation is conducted to verify that necessary thermodynamic consistency conditions are maintained. Our evaluation results show an excellent fit of the EOS table and the predicted values are ready to use for important physics-related tasks.
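The abstract does not state which consistency conditions are checked. A commonly used one, for an EOS given as pressure P(ρ, T) and specific internal energy E(ρ, T), is P = T (∂P/∂T)_ρ + ρ² (∂E/∂ρ)_T, which follows from both quantities deriving from a single free energy. The sketch below evaluates the residual of this identity for a van der Waals gas. It is an illustration rather than the paper's model: central finite differences are swapped in for auto-differentiation to keep the example dependency-free, and the constants `R`, `A`, `B`, and `CV` are hypothetical values in arbitrary units.

```python
# Hypothetical van der Waals constants (arbitrary units).
R, A, B, CV = 1.0, 0.5, 0.1, 1.5


def pressure(rho, temp):
    """Van der Waals pressure as a function of density and temperature."""
    return rho * R * temp / (1.0 - B * rho) - A * rho ** 2


def energy(rho, temp):
    """Matching specific internal energy."""
    return CV * temp - A * rho


def energy_inconsistent(rho, temp):
    """Same form with the wrong sign on the interaction term."""
    return CV * temp + A * rho


def consistency_residual(p_fn, e_fn, rho, temp, h=1e-5):
    """Residual of P - T*(dP/dT)_rho - rho^2*(dE/drho)_T via central differences."""
    dp_dt = (p_fn(rho, temp + h) - p_fn(rho, temp - h)) / (2.0 * h)
    de_drho = (e_fn(rho + h, temp) - e_fn(rho - h, temp)) / (2.0 * h)
    return p_fn(rho, temp) - temp * dp_dt - rho ** 2 * de_drho


res_good = consistency_residual(pressure, energy, rho=2.0, temp=3.0)
res_bad = consistency_residual(pressure, energy_inconsistent, rho=2.0, temp=3.0)
print("consistent pair residual:  ", res_good)
print("inconsistent pair residual:", res_bad)
```

For the consistent pair the residual vanishes up to round-off, while the sign error leaves a residual of -2·A·ρ². With a learned model, the two derivatives would come from auto-differentiation of the network outputs rather than from finite differences.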
Title: Deep Energy-Pressure Regression for a Thermodynamically Consistent EOS Model. Journal: Machine Learning: Science and Technology. URL: https://doi.org/10.1088/2632-2153/ad2626
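The abstract of this paper mentions using auto-differentiation to verify thermodynamic consistency. The sketch below illustrates the idea on a toy problem and is not the authors' implementation. It assumes a van der Waals equation of state in place of the learned model, hypothetical constants `R`, `A`, `B`, `CV`, and one commonly used consistency identity, P = T (∂P/∂T)_ρ + ρ² (∂e/∂ρ)_T. The abstract does not say which conditions the authors test. The derivatives come from a small forward-mode `Dual` number class written for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Value plus derivative with respect to one chosen input (forward-mode AD)."""
    val: float
    dot: float = 0.0

    @staticmethod
    def lift(x):
        # Wrap plain numbers as constants (derivative 0).
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        return Dual(self.val / o.val,
                    (self.dot * o.val - self.val * o.dot) / (o.val * o.val))


# Hypothetical toy constants for a van der Waals fluid (not from the paper).
R, A, B, CV = 1.0, 0.5, 0.1, 1.5


def pressure(rho, T):
    """Toy pressure model P(rho, T)."""
    return rho * R * T / (1.0 - B * rho) - A * rho * rho


def energy(rho, T):
    """Toy specific internal energy e(rho, T)."""
    return CV * T - A * rho


def consistency_residual(rho, T, pressure_fn=pressure, energy_fn=energy):
    """Return P - [T * dP/dT + rho^2 * de/drho]; zero for a consistent model."""
    # Seed the derivative on T to get (dP/dT) at fixed rho.
    dP_dT = Dual.lift(pressure_fn(Dual(rho), Dual(T, 1.0))).dot
    # Seed the derivative on rho to get (de/drho) at fixed T.
    de_drho = Dual.lift(energy_fn(Dual(rho, 1.0), Dual(T))).dot
    return pressure_fn(rho, T) - (T * dP_dT + rho ** 2 * de_drho)
```

Calling `consistency_residual(0.8, 2.0)` on the toy model gives a residual at floating-point round-off level. Replacing `energy_fn` with a model that ignores density gives a clearly nonzero residual, which is the kind of violation such a check is meant to expose.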
Pub Date: 2024-01-18 DOI: 10.1088/2632-2153/ad200c
Muhammed Enes Ozelbas, E. Tülay, Serhat Ozekes
Motor Imagery Brain-Computer Interfaces (MI-BCIs) have attracted considerable attention in recent years because of their potential to enhance rehabilitation and the control of prosthetic devices for individuals with motor disabilities. However, accurate classification of motor imagery signals remains challenging because of the high inter-subject variability and non-stationarity of electroencephalogram (EEG) data. Acquiring EEG data for MI-BCIs can also be difficult, which limits data availability. In this study, several data augmentation techniques are compared with the proposed data augmentation technique, Adaptive Cross-Subject Segment Replacement (ACSSR). In conjunction with the proposed deep learning framework, this technique allows similar subject pairs to benefit from one another and boosts the classification performance of MI-BCIs. The proposed framework features a multi-domain feature extractor based on Common Spatial Patterns (CSP) with a sliding window and a parallel two-branch Convolutional Neural Network (CNN). The performance of the proposed methodology was evaluated on the multi-class BCI Competition IV Dataset 2a through repeated 10-fold cross-validation. Experimental results indicate that applying the ACSSR method in the proposed framework (80.46%) considerably improves classification performance compared with classification without data augmentation (77.63%) and with other fundamental data augmentation techniques used in the literature. The study contributes to the development of effective MI-BCIs by demonstrating the ability of the ACSSR method to address the challenges of motor imagery signal classification.
Title: Improving Cross-Subject Classification Performance of Motor Imagery Signals: A Data Augmentation-focused Deep Learning Framework. Journal: Machine Learning: Science and Technology. URL: https://doi.org/10.1088/2632-2153/ad200c
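The abstract describes augmenting one subject's EEG trials with segments from a similar subject, but it does not give the details of ACSSR. The sketch below therefore shows only a generic segment-replacement augmentation and should not be read as the published procedure. It assumes that the two trials share the same class label and shape, that the subject pair has already been selected, and that whole time segments are swapped with a fixed probability. The function name `replace_segments` and all of its parameters are hypothetical.

```python
import random


def replace_segments(trial_a, trial_b, n_segments, swap_prob, rng):
    """Return a copy of trial_a with some time segments taken from trial_b.

    trial_a, trial_b: lists of channels, each a list of samples, with the
    same shape and the same class label. Hypothetical helper for illustration;
    not the published ACSSR routine.
    """
    n_samples = len(trial_a[0])
    if len(trial_b) != len(trial_a) or len(trial_b[0]) != n_samples:
        raise ValueError("trials must share the same channel/sample layout")
    # Split the time axis into n_segments nearly equal parts.
    bounds = [round(i * n_samples / n_segments) for i in range(n_segments + 1)]
    augmented = [list(channel) for channel in trial_a]  # copy, leave input intact
    for start, stop in zip(bounds[:-1], bounds[1:]):
        if rng.random() < swap_prob:
            # Replace this segment on every channel so channels stay aligned.
            for channel, donor in zip(augmented, trial_b):
                channel[start:stop] = donor[start:stop]
    return augmented


# Toy usage: two 2-channel trials of 8 samples from a hypothetical subject pair.
subject_a_trial = [[0.0] * 8, [0.0] * 8]
subject_b_trial = [[1.0] * 8, [1.0] * 8]
example = replace_segments(subject_a_trial, subject_b_trial,
                           n_segments=4, swap_prob=0.5, rng=random.Random(0))
```

Swapping the same time window on all channels keeps the spatial relationship between channels intact within each segment, which matters for spatial-filter features such as CSP. How the authors choose segments and subject pairs adaptively is not stated in the abstract and is left out here.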