Pub Date : 2024-07-15 | DOI: 10.1088/2632-2153/ad638f
Edoardo Fazzari, H. Loughlin, Chris Stoughton
This study applies a Reinforcement Learning (RL) methodology to a control system. Using the Pound-Drever-Hall locking scheme, we match the wavelength of a controlled laser to the length of a Fabry-Pérot cavity such that the cavity length is an exact integer multiple of the laser wavelength. Typically, long-term drift of the cavity length and laser wavelength exceeds the dynamic range of this control if only the laser's piezoelectric transducer is actuated, so the same error signal also controls the temperature of the laser crystal. In this work, we instead implement this feedback control grounded in Q-learning. Our system learns in real time, without relying on historical data, and adapts to system variations after training because the learning agent is continuously updated. This approach maintains lock for eight days on average.
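As a toy illustration of the control idea only (the state discretization, drift model, and reward below are invented here, not the authors' experimental setup), a tabular Q-learning agent can learn to counteract a slow drift by stepping an actuator:

```python
import random

# Toy sketch: a tabular Q-learning agent nudges a (hypothetical) temperature
# offset to keep a slowly drifting error signal near zero.
# States discretize the error; actions are {-1, 0, +1} actuator steps.
N_STATES, ACTIONS = 11, (-1, 0, 1)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

def discretize(error, lo=-5.0, hi=5.0):
    """Map a continuous error onto one of N_STATES bins (clamped at the ends)."""
    error = max(lo, min(hi, error))
    return int((error - lo) / (hi - lo) * (N_STATES - 1))

Q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]

def step(error, action):
    """Hypothetical plant: the action shifts the error; a constant drift adds bias."""
    return error + action + 0.2

random.seed(0)
error = 3.0
for _ in range(2000):
    s = discretize(error)
    # epsilon-greedy action selection
    if random.random() < EPS:
        a = random.randrange(len(ACTIONS))
    else:
        a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
    error = step(error, ACTIONS[a])
    r = -abs(error)                      # reward: stay near zero error
    s2 = discretize(error)
    # standard Q-learning update
    Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
```

Because the update runs on every control step, the agent keeps adapting after deployment, which mirrors the continuous-update property the abstract highlights.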
Title: Controlling optical-cavity locking using reinforcement learning
Pub Date : 2024-07-12 | DOI: 10.1088/2632-2153/ad62ad
Maximilian P. Niroomand, L. Dicks, Edward Pyzer-Knapp, David J. Wales
Prior beliefs about the latent function, which shape inductive biases, can be incorporated into a Gaussian Process (GP) via the kernel. However, beyond kernel choices, the decision-making process of GP models remains poorly understood. In this work, we contribute an analysis of the loss landscape for GP models using methods from chemical physics. We demonstrate $\nu$-continuity for Matérn kernels and outline aspects of catastrophe theory at critical points in the loss landscape. By directly including $\nu$ in the hyperparameter optimisation for Matérn kernels, we find that typical values of $\nu$ can be far from optimal in terms of performance. We also provide an a priori method for evaluating the effect of GP ensembles and discuss various voting approaches based on physical properties of the loss landscape. The utility of these approaches is demonstrated for various synthetic and real datasets. Our findings provide insight into hyperparameter optimisation for GPs and offer practical guidance for improving their performance and interpretability in a range of applications.
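Since gradient-based kernel optimisation normally holds $\nu$ fixed, one way to see the effect the abstract describes is to compare the log marginal likelihood across candidate $\nu$ values. A small sketch using scikit-learn's Matérn kernel (toy data, not the paper's experiments):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Toy dataset: noisy sine observations in 1D
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(40)

scores = {}
for nu in (0.5, 1.5, 2.5, np.inf):  # nu=inf recovers the RBF (squared-exp) limit
    gp = GaussianProcessRegressor(kernel=Matern(length_scale=1.0, nu=nu),
                                  normalize_y=True).fit(X, y)
    # length_scale is optimised internally; nu itself is held fixed per fit
    scores[nu] = gp.log_marginal_likelihood_value_

best_nu = max(scores, key=scores.get)  # candidate with highest evidence
```

Scanning $\nu$ like this is a crude outer loop over the same landscape the paper studies analytically; it makes visible that the conventional defaults ($\nu = 1.5$ or $2.5$) are not always the evidence-maximising choice.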
Title: Explainable Gaussian Processes: a loss landscape perspective
Pub Date : 2024-07-12 | DOI: 10.1088/2632-2153/ad62ab
Raül Fabra-Boluda, Cèsar Ferri, M. J. Ramírez-Quintana, Fernando Martínez-Plumed
The evaluation of machine learning systems has typically been limited to performance measures on clean, curated datasets, which may not accurately reflect their robustness in real-world situations where the data distribution can vary between learning and deployment, and where truthfully predicting some instances may be harder than others. A key aspect in understanding robustness is therefore instance difficulty, which refers to the level of unexpectedness of system failure on a specific instance. We present a framework that evaluates the robustness of different machine learning models using Item Response Theory-based estimates of instance difficulty for supervised tasks. The framework evaluates performance deviations by applying perturbation methods that simulate noise and variability in deployment conditions. Our findings result in a comprehensive taxonomy of machine learning techniques, based on both model robustness and instance difficulty, providing a deeper understanding of the strengths and limitations of specific families of machine learning models. This study is a significant step towards exposing the vulnerabilities of particular families of machine learning models.
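The IRT notion of instance difficulty can be illustrated with a toy Rasch-style simulation (the abilities, difficulties, and estimator below are invented for illustration, not the authors' framework): difficulty is recovered from the failure rate of a population of models:

```python
import numpy as np

# Toy Rasch model: P(model m answers instance i correctly)
#   = sigmoid(ability_m - difficulty_i)
rng = np.random.default_rng(1)
n_models, n_instances = 12, 200
true_difficulty = rng.normal(0, 1, n_instances)
ability = rng.normal(0.5, 1, n_models)

p_correct = 1 / (1 + np.exp(-(ability[:, None] - true_difficulty[None, :])))
responses = rng.random((n_models, n_instances)) < p_correct  # simulated outcomes

# Crude difficulty estimate: logit of the observed failure rate per instance
fail_rate = (1 - responses.mean(axis=0)).clip(1e-3, 1 - 1e-3)
est_difficulty = np.log(fail_rate / (1 - fail_rate))

# The estimate tracks the latent difficulty used to generate the data
corr = np.corrcoef(true_difficulty, est_difficulty)[0, 1]
```

An instance that a strong model fails despite low estimated difficulty is exactly the "unexpected failure" the abstract uses to characterise (lack of) robustness.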
Title: Unveiling the Robustness of Machine Learning Families
Pub Date : 2024-07-12 | DOI: 10.1088/2632-2153/ad62ac
Juan-Esteban Suarez Cardona, Phil-Alexander Hofmann, Michael Hecht
We present a variational approach aimed at enhancing the training of Physics-Informed Neural Networks (PINNs) and more general surrogate models for learning partial differential equations (PDEs). In particular, we extend our formerly introduced notion of Sobolev cubatures to negative orders, enabling the approximation of negative order Sobolev norms. We mathematically prove the effect of negative order Sobolev cubatures in improving the condition number of discrete PDE learning problems, providing balancing scalars that mitigate numerical stiffness issues caused by loss imbalances. Additionally, we consider polynomial surrogate models (PSMs), which maintain the flexibility of PINN formulations while preserving the convexity structure of the PDE operators. The combination of negative order Sobolev cubatures and PSMs delivers well-conditioned discrete optimization problems, solvable via an exponentially fast convergent gradient descent for λ-convex losses. Our theoretical contributions are supported by numerical experiments, addressing linear and non-linear, forward and inverse PDE problems. These experiments show that the Sobolev cubature-based PSMs emerge as the superior state-of-the-art PINN technique.
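For context, the negative-order norms extended here are the standard duality norms (this is the textbook definition, not the paper's cubature construction):

```latex
% For s > 0, the H^{-s} norm is defined by duality with H^{s}_{0}:
\| u \|_{H^{-s}(\Omega)}
  = \sup_{0 \neq v \in H^{s}_{0}(\Omega)}
    \frac{\langle u, v \rangle_{L^{2}(\Omega)}}{\| v \|_{H^{s}(\Omega)}}
```

Approximating this supremum numerically is what the negative-order Sobolev cubatures enable, and weighting PDE residuals in such weaker norms is what softens the stiffness of the discrete learning problem.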
Title: Negative order Sobolev cubatures: preconditioners of partial differential equation learning tasks circumventing numerical stiffness
Purpose: Proton dose deposition results are influenced by various factors, such as irradiation angle, beamlet energy and other parameters. Calculating the proton dose deposition matrix (DDM) can be highly complex but is crucial in intensity-modulated proton therapy (IMPT). In this work, we present a novel deep learning (DL) approach using multi-source features for proton DDM prediction. Methods: The DL5 proton DDM prediction method uses five input features covering beamlet geometry, dosimetry and treatment machine information: patient CT data, beamlet energy, distance from voxel to beamlet axis, distance from voxel to body surface, and pencil beam (PB) dose. The dose calculated by the Monte Carlo (MC) method served as the ground-truth dose label. A total of 40,000 features, corresponding to 8000 beamlets, were obtained from head patient datasets and used as training data. Seventeen head patients not included in training were used as testing cases. Results: The DL5 method demonstrates high proton beamlet dose prediction accuracy, with an average coefficient of determination R^2 of 0.93 when compared to the MC dose. Accurate beamlet dose estimation can be achieved in as little as 1.5 milliseconds for an individual proton beamlet. For IMPT plan dose comparisons against the MC method, the DL5 method exhibited gamma pass rates of γ(2mm, 2%) and γ(3mm, 3%) ranging from 98.15% to 99.89% and 98.80% to 99.98%, respectively, across all 17 testing cases. On average, compared with the PB method, the DL5 method increased the gamma pass rate from 82.97% to 99.23% for γ(2mm, 2%) and from 85.27% to 99.75% for γ(3mm, 3%). Conclusions: The proposed DL5 model enables rapid and precise dose calculation in IMPT planning, with the potential to significantly enhance the efficiency and quality of proton radiation therapy.
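The gamma pass rates quoted above come from the standard gamma-index comparison; a simplified 1D version (global normalisation, toy Gaussian profiles — the paper's evaluation is on full 3D dose distributions) looks like:

```python
import numpy as np

def gamma_pass_rate(dose_ref, dose_eval, x, dta=2.0, dose_tol=0.02):
    """Simplified 1D global gamma analysis.

    For each evaluated point, gamma is the minimum combined distance/dose
    disagreement over all reference points; gamma <= 1 counts as a pass.
    dta is the distance-to-agreement in mm, dose_tol the fractional tolerance.
    """
    d_max = dose_ref.max()                        # global normalisation
    dx = (x[None, :] - x[:, None]) / dta          # spatial term, [ref, eval]
    dd = (dose_eval[None, :] - dose_ref[:, None]) / (dose_tol * d_max)
    gamma = np.sqrt(dx**2 + dd**2).min(axis=0)    # min over reference points
    return (gamma <= 1.0).mean()

x = np.linspace(0, 100, 201)                      # positions in mm
ref = np.exp(-((x - 50) / 15) ** 2)               # toy reference dose profile
rate = gamma_pass_rate(ref, ref * 1.01, x)        # 1% scaling error: passes
```

A 1% dose error passes the γ(2mm, 2%) criterion everywhere, while a 5% error fails near the peak, which is why the pass rate is a sensitive summary of plan agreement.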
Title: Proton dose deposition matrix prediction using multi-source feature driven deep learning approach
Authors: Peng Zhou, Shengxiu Jiao, Xiaoqian Zhao, Shuzhan Yao, Honghao Xu, Chuan Chen
Pub Date : 2024-07-11 | DOI: 10.1088/2632-2153/ad6231
Pub Date : 2024-07-11 | DOI: 10.1088/2632-2153/ad6230
Meysam Hashemi, Abolfazl Ziaeemehr, M. Woodman, Jan Fousek, S. Petkoski, V. Jirsa
Connectome-based models, also known as Virtual Brain Models (VBMs), are well established in network neuroscience for investigating the pathophysiological causes underlying a large range of brain diseases. Integrating an individual's brain imaging data into VBMs has improved patient-specific predictivity, although Bayesian estimation of spatially distributed parameters remains challenging even with state-of-the-art Monte Carlo sampling. VBMs imply latent nonlinear state-space models driven by noise and network input, necessitating advanced probabilistic machine learning techniques for widely applicable Bayesian estimation. Here we present Simulation-based Inference on Virtual Brain Models (SBI-VBMs), and demonstrate that training deep neural networks on both spatio-temporal and functional features allows accurate estimation of generative parameters in brain disorders. The systematic use of brain stimulation provides an effective remedy for the non-identifiability issue when estimating degradation limited to a smaller subset of connections. By prioritizing model structure over data, we show that the hierarchical structure in SBI-VBMs renders the inference more effective, precise and biologically plausible. This approach could broadly advance precision medicine by enabling fast and reliable prediction of patient-specific brain disorders.
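Simulation-based inference replaces likelihood evaluation with forward simulations. The paper trains deep neural density estimators; the core loop can nonetheless be sketched with plain rejection sampling on a toy one-parameter model (simulator, prior, and summary statistic below are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate(theta, n=100):
    """Hypothetical simulator: n noisy observations with mean theta."""
    return rng.normal(theta, 1.0, n)

def summary(x):
    """Summary statistic compressing a simulation to one number."""
    return x.mean()

theta_true = 1.5
s_obs = summary(simulate(theta_true))        # "observed" data summary

# Draw parameters from the prior; keep those whose simulated summary
# lands close to the observed one (rejection ABC)
prior_draws = rng.uniform(-5, 5, 20000)
accepted = [t for t in prior_draws
            if abs(summary(simulate(t)) - s_obs) < 0.1]
posterior_mean = np.mean(accepted)
```

Neural SBI methods amortise this idea: instead of discarding simulations, a network learns the mapping from summaries to posterior, which is what makes inference over many spatially distributed VBM parameters tractable.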
Title: Simulation-based Inference on Virtual Brain Models of Disorders
Pub Date : 2024-07-08 | DOI: 10.1088/2632-2153/ad605f
M. R. Mahani, I. Nechepurenko, Y. Rahimof, A. Wicht
Acquiring a substantial number of data points for training accurate machine learning (ML) models is a major challenge in scientific fields where data collection is resource-intensive. Here, we propose a novel approach for constructing a minimal yet highly informative database for training ML models in complex multi-dimensional parameter spaces. To achieve this, we mimic the underlying relation between the output and input parameters using Gaussian process regression (GPR). From a set of known data, GPR provides a predictive mean and standard deviation for the unknown data. Given the standard deviation predicted by GPR, we select data points using Bayesian optimization to obtain an efficient database for training ML models. We compare the performance of ML models trained on databases obtained through this method with that of models trained on databases obtained using traditional approaches. Our results demonstrate that the ML models trained on the database obtained via the Bayesian optimization approach consistently outperform those trained on the other two databases, achieving high accuracy with a significantly smaller number of data points. Our work contributes to resource-efficient data collection in high-dimensional complex parameter spaces, enabling high-precision machine learning predictions.
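The acquisition loop described above can be sketched in a few lines (a toy 1D target and an RBF kernel are assumed here; the paper works in higher-dimensional parameter spaces): fit a GP, query the candidate with the largest predictive standard deviation, and repeat:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def target(x):
    return np.sin(3 * x) * x          # stand-in for an expensive simulation

rng = np.random.default_rng(3)
candidates = np.linspace(0, 3, 300)[:, None]   # pool of unmeasured points
X = rng.uniform(0, 3, (3, 1))                  # small initial design
y = target(X).ravel()

for _ in range(10):
    gp = GaussianProcessRegressor(kernel=RBF(0.5), alpha=1e-6,
                                  normalize_y=True).fit(X, y)
    _, std = gp.predict(candidates, return_std=True)
    x_new = candidates[np.argmax(std)]          # most uncertain candidate
    X = np.vstack([X, x_new[None, :]])          # "measure" it and add to data
    y = np.append(y, target(x_new)[0])

gp = GaussianProcessRegressor(kernel=RBF(0.5), alpha=1e-6,
                              normalize_y=True).fit(X, y)
mean, std = gp.predict(candidates, return_std=True)
```

Each query lands where the surrogate is least certain, so the design spreads over the space far more efficiently than uniform or random sampling with the same budget.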
Title: Optimizing data acquisition: a Bayesian approach for efficient machine learning model training
Pub Date : 2024-07-08 | DOI: 10.1088/2632-2153/ad605e
Kevin Singh Gill, David R Smith, Semin Joung, B. Geiger, G. McKee, Jefferey Zimmerman, Ryan N Coffee, A. Jalalvand, E. Kolemen
Real-time detection of the plasma confinement regime can enable new advanced plasma control capabilities, both for access to and for sustainment of enhanced confinement regimes in fusion devices. For example, a real-time indication of the confinement regime can facilitate the transition to the high-performing wide-pedestal quiescent H-mode, or avoid unwanted transitions to lower confinement regimes that may induce plasma termination. To demonstrate real-time confinement regime detection, we use the 2D beam emission spectroscopy (BES) diagnostic system to capture localized density fluctuations of long-wavelength turbulent modes in the edge region at a 1 MHz sampling rate. BES data from 330 discharges in either L-mode, H-mode, quiescent H (QH)-mode, or wide-pedestal QH-mode were collected from the DIII-D tokamak and curated into a high-quality database for training a deep-learning classification model for real-time confinement detection. We utilize the 6×8 spatial configuration with a time window of 1024 $\mu$s and recast the input to obtain spectral-like features via FFT preprocessing. We employ a shallow 3D convolutional neural network for the multivariate time-series classification task and use a softmax in the final dense layer to retrieve a probability distribution over the confinement regimes. Our model classifies the global confinement state on 44 unseen test discharges with an average $F_1$ score of 0.94, using only $\sim$1 millisecond snippets of BES data at a time. This activity demonstrates the feasibility of real-time data analysis of fluctuation diagnostics in future devices such as ITER, where the need for reliable and advanced plasma control is urgent.
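The FFT preprocessing step can be sketched as follows (array shapes taken from the abstract; the exact spectral features and the downstream 3D CNN are not specified there, so the log-power choice is an assumption):

```python
import numpy as np

# One classification window: a 6x8 grid of BES channels, each with a
# 1024-sample time trace (1024 us at the 1 MHz sampling rate).
rng = np.random.default_rng(4)
window = rng.standard_normal((6, 8, 1024))        # stand-in for real BES data

# One-sided FFT along the time axis turns each trace into spectral bins
spectra = np.fft.rfft(window, axis=-1)            # shape (6, 8, 513)
log_power = np.log10(np.abs(spectra) ** 2 + 1e-12)

# Dropping the DC bin leaves a 6 x 8 x 512 tensor of spectral-like
# features, a natural input for a small 3D convolutional network
features = log_power[..., 1:]
```

Working in the spectral domain exposes the turbulence-mode content directly, so a shallow network suffices where a raw-time-series model would need to learn the transform itself.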
Title: Real-time confinement regime detection in fusion plasmas with convolutional neural networks and high-bandwidth edge fluctuation measurements
Pub Date : 2024-07-05 | DOI: 10.1088/2632-2153/ad5fdd
Vasilis Belis, Patrick Odagiu, Michele Grossi, Florentin Reiter, Günther Dissertori, Sofia Vallecorsa
Quantum machine learning provides a fundamentally different approach to analyzing data. However, many interesting datasets are too complex for currently available quantum computers. Present quantum machine learning applications usually diminish this complexity by reducing the dimensionality of the data, e.g., via auto-encoders, before passing it through the quantum models. Here, we design a classical-quantum paradigm that unifies the dimensionality reduction task with a quantum classification model into a single architecture: the guided quantum compression model. We exemplify how this architecture outperforms conventional quantum machine learning approaches on a challenging binary classification problem: identifying the Higgs boson in proton-proton collisions at the LHC. Furthermore, the guided quantum compression model shows better performance compared to the deep learning benchmark when using solely the kinematic variables in our dataset.
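The "guided" unification amounts to optimising a joint objective, with the classifier loss shaping the compressed latent space. A minimal numpy sketch of such a combined loss (the weighting and exact form are assumptions for illustration, not the paper's architecture):

```python
import numpy as np

def guided_loss(x, x_recon, y, p_pred, alpha=0.5):
    """Joint objective: reconstruction MSE plus weighted binary cross-entropy.

    x, x_recon : original and reconstructed inputs (auto-encoder part)
    y, p_pred  : binary labels and predicted class probabilities (classifier)
    alpha      : assumed trade-off weight between the two terms
    """
    recon = np.mean((x - x_recon) ** 2)
    eps = 1e-12
    xent = -np.mean(y * np.log(p_pred + eps) +
                    (1 - y) * np.log(1 - p_pred + eps))
    return recon + alpha * xent

# Tiny worked example with two samples
x = np.array([[1.0, 2.0], [3.0, 4.0]])
loss = guided_loss(x, x * 0.9, np.array([1.0, 0.0]), np.array([0.8, 0.3]))
```

Training the compressor on this sum, rather than on reconstruction alone, is what distinguishes the guided model from the usual auto-encode-then-classify pipeline.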
Title: Guided quantum compression for high dimensional data classification
Pub Date : 2024-07-05DOI: 10.1088/2632-2153/ad5fda
Priyanka Gupta, Manik Gupta, Vijay Kumar
Supervised machine learning requires the estimation of multiple parameters using large amounts of labelled data. Obtaining labelled data generally requires a substantial allocation of resources in terms of both cost and time. In such scenarios, weakly supervised learning techniques like data programming (DP) and active learning (AL) can be advantageous for time-series classification tasks. These paradigms can be used to assign data labels in an automated manner, and time-series classification can subsequently be carried out on the labelled data. This work proposes a novel framework titled active learning enhanced data programming (ActDP), which uses DP and AL for ECG classification from single-lead data. ECG classification is pivotal in cardiology and healthcare for diagnosing a broad spectrum of heart conditions and arrhythmias. To establish the usefulness of the proposed ActDP framework, experiments were conducted on the MIT-BIH dataset with 94,224 ECG beats. In this work, DP assigns a probabilistic label to each ECG beat using nine novel polar labelling functions and a generative model. AL then improves on the DP result by replacing the generative model's labels for sampled ECG beats with ground truth, and a discriminative model is trained on these labels in each iteration. The experimental results show that incorporating AL into DP in the ActDP framework increases ECG classification accuracy from 85.7% to 97.34% over 58 iterations. The proposed framework (ActDP) thus achieves a higher classification accuracy (97.34%) than DP with data augmentation (DA, 92.2%), DP without DA (85.7%), majority vote (50.2%), and the generative model alone (66.5%).
{"title":"An active learning enhanced data programming (ActDP) framework for ECG time series","authors":"Priyanka Gupta, Manik Gupta, Vijay Kumar","doi":"10.1088/2632-2153/ad5fda","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5fda","url":null,"abstract":"\u0000 Supervised machine learning requires the estimation of multiple parameters using large amounts of labelled data. Obtaining labelled data generally requires a substantial allocation of resources in terms of both cost and time. In such scenarios, weakly supervised learning techniques like data programming (DP) and active learning (AL) can be advantageous for time-series classification tasks. These paradigms can be used to assign data labels in an automated manner, and time-series classification can subsequently be carried out on the labelled data. This work proposes a novel framework titled active learning enhanced data programming (ActDP), which uses DP and AL for ECG classification from single-lead data. ECG classification is pivotal in cardiology and healthcare for diagnosing a broad spectrum of heart conditions and arrhythmias. To establish the usefulness of the proposed ActDP framework, experiments were conducted on the MIT-BIH dataset with 94,224 ECG beats. In this work, DP assigns a probabilistic label to each ECG beat using nine novel polar labelling functions and a generative model. AL then improves on the DP result by replacing the generative model's labels for sampled ECG beats with ground truth, and a discriminative model is trained on these labels in each iteration. The experimental results show that incorporating AL into DP in the ActDP framework increases ECG classification accuracy from 85.7% to 97.34% over 58 iterations. The proposed framework (ActDP) thus achieves a higher classification accuracy (97.34%) than DP with data augmentation (DA, 92.2%), DP without DA (85.7%), majority vote (50.2%), and the generative model alone (66.5%).","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":"132 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141674077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
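The DP-then-AL loop described in the ActDP abstract above can be sketched in a few lines. This is a toy illustration under loud assumptions: the "ECG beats" are random scalars, the three labelling functions are arbitrary threshold heuristics (not the paper's nine polar labelling functions), the generative model is approximated by averaging non-abstaining votes, and the active-learning oracle simply reveals ground truth for the most uncertain beats.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "ECG beats": one feature per beat; the true label is its sign (stand-in only).
n = 300
x = rng.normal(size=n)
y_true = (x > 0).astype(int)

# Three hypothetical labelling functions (LFs): noisy heuristics that may abstain (-1).
def lf1(v): return int(v > 0.2) if abs(v) > 0.2 else -1
def lf2(v): return int(v > -0.1)
def lf3(v): return int(v > 0.5) if abs(v) > 0.1 else -1

L = np.array([[lf(v) for lf in (lf1, lf2, lf3)] for v in x])

# Data programming step (simplified): the probabilistic label is the mean of the
# non-abstaining LF votes; a true generative model would instead weight LFs by
# their estimated accuracies.
def dp_labels(L):
    votes = np.ma.masked_equal(L, -1)
    return votes.mean(axis=1).filled(0.5)  # 0.5 when every LF abstains

probs = dp_labels(L)
labels = (probs > 0.5).astype(int)

# Active learning step: query ground truth for the most uncertain beats
# (probabilities closest to 0.5) and overwrite the DP labels with it.
def al_round(labels, probs, budget=30):
    uncertain = np.argsort(np.abs(probs - 0.5))[:budget]
    labels = labels.copy()
    labels[uncertain] = y_true[uncertain]  # oracle annotation
    return labels

acc_dp = (labels == y_true).mean()
labels_al = al_round(labels, probs)
acc_actdp = (labels_al == y_true).mean()
```

In the paper, this query-and-overwrite step repeats for 58 iterations, with a discriminative model retrained on the refreshed labels each time; the sketch shows a single round.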