Pub Date : 2024-07-15 | DOI: 10.1088/2632-2153/ad638f
Edoardo Fazzari, H. Loughlin, Chris Stoughton
This study applies a Reinforcement Learning (RL) methodology to a control system. Using the Pound-Drever-Hall locking scheme, we match the wavelength of a controlled laser to the length of a Fabry-Pérot cavity such that the cavity length is an exact integer multiple of the laser wavelength. Typically, long-term drift of the cavity length and laser wavelength exceeds the dynamic range of this control if only the laser's piezoelectric transducer is actuated, so the same error signal also controls the temperature of the laser crystal. In this work, we instead implement this feedback control grounded in Q-learning. Our system learns in real time, without relying on historical data, and adapts to system variations after training because the learning agent is continuously updated. This approach maintains lock for eight days on average.
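As a toy illustration of the control idea only (the state discretization, drift model, and reward below are invented here, not the authors' experimental setup), a tabular Q-learning agent can learn to counteract a slow drift by stepping an actuator:

```python
import random

# Toy sketch: a tabular Q-learning agent nudges a (hypothetical) temperature
# offset to keep a slowly drifting error signal near zero.
# States discretize the error; actions are {-1, 0, +1} actuator steps.
N_STATES, ACTIONS = 11, (-1, 0, 1)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

def discretize(error, lo=-5.0, hi=5.0):
    """Map a continuous error onto one of N_STATES bins (clamped at the ends)."""
    error = max(lo, min(hi, error))
    return int((error - lo) / (hi - lo) * (N_STATES - 1))

Q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]

def step(error, action):
    """Hypothetical plant: the action shifts the error; a constant drift adds bias."""
    return error + action + 0.2

random.seed(0)
error = 3.0
for _ in range(2000):
    s = discretize(error)
    # epsilon-greedy action selection
    if random.random() < EPS:
        a = random.randrange(len(ACTIONS))
    else:
        a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
    error = step(error, ACTIONS[a])
    r = -abs(error)                      # reward: stay near zero error
    s2 = discretize(error)
    # standard Q-learning update
    Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
```

Because the update runs on every control step, the agent keeps adapting after deployment, which mirrors the continuous-update property the abstract highlights.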
Title: Controlling optical-cavity locking using reinforcement learning
Pub Date : 2024-07-12 | DOI: 10.1088/2632-2153/ad62ad
Maximilian P. Niroomand, L. Dicks, Edward Pyzer-Knapp, David J. Wales
Prior beliefs about the latent function, which shape inductive biases, can be incorporated into a Gaussian Process (GP) via the kernel. However, beyond kernel choices, the decision-making process of GP models remains poorly understood. In this work, we contribute an analysis of the loss landscape for GP models using methods from chemical physics. We demonstrate $\nu$-continuity for Matérn kernels and outline aspects of catastrophe theory at critical points in the loss landscape. By directly including $\nu$ in the hyperparameter optimisation for Matérn kernels, we find that typical values of $\nu$ can be far from optimal in terms of performance. We also provide an a priori method for evaluating the effect of GP ensembles and discuss various voting approaches based on physical properties of the loss landscape. The utility of these approaches is demonstrated for various synthetic and real datasets. Our findings provide insight into hyperparameter optimisation for GPs and offer practical guidance for improving their performance and interpretability in a range of applications.
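Since gradient-based kernel optimisation normally holds $\nu$ fixed, one way to see the effect the abstract describes is to compare the log marginal likelihood across candidate $\nu$ values. A small sketch using scikit-learn's Matérn kernel (toy data, not the paper's experiments):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Toy dataset: noisy sine observations in 1D
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(40)

scores = {}
for nu in (0.5, 1.5, 2.5, np.inf):  # nu=inf recovers the RBF (squared-exp) limit
    gp = GaussianProcessRegressor(kernel=Matern(length_scale=1.0, nu=nu),
                                  normalize_y=True).fit(X, y)
    # length_scale is optimised internally; nu itself is held fixed per fit
    scores[nu] = gp.log_marginal_likelihood_value_

best_nu = max(scores, key=scores.get)  # candidate with highest evidence
```

Scanning $\nu$ like this is a crude outer loop over the same landscape the paper studies analytically; it makes visible that the conventional defaults ($\nu = 1.5$ or $2.5$) are not always the evidence-maximising choice.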
Title: Explainable Gaussian Processes: a loss landscape perspective
Pub Date : 2024-07-12 | DOI: 10.1088/2632-2153/ad62ab
Raül Fabra-Boluda, Cèsar Ferri, M. J. Ramírez-Quintana, Fernando Martínez-Plumed
The evaluation of machine learning systems has typically been limited to performance measures on clean, curated datasets, which may not accurately reflect their robustness in real-world situations where the data distribution can vary between learning and deployment, and where truthfully predicting some instances may be harder than others. A key aspect in understanding robustness is therefore instance difficulty, which refers to the level of unexpectedness of system failure on a specific instance. We present a framework that evaluates the robustness of different machine learning models using Item Response Theory-based estimates of instance difficulty for supervised tasks. The framework evaluates performance deviations by applying perturbation methods that simulate noise and variability in deployment conditions. Our findings result in a comprehensive taxonomy of machine learning techniques, based on both model robustness and instance difficulty, providing a deeper understanding of the strengths and limitations of specific families of machine learning models. This study is a significant step towards exposing the vulnerabilities of particular families of machine learning models.
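The IRT notion of instance difficulty can be illustrated with a toy Rasch-style simulation (the abilities, difficulties, and estimator below are invented for illustration, not the authors' framework): difficulty is recovered from the failure rate of a population of models:

```python
import numpy as np

# Toy Rasch model: P(model m answers instance i correctly)
#   = sigmoid(ability_m - difficulty_i)
rng = np.random.default_rng(1)
n_models, n_instances = 12, 200
true_difficulty = rng.normal(0, 1, n_instances)
ability = rng.normal(0.5, 1, n_models)

p_correct = 1 / (1 + np.exp(-(ability[:, None] - true_difficulty[None, :])))
responses = rng.random((n_models, n_instances)) < p_correct  # simulated outcomes

# Crude difficulty estimate: logit of the observed failure rate per instance
fail_rate = (1 - responses.mean(axis=0)).clip(1e-3, 1 - 1e-3)
est_difficulty = np.log(fail_rate / (1 - fail_rate))

# The estimate tracks the latent difficulty used to generate the data
corr = np.corrcoef(true_difficulty, est_difficulty)[0, 1]
```

An instance that a strong model fails despite low estimated difficulty is exactly the "unexpected failure" the abstract uses to characterise (lack of) robustness.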
Title: Unveiling the Robustness of Machine Learning Families
Pub Date : 2024-07-12 | DOI: 10.1088/2632-2153/ad62ac
Juan-Esteban Suarez Cardona, Phil-Alexander Hofmann, Michael Hecht
We present a variational approach aimed at enhancing the training of Physics-Informed Neural Networks (PINNs) and more general surrogate models for learning partial differential equations (PDEs). In particular, we extend our formerly introduced notion of Sobolev cubatures to negative orders, enabling the approximation of negative order Sobolev norms. We mathematically prove the effect of negative order Sobolev cubatures in improving the condition number of discrete PDE learning problems, providing balancing scalars that mitigate numerical stiffness issues caused by loss imbalances. Additionally, we consider polynomial surrogate models (PSMs), which maintain the flexibility of PINN formulations while preserving the convexity structure of the PDE operators. The combination of negative order Sobolev cubatures and PSMs delivers well-conditioned discrete optimization problems, solvable via an exponentially fast convergent gradient descent for λ-convex losses. Our theoretical contributions are supported by numerical experiments, addressing linear and non-linear, forward and inverse PDE problems. These experiments show that the Sobolev cubature-based PSMs emerge as the superior state-of-the-art PINN technique.
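For context, the negative-order norms extended here are the standard duality norms (this is the textbook definition, not the paper's cubature construction):

```latex
% For s > 0, the H^{-s} norm is defined by duality with H^{s}_{0}:
\| u \|_{H^{-s}(\Omega)}
  = \sup_{0 \neq v \in H^{s}_{0}(\Omega)}
    \frac{\langle u, v \rangle_{L^{2}(\Omega)}}{\| v \|_{H^{s}(\Omega)}}
```

Approximating this supremum numerically is what the negative-order Sobolev cubatures enable, and weighting PDE residuals in such weaker norms is what softens the stiffness of the discrete learning problem.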
Title: Negative order Sobolev cubatures: preconditioners of partial differential equation learning tasks circumventing numerical stiffness
Purpose: Proton dose deposition results are influenced by various factors, such as irradiation angle, beamlet energy and other parameters. Calculating the proton dose deposition matrix (DDM) can be highly complex but is crucial in intensity-modulated proton therapy (IMPT). In this work, we present a novel deep learning (DL) approach using multi-source features for proton DDM prediction. Methods: The DL5 proton DDM prediction method uses five input features covering beamlet geometry, dosimetry and treatment machine information: patient CT data, beamlet energy, distance from voxel to beamlet axis, distance from voxel to body surface, and pencil beam (PB) dose. The dose calculated by the Monte Carlo (MC) method served as the ground-truth dose label. A total of 40,000 features, corresponding to 8000 beamlets, were obtained from head patient datasets and used as training data. Seventeen head patients not included in training were used as testing cases. Results: The DL5 method demonstrates high proton beamlet dose prediction accuracy, with an average coefficient of determination R^2 of 0.93 when compared to the MC dose. Accurate beamlet dose estimation can be achieved in as little as 1.5 milliseconds for an individual proton beamlet. For IMPT plan dose comparisons against the MC method, the DL5 method exhibited gamma pass rates of γ(2mm, 2%) and γ(3mm, 3%) ranging from 98.15% to 99.89% and 98.80% to 99.98%, respectively, across all 17 testing cases. On average, compared with the PB method, the DL5 method increased the gamma pass rate from 82.97% to 99.23% for γ(2mm, 2%) and from 85.27% to 99.75% for γ(3mm, 3%). Conclusions: The proposed DL5 model enables rapid and precise dose calculation in IMPT planning, with the potential to significantly enhance the efficiency and quality of proton radiation therapy.
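The gamma pass rates quoted above come from the standard gamma-index comparison; a simplified 1D version (global normalisation, toy Gaussian profiles — the paper's evaluation is on full 3D dose distributions) looks like:

```python
import numpy as np

def gamma_pass_rate(dose_ref, dose_eval, x, dta=2.0, dose_tol=0.02):
    """Simplified 1D global gamma analysis.

    For each evaluated point, gamma is the minimum combined distance/dose
    disagreement over all reference points; gamma <= 1 counts as a pass.
    dta is the distance-to-agreement in mm, dose_tol the fractional tolerance.
    """
    d_max = dose_ref.max()                        # global normalisation
    dx = (x[None, :] - x[:, None]) / dta          # spatial term, [ref, eval]
    dd = (dose_eval[None, :] - dose_ref[:, None]) / (dose_tol * d_max)
    gamma = np.sqrt(dx**2 + dd**2).min(axis=0)    # min over reference points
    return (gamma <= 1.0).mean()

x = np.linspace(0, 100, 201)                      # positions in mm
ref = np.exp(-((x - 50) / 15) ** 2)               # toy reference dose profile
rate = gamma_pass_rate(ref, ref * 1.01, x)        # 1% scaling error: passes
```

A 1% dose error passes the γ(2mm, 2%) criterion everywhere, while a 5% error fails near the peak, which is why the pass rate is a sensitive summary of plan agreement.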
Title: Proton dose deposition matrix prediction using multi-source feature driven deep learning approach
Authors: Peng Zhou, Shengxiu Jiao, Xiaoqian Zhao, Shuzhan Yao, Honghao Xu, Chuan Chen
Pub Date : 2024-07-11 | DOI: 10.1088/2632-2153/ad6231
Pub Date : 2024-07-11 | DOI: 10.1088/2632-2153/ad6230
Meysam Hashemi, Abolfazl Ziaeemehr, M. Woodman, Jan Fousek, S. Petkoski, V. Jirsa
Connectome-based models, also known as Virtual Brain Models (VBMs), are well established in network neuroscience for investigating the pathophysiological causes underlying a large range of brain diseases. Integrating an individual's brain imaging data into VBMs has improved patient-specific predictivity, although Bayesian estimation of spatially distributed parameters remains challenging even with state-of-the-art Monte Carlo sampling. VBMs imply latent nonlinear state-space models driven by noise and network input, necessitating advanced probabilistic machine learning techniques for widely applicable Bayesian estimation. Here we present Simulation-based Inference on Virtual Brain Models (SBI-VBMs), and demonstrate that training deep neural networks on both spatio-temporal and functional features allows accurate estimation of generative parameters in brain disorders. The systematic use of brain stimulation provides an effective remedy for the non-identifiability issue when estimating degradation limited to a smaller subset of connections. By prioritizing model structure over data, we show that the hierarchical structure in SBI-VBMs renders the inference more effective, precise and biologically plausible. This approach could broadly advance precision medicine by enabling fast and reliable prediction of patient-specific brain disorders.
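Simulation-based inference replaces likelihood evaluation with forward simulations. The paper trains deep neural density estimators; the core loop can nonetheless be sketched with plain rejection sampling on a toy one-parameter model (simulator, prior, and summary statistic below are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate(theta, n=100):
    """Hypothetical simulator: n noisy observations with mean theta."""
    return rng.normal(theta, 1.0, n)

def summary(x):
    """Summary statistic compressing a simulation to one number."""
    return x.mean()

theta_true = 1.5
s_obs = summary(simulate(theta_true))        # "observed" data summary

# Draw parameters from the prior; keep those whose simulated summary
# lands close to the observed one (rejection ABC)
prior_draws = rng.uniform(-5, 5, 20000)
accepted = [t for t in prior_draws
            if abs(summary(simulate(t)) - s_obs) < 0.1]
posterior_mean = np.mean(accepted)
```

Neural SBI methods amortise this idea: instead of discarding simulations, a network learns the mapping from summaries to posterior, which is what makes inference over many spatially distributed VBM parameters tractable.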
Title: Simulation-based Inference on Virtual Brain Models of Disorders
Pub Date : 2024-07-08 | DOI: 10.1088/2632-2153/ad605f
M. R. Mahani, I. Nechepurenko, Y. Rahimof, A. Wicht
Acquiring a substantial number of data points for training accurate machine learning (ML) models is a major challenge in scientific fields where data collection is resource-intensive. Here, we propose a novel approach for constructing a minimal yet highly informative database for training ML models in complex multi-dimensional parameter spaces. To achieve this, we mimic the underlying relation between the output and input parameters using Gaussian process regression (GPR). From a set of known data, GPR provides a predictive mean and standard deviation for the unknown data. Given the standard deviation predicted by GPR, we select data points using Bayesian optimization to obtain an efficient database for training ML models. We compare the performance of ML models trained on databases obtained through this method with that of models trained on databases obtained using traditional approaches. Our results demonstrate that the ML models trained on the database obtained via the Bayesian optimization approach consistently outperform those trained on the other two databases, achieving high accuracy with a significantly smaller number of data points. Our work contributes to resource-efficient data collection in high-dimensional complex parameter spaces, enabling high-precision machine learning predictions.
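The acquisition loop described above can be sketched in a few lines (a toy 1D target and an RBF kernel are assumed here; the paper works in higher-dimensional parameter spaces): fit a GP, query the candidate with the largest predictive standard deviation, and repeat:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def target(x):
    return np.sin(3 * x) * x          # stand-in for an expensive simulation

rng = np.random.default_rng(3)
candidates = np.linspace(0, 3, 300)[:, None]   # pool of unmeasured points
X = rng.uniform(0, 3, (3, 1))                  # small initial design
y = target(X).ravel()

for _ in range(10):
    gp = GaussianProcessRegressor(kernel=RBF(0.5), alpha=1e-6,
                                  normalize_y=True).fit(X, y)
    _, std = gp.predict(candidates, return_std=True)
    x_new = candidates[np.argmax(std)]          # most uncertain candidate
    X = np.vstack([X, x_new[None, :]])          # "measure" it and add to data
    y = np.append(y, target(x_new)[0])

gp = GaussianProcessRegressor(kernel=RBF(0.5), alpha=1e-6,
                              normalize_y=True).fit(X, y)
mean, std = gp.predict(candidates, return_std=True)
```

Each query lands where the surrogate is least certain, so the design spreads over the space far more efficiently than uniform or random sampling with the same budget.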
Title: Optimizing data acquisition: a Bayesian approach for efficient machine learning model training
Pub Date : 2024-07-08 | DOI: 10.1088/2632-2153/ad605e
Kevin Singh Gill, David R Smith, Semin Joung, B. Geiger, G. McKee, Jefferey Zimmerman, Ryan N Coffee, A. Jalalvand, E. Kolemen
Real-time detection of the plasma confinement regime can enable new advanced plasma control capabilities, both for access to and for sustainment of enhanced confinement regimes in fusion devices. For example, a real-time indication of the confinement regime can facilitate the transition to the high-performing wide-pedestal quiescent H-mode, or avoid unwanted transitions to lower confinement regimes that may induce plasma termination. To demonstrate real-time confinement regime detection, we use the 2D beam emission spectroscopy (BES) diagnostic system to capture localized density fluctuations of long-wavelength turbulent modes in the edge region at a 1 MHz sampling rate. BES data from 330 discharges in either L-mode, H-mode, quiescent H (QH)-mode, or wide-pedestal QH-mode were collected from the DIII-D tokamak and curated into a high-quality database for training a deep-learning classification model for real-time confinement detection. We utilize the 6×8 spatial configuration with a time window of 1024 $\mu$s and recast the input to obtain spectral-like features via FFT preprocessing. We employ a shallow 3D convolutional neural network for the multivariate time-series classification task and use a softmax in the final dense layer to retrieve a probability distribution over the confinement regimes. Our model classifies the global confinement state on 44 unseen test discharges with an average $F_1$ score of 0.94, using only $\sim$1 millisecond snippets of BES data at a time. This activity demonstrates the feasibility of real-time data analysis of fluctuation diagnostics in future devices such as ITER, where the need for reliable and advanced plasma control is urgent.
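The FFT preprocessing step can be sketched as follows (array shapes taken from the abstract; the exact spectral features and the downstream 3D CNN are not specified there, so the log-power choice is an assumption):

```python
import numpy as np

# One classification window: a 6x8 grid of BES channels, each with a
# 1024-sample time trace (1024 us at the 1 MHz sampling rate).
rng = np.random.default_rng(4)
window = rng.standard_normal((6, 8, 1024))        # stand-in for real BES data

# One-sided FFT along the time axis turns each trace into spectral bins
spectra = np.fft.rfft(window, axis=-1)            # shape (6, 8, 513)
log_power = np.log10(np.abs(spectra) ** 2 + 1e-12)

# Dropping the DC bin leaves a 6 x 8 x 512 tensor of spectral-like
# features, a natural input for a small 3D convolutional network
features = log_power[..., 1:]
```

Working in the spectral domain exposes the turbulence-mode content directly, so a shallow network suffices where a raw-time-series model would need to learn the transform itself.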
Title: Real-time confinement regime detection in fusion plasmas with convolutional neural networks and high-bandwidth edge fluctuation measurements
Pub Date : 2024-07-05 | DOI: 10.1088/2632-2153/ad5fdd
Vasilis Belis, Patrick Odagiu, Michele Grossi, Florentin Reiter, Günther Dissertori, Sofia Vallecorsa
Quantum machine learning provides a fundamentally different approach to analyzing data. However, many interesting datasets are too complex for currently available quantum computers. Present quantum machine learning applications usually diminish this complexity by reducing the dimensionality of the data, e.g., via auto-encoders, before passing it through the quantum models. Here, we design a classical-quantum paradigm that unifies the dimensionality reduction task with a quantum classification model into a single architecture: the guided quantum compression model. We exemplify how this architecture outperforms conventional quantum machine learning approaches on a challenging binary classification problem: identifying the Higgs boson in proton-proton collisions at the LHC. Furthermore, the guided quantum compression model shows better performance compared to the deep learning benchmark when using solely the kinematic variables in our dataset.
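The "guided" unification amounts to optimising a joint objective, with the classifier loss shaping the compressed latent space. A minimal numpy sketch of such a combined loss (the weighting and exact form are assumptions for illustration, not the paper's architecture):

```python
import numpy as np

def guided_loss(x, x_recon, y, p_pred, alpha=0.5):
    """Joint objective: reconstruction MSE plus weighted binary cross-entropy.

    x, x_recon : original and reconstructed inputs (auto-encoder part)
    y, p_pred  : binary labels and predicted class probabilities (classifier)
    alpha      : assumed trade-off weight between the two terms
    """
    recon = np.mean((x - x_recon) ** 2)
    eps = 1e-12
    xent = -np.mean(y * np.log(p_pred + eps) +
                    (1 - y) * np.log(1 - p_pred + eps))
    return recon + alpha * xent

# Tiny worked example with two samples
x = np.array([[1.0, 2.0], [3.0, 4.0]])
loss = guided_loss(x, x * 0.9, np.array([1.0, 0.0]), np.array([0.8, 0.3]))
```

Training the compressor on this sum, rather than on reconstruction alone, is what distinguishes the guided model from the usual auto-encode-then-classify pipeline.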
Title: Guided quantum compression for high dimensional data classification
Pub Date : 2024-07-05DOI: 10.1088/2632-2153/ad5fda
Priyanka Gupta, Manik Gupta, Vijay Kumar
Supervised machine learning requires the estimation of multiple parameters using large amounts of labelled data. Obtaining labelled data generally requires a substantial allocation of resources in terms of both cost and time. In such scenarios, weakly supervised learning techniques like data programming (DP) and active learning (AL) can be advantageous for time-series classification tasks. These paradigms can be used to assign data labels in an automated manner, and time-series classification can subsequently be carried out on the labelled data. This work proposes a novel framework titled active learning enhanced data programming (ActDP), which uses DP and AL for ECG classification from single-lead data. ECG classification is pivotal in cardiology and healthcare for diagnosing a broad spectrum of heart conditions and arrhythmias. To establish the usefulness of the proposed ActDP framework, experiments were conducted on the MIT-BIH dataset with 94,224 ECG beats. In this work, DP assigns a probabilistic label to each ECG beat using nine novel polar labelling functions and a generative model. AL then improves on the DP result by replacing the generative model's labels for sampled ECG beats with ground truth, and a discriminative model is trained on these labels in each iteration. The experimental results show that incorporating AL into DP in the ActDP framework increases ECG classification accuracy from 85.7% to 97.34% over 58 iterations. The proposed framework (ActDP) thus achieves a higher classification accuracy (97.34%) than DP with data augmentation (DA, 92.2%), DP without DA (85.7%), majority vote (50.2%), and the generative model alone (66.5%).
{"title":"An active learning enhanced data programming (ActDP) framework for ECG time series","authors":"Priyanka Gupta, Manik Gupta, Vijay Kumar","doi":"10.1088/2632-2153/ad5fda","DOIUrl":"https://doi.org/10.1088/2632-2153/ad5fda","url":null,"abstract":"\u0000 Supervised machine learning requires the estimation of multiple parameters using large amounts of labelled data. Obtaining labelled data generally requires a substantial allocation of resources in terms of both cost and time. In such scenarios, weakly supervised learning techniques like data programming (DP) and active learning (AL) can be advantageous for time-series classification tasks. These paradigms can be used to assign data labels in an automated manner, and time-series classification can subsequently be carried out on the labelled data. This work proposes a novel framework titled active learning enhanced data programming (ActDP), which uses DP and AL for ECG classification from single-lead data. ECG classification is pivotal in cardiology and healthcare for diagnosing a broad spectrum of heart conditions and arrhythmias. To establish the usefulness of the proposed ActDP framework, experiments were conducted on the MIT-BIH dataset with 94,224 ECG beats. In this work, DP assigns a probabilistic label to each ECG beat using nine novel polar labelling functions and a generative model. AL then improves on the DP result by replacing the generative model's labels for sampled ECG beats with ground truth, and a discriminative model is trained on these labels in each iteration. The experimental results show that incorporating AL into DP in the ActDP framework increases ECG classification accuracy from 85.7% to 97.34% over 58 iterations. The proposed framework (ActDP) thus achieves a higher classification accuracy (97.34%) than DP with data augmentation (DA, 92.2%), DP without DA (85.7%), majority vote (50.2%), and the generative model alone (66.5%).","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":"132 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141674077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
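The DP-then-AL loop described in the ActDP abstract above can be sketched in a few lines. This is a toy illustration under loud assumptions: the "ECG beats" are random scalars, the three labelling functions are arbitrary threshold heuristics (not the paper's nine polar labelling functions), the generative model is approximated by averaging non-abstaining votes, and the active-learning oracle simply reveals ground truth for the most uncertain beats.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "ECG beats": one feature per beat; the true label is its sign (stand-in only).
n = 300
x = rng.normal(size=n)
y_true = (x > 0).astype(int)

# Three hypothetical labelling functions (LFs): noisy heuristics that may abstain (-1).
def lf1(v): return int(v > 0.2) if abs(v) > 0.2 else -1
def lf2(v): return int(v > -0.1)
def lf3(v): return int(v > 0.5) if abs(v) > 0.1 else -1

L = np.array([[lf(v) for lf in (lf1, lf2, lf3)] for v in x])

# Data programming step (simplified): the probabilistic label is the mean of the
# non-abstaining LF votes; a true generative model would instead weight LFs by
# their estimated accuracies.
def dp_labels(L):
    votes = np.ma.masked_equal(L, -1)
    return votes.mean(axis=1).filled(0.5)  # 0.5 when every LF abstains

probs = dp_labels(L)
labels = (probs > 0.5).astype(int)

# Active learning step: query ground truth for the most uncertain beats
# (probabilities closest to 0.5) and overwrite the DP labels with it.
def al_round(labels, probs, budget=30):
    uncertain = np.argsort(np.abs(probs - 0.5))[:budget]
    labels = labels.copy()
    labels[uncertain] = y_true[uncertain]  # oracle annotation
    return labels

acc_dp = (labels == y_true).mean()
labels_al = al_round(labels, probs)
acc_actdp = (labels_al == y_true).mean()
```

In the paper, this query-and-overwrite step repeats for 58 iterations, with a discriminative model retrained on the refreshed labels each time; the sketch shows a single round.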