Hierarchical Bayesian pharmacometrics analysis of Baclofen for alcohol use disorder
Pub Date: 2023-09-04 | DOI: 10.1088/2632-2153/acf6aa
Nina Baldy, Nicolas Simon, Viktor Jirsa, Meysam Hashemi
Alcohol use disorder (AUD), also called alcohol dependence, is a major public health problem, affecting almost 10% of the world’s population. Baclofen, a selective GABAB receptor agonist, has emerged as a promising drug for the treatment of AUD. However, the inter-trial, inter-individual and residual variability in drug concentration over time in a population of patients with AUD is unknown. In this study, we use a hierarchical Bayesian workflow to estimate the parameters of a pharmacokinetic (PK) population model of Baclofen administration in patients with AUD. By monitoring various convergence diagnostics, the probabilistic methodology is first validated on synthetic longitudinal datasets and then applied to infer the PK model parameters from clinical data retrospectively collected from outpatients treated with oral Baclofen. We show that state-of-the-art advances in automatic Bayesian inference using self-tuning Hamiltonian Monte Carlo (HMC) algorithms provide accurate and decisive predictions of Baclofen plasma concentration at both the individual and group levels. Importantly, leveraging the information in the prior provides faster computation, better convergence diagnostics, and substantially higher out-of-sample prediction accuracy. Moreover, the root mean squared error, as a measure of within-sample predictive accuracy, can be misleading for model evaluation, whereas fully Bayesian information criteria correctly select the true data-generating parameters. This study demonstrates the capability of non-parametric Bayesian estimation using adaptive HMC sampling methods for easy and reliable estimation in clinical settings to optimize dosing regimens and efficiently treat AUD.
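The workflow above, a population PK model with group-level and subject-level parameters fitted by self-tuning HMC, can be sketched in a few lines of Python. This is a minimal illustration only: it assumes a one-compartment model with first-order absorption, synthetic data for three subjects, and PyMC's NUTS sampler; the paper's actual model structure, priors, and data differ.

```python
import numpy as np
import pymc as pm

# toy longitudinal design: 5 sampling times (h) for each of 3 subjects
t = np.tile([0.5, 1.0, 2.0, 4.0, 8.0], 3)
subj = np.repeat(np.arange(3), 5)
dose = 20.0  # mg, single oral dose (illustrative)

# simulate synthetic observations from known "true" parameters
rng = np.random.default_rng(0)
ka_t, ke_t, V_t = 1.2, 0.2, 30.0
y = dose * ka_t / (V_t * (ka_t - ke_t)) * (np.exp(-ke_t * t) - np.exp(-ka_t * t))
y += rng.normal(0.0, 0.02, size=y.size)

with pm.Model():
    # group-level (population) parameters
    mu_ka = pm.LogNormal("mu_ka", mu=0.0, sigma=0.5)   # absorption rate (1/h)
    mu_ke = pm.LogNormal("mu_ke", mu=-1.5, sigma=0.5)  # elimination rate (1/h)
    mu_V = pm.LogNormal("mu_V", mu=3.4, sigma=0.5)     # volume of distribution (L)
    # subject-level parameters (inter-individual variability)
    ka = pm.LogNormal("ka", mu=pm.math.log(mu_ka), sigma=0.3, shape=3)
    ke = pm.LogNormal("ke", mu=pm.math.log(mu_ke), sigma=0.3, shape=3)
    V = pm.LogNormal("V", mu=pm.math.log(mu_V), sigma=0.3, shape=3)
    # one-compartment, first-order-absorption concentration curve
    conc = dose * ka[subj] / (V[subj] * (ka[subj] - ke[subj])) * (
        pm.math.exp(-ke[subj] * t) - pm.math.exp(-ka[subj] * t)
    )
    sigma = pm.HalfNormal("sigma", 1.0)  # residual variability
    pm.Normal("obs", mu=conc, sigma=sigma, observed=y)
    idata = pm.sample(1000, tune=1000)   # NUTS, a self-tuning HMC variant
```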
{"title":"Hierarchical Bayesian pharmacometrics analysis of Baclofen for alcohol use disorder","authors":"Nina Baldy, Nicolas Simon, Viktor Jirsa, Meysam Hashemi","doi":"10.1088/2632-2153/acf6aa","DOIUrl":"https://doi.org/10.1088/2632-2153/acf6aa","url":null,"abstract":"Alcohol use disorder (AUD), also called alcohol dependence, is a major public health problem, affecting almost 10% of the world’s population. Baclofen, as a selective GABAB receptor agonist, has emerged as a promising drug for the treatment of AUD. However, the inter-trial, inter-individual and residual variability in drug concentration over time in a population of patients with AUD is unknown. In this study, we use a hierarchical Bayesian workflow to estimate the parameters of a pharmacokinetic (PK) population model from Baclofen administration to patients with AUD. By monitoring various convergence diagnostics, the probabilistic methodology is first validated on synthetic longitudinal datasets and then applied to infer the PK model parameters based on the clinical data that were retrospectively collected from outpatients treated with oral Baclofen. We show that state-of-the-art advances in automatic Bayesian inference using self-tuning Hamiltonian Monte Carlo (HMC) algorithms provide accurate and decisive predictions on Baclofen plasma concentration at both individual and group levels. Importantly, leveraging the information in prior provides faster computation, better convergence diagnostics, and substantially higher out-of-sample prediction accuracy. Moreover, the root mean squared error as a measure of within-sample predictive accuracy can be misleading for model evaluation, whereas the fully Bayesian information criteria correctly select the true data generating parameters. This study points out the capability of non-parametric Bayesian estimation using adaptive HMC sampling methods for easy and reliable estimation in clinical settings to optimize dosing regimens and efficiently treat AUD.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47187991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data-driven modeling of noise time series with convolutional generative adversarial networks
Pub Date: 2023-09-01 | DOI: 10.1088/2632-2153/acee44 | Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10484071/pdf/
Adam Wunderlich, Jack Sklar
Random noise arising from physical processes is an inherent characteristic of measurements and a limiting factor for most signal processing and data analysis tasks. Given the recent interest in generative adversarial networks (GANs) for data-driven modeling, it is important to determine to what extent GANs can faithfully reproduce noise in target data sets. In this paper, we present an empirical investigation that aims to shed light on this issue for time series. Namely, we assess two general-purpose GANs for time series that are based on the popular deep convolutional GAN architecture: a direct time-series model and an image-based model that uses a short-time Fourier transform data representation. The GAN models are trained and quantitatively evaluated using distributions of simulated noise time series with known ground-truth parameters. Target time series distributions include a broad range of noise types commonly encountered in physical measurements, electronics, and communication systems: band-limited thermal noise, power law noise, shot noise, and impulsive noise. We find that GANs are capable of learning many noise types, although they predictably struggle when the GAN architecture is not well suited to some aspects of the noise, e.g. impulsive time series with extreme outliers. Our findings provide insights into the capabilities and potential limitations of current approaches to time-series GANs and highlight areas for further research. In addition, our battery of tests provides a useful benchmark to aid the development of deep generative models for time series.
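As a concrete example of such ground-truth targets, power-law noise with a known spectral exponent can be simulated by spectrally shaping white Gaussian noise. This is a standard recipe, not the paper's simulation code; the function name and unit-variance normalization are illustrative choices.

```python
import numpy as np

def powerlaw_noise(n, alpha, rng):
    """Gaussian noise with power spectral density S(f) ~ 1/f**alpha."""
    spec = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n)
    f[0] = f[1]                      # avoid division by zero at DC
    spec *= f ** (-alpha / 2.0)      # shape the amplitude spectrum
    x = np.fft.irfft(spec, n)
    return x / x.std()               # normalize to unit variance

rng = np.random.default_rng(42)
pink = powerlaw_noise(4096, alpha=1.0, rng=rng)   # pink (1/f) noise
red = powerlaw_noise(4096, alpha=2.0, rng=rng)    # red/Brownian (1/f^2) noise
```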
{"title":"Data-driven modeling of noise time series with convolutional generative adversarial networks.","authors":"Adam Wunderlich, Jack Sklar","doi":"10.1088/2632-2153/acee44","DOIUrl":"10.1088/2632-2153/acee44","url":null,"abstract":"<p><p>Random noise arising from physical processes is an inherent characteristic of measurements and a limiting factor for most signal processing and data analysis tasks. Given the recent interest in generative adversarial networks (GANs) for data-driven modeling, it is important to determine to what extent GANs can faithfully reproduce noise in target data sets. In this paper, we present an empirical investigation that aims to shed light on this issue for time series. Namely, we assess two general-purpose GANs for time series that are based on the popular deep convolutional GAN architecture, a direct time-series model and an image-based model that uses a short-time Fourier transform data representation. The GAN models are trained and quantitatively evaluated using distributions of simulated noise time series with known ground-truth parameters. Target time series distributions include a broad range of noise types commonly encountered in physical measurements, electronics, and communication systems: band-limited thermal noise, power law noise, shot noise, and impulsive noise. We find that GANs are capable of learning many noise types, although they predictably struggle when the GAN architecture is not well suited to some aspects of the noise, e.g. impulsive time-series with extreme outliers. Our findings provide insights into the capabilities and potential limitations of current approaches to time-series GANs and highlight areas for further research. In addition, our battery of tests provides a useful benchmark to aid the development of deep generative models for time series.</p>","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"4 3","pages":""},"PeriodicalIF":6.3,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10484071/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10605697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interpretable delta-learning of GW quasiparticle energies from GGA-DFT
Pub Date: 2023-08-30 | DOI: 10.1088/2632-2153/acf545
A. Fediai, Patrick Reiser, Jorge Enrique Olivares Peña, W. Wenzel, Pascal Friederich
Accurate prediction of the ionization potential and electron affinity energies of small molecules is important for many applications. Density functional theory (DFT) is computationally inexpensive, but can be very inaccurate for frontier orbital energies or ionization energies. The GW method is sufficiently accurate for many relevant applications, but much more expensive than DFT. Here we study how we can learn to predict orbital energies with GW accuracy by applying machine learning (ML) to molecular graphs and fingerprints in an interpretable delta-learning approach. The ML models presented here can be used to predict quasiparticle energies of small organic molecules even beyond the size of the molecules used for training. We furthermore analyze the learned DFT-to-GW corrections by mapping them to specific localized fragments of the molecules, in order to develop an intuitive interpretation of the learned corrections, and thus to better understand DFT errors.
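The delta-learning pattern regresses the correction between the cheap and the accurate method rather than the accurate energies themselves. A hedged sketch, assuming precomputed fingerprints and energies stored in hypothetical .npy files and a kernel ridge regressor (the paper's descriptors and model may differ):

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

# hypothetical precomputed inputs: one row per molecule
X = np.load("fingerprints.npy")   # (n_molecules, n_bits) molecular fingerprints
e_dft = np.load("e_dft.npy")      # GGA-DFT frontier-orbital energies (eV)
e_gw = np.load("e_gw.npy")        # reference GW quasiparticle energies (eV)

# learn only the DFT-to-GW correction (the "delta")
model = KernelRidge(kernel="laplacian", alpha=1e-3, gamma=1e-3)
model.fit(X, e_gw - e_dft)

# prediction = cheap DFT value + learned correction, approximating GW
e_pred = e_dft + model.predict(X)
```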
{"title":"Interpretable delta-learning of GW quasiparticle energies from GGA-DFT","authors":"A. Fediai, Patrick Reiser, Jorge Enrique Olivares Peña, W. Wenzel, Pascal Friederich","doi":"10.1088/2632-2153/acf545","DOIUrl":"https://doi.org/10.1088/2632-2153/acf545","url":null,"abstract":"Accurate prediction of the ionization potential and electron affinity energies of small molecules are important for many applications. Density functional theory (DFT) is computationally inexpensive, but can be very inaccurate for frontier orbital energies or ionization energies. The GW method is sufficiently accurate for many relevant applications, but much more expensive than DFT. Here we study how we can learn to predict orbital energies with GW accuracy using machine learning (ML) on molecular graphs and fingerprints using an interpretable delta-learning approach. ML models presented here can be used to predict quasiparticle energies of small organic molecules even beyond the size of the molecules used for training. We furthermore analyze the learned DFT-to-GW corrections by mapping them to specific localized fragments of the molecules, in order to develop an intuitive interpretation of the learned corrections, and thus to better understand DFT errors.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47580192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conditioning Boltzmann generators for rare event sampling
Pub Date: 2023-08-30 | DOI: 10.1088/2632-2153/acf55c
S. Falkner, Alessandro Coretti, Salvatore Romano, P. Geissler, C. Dellago
Understanding the dynamics of complex molecular processes is often linked to the study of infrequent transitions between long-lived stable states. The standard approach to sampling such rare events is to generate an ensemble of transition paths using a random walk in trajectory space. This, however, comes with the drawback of strong correlations between subsequently sampled paths and with an intrinsic difficulty in parallelizing the sampling process. We propose a transition path sampling scheme based on neural-network-generated configurations. These are obtained by employing normalizing flows, a class of neural networks able to generate statistically independent samples from a given distribution. With this approach, not only are correlations between visited paths removed, but the sampling process becomes easily parallelizable. Moreover, by conditioning the normalizing flow, the sampling of configurations can be steered towards regions of interest. We show that this approach enables the resolution of both the thermodynamics and kinetics of the transition region for systems that can be sampled using exact-likelihood generative models.
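The exact-likelihood property this relies on can be illustrated with the basic Boltzmann-generator reweighting step: proposals drawn from a flow are importance-weighted to the target Boltzmann distribution. The sketch below shows only this reweighting idea, not the conditioning scheme itself; `flow` (assumed to expose sample() and log_prob() methods) and `potential_energy` are hypothetical stand-ins.

```python
import torch

def boltzmann_weights(flow, potential_energy, n_samples, beta=1.0):
    """Draw independent proposals from a flow and reweight them to exp(-beta*U)."""
    x = flow.sample((n_samples,))                 # statistically independent samples
    log_q = flow.log_prob(x)                      # exact likelihood of each proposal
    log_w = -beta * potential_energy(x) - log_q   # log importance weights
    return x, torch.softmax(log_w, dim=0)         # self-normalized weights
```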
{"title":"Conditioning Boltzmann generators for rare event sampling","authors":"S. Falkner, Alessandro Coretti, Salvatore Romano, P. Geissler, C. Dellago","doi":"10.1088/2632-2153/acf55c","DOIUrl":"https://doi.org/10.1088/2632-2153/acf55c","url":null,"abstract":"Understanding the dynamics of complex molecular processes is often linked to the study of infrequent transitions between long-lived stable states. The standard approach to the sampling of such rare events is to generate an ensemble of transition paths using a random walk in trajectory space. This, however, comes with the drawback of strong correlations between subsequently sampled paths and with an intrinsic difficulty in parallelizing the sampling process. We propose a transition path sampling scheme based on neural-network generated configurations. These are obtained employing normalizing flows, a neural network class able to generate statistically independent samples from a given distribution. With this approach, not only are correlations between visited paths removed, but the sampling process becomes easily parallelizable. Moreover, by conditioning the normalizing flow, the sampling of configurations can be steered towards regions of interest. We show that this approach enables the resolution of both the thermodynamics and kinetics of the transition region for systems that can be sampled using exact-likelihood generative models.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49374979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DIM: long-tailed object detection and instance segmentation via dynamic instance memory
Pub Date: 2023-08-23 | DOI: 10.1088/2632-2153/acf362
Zhao-Min Chen, Xin Jin, Xiaoqin Zhang, C. Xia, Zhiyong Pan, Ruoxi Deng, Jie Hu, Heng Chen
Object detection and instance segmentation have been successful on benchmarks with relatively balanced category distributions (e.g. MSCOCO). However, state-of-the-art object detection and segmentation methods still struggle to generalize on long-tailed datasets (e.g. LVIS), where a few classes (head classes) dominate the instance samples, while most classes (tailed classes) have only a few samples. To address this challenge, we propose a plug-and-play module within the Mask R-CNN framework called dynamic instance memory (DIM). Specifically, we augment Mask R-CNN with an auxiliary branch for training. It maintains a dynamic memory bank storing an instance-level prototype representation for each category, and shares the classifier with the existing instance branch. With a simple metric loss, the representations in DIM are dynamically updated by the instance proposals in the mini-batch during training. Together with a class-frequency-reversed sampler, DIM introduces a bias toward tailed classes into classifier learning, complementing the existing instance branch, which learns generalizable representations from the original data distribution. Comprehensive experiments on LVIS demonstrate the effectiveness of DIM, as well as its significant advantages over the baseline Mask R-CNN.
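A hedged sketch of the memory mechanism: per-class prototypes updated by momentum from mini-batch instance features, plus a simple metric loss pulling features toward their class prototype. The momentum value and the cosine-based loss are assumptions, not the paper's exact choices.

```python
import torch
import torch.nn.functional as F

class DynamicInstanceMemory:
    """Memory bank of per-class instance prototypes with momentum updates."""

    def __init__(self, num_classes, dim, momentum=0.9):
        self.protos = torch.zeros(num_classes, dim)
        self.momentum = momentum

    @torch.no_grad()
    def update(self, feats, labels):
        # momentum update of each class prototype from this mini-batch
        for c in labels.unique():
            mean = feats[labels == c].mean(dim=0)
            self.protos[c] = self.momentum * self.protos[c] + (1 - self.momentum) * mean

    def metric_loss(self, feats, labels):
        # pull each instance feature toward its class prototype
        return (1.0 - F.cosine_similarity(feats, self.protos[labels])).mean()
```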
{"title":"DIM: long-tailed object detection and instance segmentation via dynamic instance memory","authors":"Zhao-Min Chen, Xin Jin, Xiaoqin Zhang, C. Xia, Zhiyong Pan, Ruoxi Deng, Jie Hu, Heng Chen","doi":"10.1088/2632-2153/acf362","DOIUrl":"https://doi.org/10.1088/2632-2153/acf362","url":null,"abstract":"Object detection and instance segmentation have been successful on benchmarks with relatively balanced category distribution (e.g. MSCOCO). However, state-of-the-art object detection and segmentation methods still struggle to generalize on long-tailed datasets (e.g. LVIS), where a few classes (head classes) dominate the instance samples, while most classes (tailed classes) have only a few samples. To address this challenge, we propose a plug-and-play module within the Mask R-CNN framework called dynamic instance memory (DIM). Specifically, we augment Mask R-CNN with an auxiliary branch for training. It maintains a dynamic memory bank storing an instance-level prototype representation for each category, and shares the classifier with the existing instance branch. With a simple metric loss, the representations in DIM can be dynamically updated by the instance proposals in the mini-batch during training. Our DIM introduces a bias toward tailed classes to the classifier learning along with a class frequency reversed sampler, which learns generalizable representations from the original data distribution, complementing the existing instance branch. Comprehensive experiments on LVIS demonstrate the effectiveness of DIM, as well as the significant advantages of DIM over the baseline Mask R-CNN.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46649535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Physics-informed neural networks for modeling astrophysical shocks
Pub Date: 2023-08-16 | DOI: 10.1088/2632-2153/acf116
S. Moschou, Elliot Hicks, Rishi Parekh, Dhruv Mathew, Shoumik Majumdar, N. Vlahakis
Physics-informed neural networks (PINNs) are machine learning models that integrate data-based learning with partial differential equations (PDEs). In this work, for the first time we extend PINNs to model the numerically challenging case of astrophysical shock waves in the presence of a stellar gravitational field. Notably, PINNs suffer from competing losses during gradient descent that can lead to poor performance, especially in physical setups involving multiple scales, which is the case for shocks in the gravitationally stratified solar atmosphere. We applied PINNs in three different setups, ranging from modeling astrophysical shocks in cases with little or no data to data-intensive cases. Namely, we used PINNs (a) to determine the effective polytropic index controlling the heating mechanism of the space plasma within 1% error, (b) to quantitatively show that data assimilation is seamless in PINNs and small amounts of data can significantly increase the model’s accuracy, and (c) to solve the forward time-dependent problem for different temporal horizons. We addressed the poor performance of PINNs through an effective normalization approach by reformulating the fluid dynamics PDE system to absorb the gravity-caused variability. This led to a huge improvement in the overall model performance, with the density accuracy improving between 2 and 16 times. Finally, we present a detailed critique of the strengths and drawbacks of PINNs in tackling realistic physical problems in astrophysics and conclude that PINNs can be a powerful complementary modeling approach to classical fluid dynamics solvers.
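The competing-loss structure at the heart of PINN training can be made concrete with a toy example. The sketch below assembles a data term and a PDE-residual term for a 1D advection equation u_t + c*u_x = 0, standing in for the paper's stratified fluid-dynamics system; the equation and the weight w_pde are illustrative assumptions.

```python
import torch

def pinn_loss(model, xt_data, u_data, xt_col, w_pde=1.0, c=1.0):
    """Data misfit plus PDE residual for a toy advection equation u_t + c*u_x = 0."""
    # data term: fit the network to observed solution values
    loss_data = ((model(xt_data).squeeze() - u_data) ** 2).mean()
    # physics term: PDE residual at collocation points via autograd
    xt = xt_col.clone().requires_grad_(True)      # columns: (x, t)
    u = model(xt).squeeze()
    grads = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = grads[:, 0], grads[:, 1]
    loss_pde = ((u_t + c * u_x) ** 2).mean()
    # the two competing objectives discussed above
    return loss_data + w_pde * loss_pde
```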
{"title":"Physics-informed neural networks for modeling astrophysical shocks","authors":"S. Moschou, Elliot Hicks, Rishi Parekh, Dhruv Mathew, Shoumik Majumdar, N. Vlahakis","doi":"10.1088/2632-2153/acf116","DOIUrl":"https://doi.org/10.1088/2632-2153/acf116","url":null,"abstract":"Physics-informed neural networks (PINNs) are machine learning models that integrate data-based learning with partial differential equations (PDEs). In this work, for the first time we extend PINNs to model the numerically challenging case of astrophysical shock waves in the presence of a stellar gravitational field. Notably, PINNs suffer from competing losses during gradient descent that can lead to poor performance especially in physical setups involving multiple scales, which is the case for shocks in the gravitationally stratified solar atmosphere. We applied PINNs in three different setups ranging from modeling astrophysical shocks in cases with no or little data to data-intensive cases. Namely, we used PINNs (a) to determine the effective polytropic index controlling the heating mechanism of the space plasma within 1% error, (b) to quantitatively show that data assimilation is seamless in PINNs and small amounts of data can significantly increase the model’s accuracy, and (c) to solve the forward time-dependent problem for different temporal horizons. We addressed the poor performance of PINNs through an effective normalization approach by reformulating the fluid dynamics PDE system to absorb the gravity-caused variability. This led to a huge improvement in the overall model performance with the density accuracy improving between 2 and 16 times. Finally, we present a detailed critique on the strengths and drawbacks of PINNs in tackling realistic physical problems in astrophysics and conclude that PINNs can be a powerful complimentary modeling approach to classical fluid dynamics solvers.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41443900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A study of transfer learning in digital rock properties measurement
Pub Date: 2023-08-16 | DOI: 10.1088/2632-2153/acf117
M. I. K. Haq, I. Yulita, I. A. Dharmawan
The measurement of the physical parameters of the porous rocks that constitute reservoirs is an essential part of hydrocarbon exploration. Typically, these physical parameters are measured through core analysis in a laboratory, which requires considerable time and high costs. Another approach involves digital rock models, where the physical parameters are calculated through image processing and numerical simulations. However, this method also requires a significant amount of time to estimate the physical parameters of each rock sample. Machine learning, specifically convolutional neural network (CNN) algorithms, has been developed as an alternative method for estimating the physical parameters of porous rock in a shorter time frame. The advancement of CNNs, particularly through transfer learning with pre-trained models, has enabled rapid prediction. However, not all pre-trained models are suitable for estimating the physical parameters of porous rock. In this study, transfer learning was applied to estimate parameters of sandstones such as porosity, specific surface area, average grain size, average coordination number, and average throat radius. Six pre-trained models were evaluated: ResNet152, DenseNet201, Xception, InceptionV3, InceptionResNetV2, and MobileNetV2. The results indicate that the DenseNet201 model achieved the best performance, with an error rate of 2.11%. Overall, this study highlights the potential of transfer learning to enable more efficient and effective computation.
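A hedged sketch of this transfer-learning setup, assuming Keras with an ImageNet-pretrained DenseNet201 as a frozen feature extractor and a small regression head for the five rock properties; the input size, head architecture, and training details are assumptions, not the study's configuration.

```python
import tensorflow as tf

# ImageNet-pretrained backbone, used as a frozen feature extractor
base = tf.keras.applications.DenseNet201(
    include_top=False, weights="imagenet",
    input_shape=(224, 224, 3), pooling="avg",
)
base.trainable = False

# small regression head for the five rock properties
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(5),  # porosity, surface area, grain size, coordination, throat radius
])
model.compile(optimizer="adam", loss="mse")
# model.fit(rock_images, rock_properties, ...)  # hypothetical training data
```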
{"title":"A study of transfer learning in digital rock properties measurement","authors":"M. I. K. Haq, I. Yulita, I. A. Dharmawan","doi":"10.1088/2632-2153/acf117","DOIUrl":"https://doi.org/10.1088/2632-2153/acf117","url":null,"abstract":"The measurement of physical parameters of porous rock, which constitute reservoirs, is an essential part of hydrocarbon exploration. Typically, the measurement of these physical parameters is carried out through core analysis in a laboratory, which requires considerable time and high costs. Another approach involves using digital rock models, where the physical parameters are calculated through image processing and numerical simulations. However, this method also requires a significant amount of time for estimating the physical parameters of each rock sample. Machine learning, specifically convolutional neural network (CNN) algorithms, has been developed as an alternative method for estimating the physical parameters of porous rock in a shorter time frame. The advancement of CNN, particularly through transfer learning using pre-trained models, has contributed to rapid prediction capabilities. However, not all pre-trained models are suitable for estimating the physical parameters of porous rock. In this study, transfer learning was applied to estimate parameters of sandstones such as porosity, specific surface area, average grain size, average coordination number, and average throat radius. Six types of pre-trained models were utilized: ResNet152, DenseNet201, Xception, InceptionV3, InceptionResNetV2, and MobileNetV2. The results of this study indicate that the DenseNet201 model achieved the best performance with an error rate of 2.11%. Overall, this study highlights the potential of transfer learning to ultimately lead to more efficient and effective computation.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41584758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated gadget discovery in the quantum domain
Pub Date: 2023-08-15 | DOI: 10.1088/2632-2153/acf098
Lea M. Trenkwalder, Andrea López-Incera, Hendrik Poulsen Nautrup, Fulvio Flamini, H. Briegel
In recent years, reinforcement learning (RL) has become increasingly successful in its application to the quantum domain and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever more challenging. In this work, we gain insights into an RL agent’s learned behavior through a post-hoc analysis based on sequence mining and clustering. Specifically, frequent and compact subroutines, used by the agent to solve a given task, are distilled as gadgets and then grouped by various metrics. This process of gadget discovery develops in three stages: first, we use an RL agent to generate data; then, we employ a mining algorithm to extract gadgets; and finally, the obtained gadgets are grouped by a density-based clustering algorithm. We demonstrate our method by applying it to two quantum-inspired RL environments. First, we consider simulated quantum optics experiments for the design of high-dimensional multipartite entangled states, where the algorithm finds gadgets that correspond to modern interferometer setups. Second, we consider a circuit-based quantum computing environment, where the algorithm discovers various gadgets for quantum information processing, such as quantum teleportation. This approach for analyzing the policy of a learned agent is agent- and environment-agnostic and can yield interesting insights into any agent’s policy.
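The three-stage pipeline can be outlined compactly. In the sketch below, episodes are lists of discrete action IDs, the mining step is a simple frequent n-gram count standing in for the paper's sequence-mining algorithm, and `featurize` is a hypothetical mapping from a gadget to a feature vector for DBSCAN.

```python
from collections import Counter
import numpy as np
from sklearn.cluster import DBSCAN

def mine_gadgets(episodes, n=3, min_support=5):
    """Stage 2: extract frequent length-n action subsequences ("gadgets")."""
    counts = Counter(
        tuple(ep[i:i + n]) for ep in episodes for i in range(len(ep) - n + 1)
    )
    return [gadget for gadget, c in counts.items() if c >= min_support]

def cluster_gadgets(gadgets, featurize, eps=0.5):
    """Stage 3: group gadgets by density-based clustering on feature vectors."""
    X = np.array([featurize(g) for g in gadgets])
    return DBSCAN(eps=eps, min_samples=2).fit_predict(X)  # label -1 marks noise
```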
{"title":"Automated gadget discovery in the quantum domain","authors":"Lea M. Trenkwalder, Andrea López-Incera, Hendrik Poulsen Nautrup, Fulvio Flamini, H. Briegel","doi":"10.1088/2632-2153/acf098","DOIUrl":"https://doi.org/10.1088/2632-2153/acf098","url":null,"abstract":"In recent years, reinforcement learning (RL) has become increasingly successful in its application to the quantum domain and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever more challenging. In this work, we gain insights into an RL agent’s learned behavior through a post-hoc analysis based on sequence mining and clustering. Specifically, frequent and compact subroutines, used by the agent to solve a given task, are distilled as gadgets and then grouped by various metrics. This process of gadget discovery develops in three stages: First, we use an RL agent to generate data, then, we employ a mining algorithm to extract gadgets and finally, the obtained gadgets are grouped by a density-based clustering algorithm. We demonstrate our method by applying it to two quantum-inspired RL environments. First, we consider simulated quantum optics experiments for the design of high-dimensional multipartite entangled states where the algorithm finds gadgets that correspond to modern interferometer setups. Second, we consider a circuit-based quantum computing environment where the algorithm discovers various gadgets for quantum information processing, such as quantum teleportation. This approach for analyzing the policy of a learned agent is agent and environment agnostic and can yield interesting insights into any agent’s policy.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44342855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep learning-based detection and identification of brain tumor biomarkers in quantitative MR-images
Pub Date: 2023-08-15 | DOI: 10.1088/2632-2153/acf095
I. Tampu, N. Haj-Hosseini, I. Blystad, A. Eklund
The infiltrative nature of malignant gliomas results in active tumor spreading into the peritumoral edema, which is not visible in conventional magnetic resonance imaging (cMRI) even after contrast injection. MR relaxometry (qMRI) measures relaxation rates dependent on tissue properties and can offer additional contrast mechanisms to highlight the non-enhancing infiltrative tumor. To investigate if qMRI data provides additional information compared to cMRI sequences when considering deep learning-based brain tumor detection and segmentation, preoperative conventional (T1w pre- and post-contrast, T2w and FLAIR) and quantitative (pre- and post-contrast R1, R2 and proton density) MR data was obtained from 23 patients with typical radiological findings suggestive of a high-grade glioma. 2D deep learning models were trained on transversal slices (n = 528) for tumor detection and segmentation using either cMRI or qMRI. Moreover, trends in quantitative R1 and R2 rates of regions identified as relevant for tumor detection by model explainability methods were qualitatively analyzed. Tumor detection and segmentation performance was highest for models trained with a combination of qMRI pre- and post-contrast (detection Matthews correlation coefficient (MCC) = 0.72, segmentation Dice similarity coefficient (DSC) = 0.90); however, the difference compared to cMRI was not statistically significant. Overall analysis of the relevant regions identified using model explainability showed no differences between models trained on cMRI or qMRI. When looking at the individual cases, relaxation rates of brain regions outside the annotation that were identified as relevant for tumor detection exhibited changes after contrast injection similar to regions inside the annotation in the majority of cases. In conclusion, models trained on qMRI data obtained similar detection and segmentation performance to those trained on cMRI data, with the advantage of quantitatively measuring brain tissue properties within a similar scan time. When considering individual patients, the analysis of relaxation rates of regions identified by model explainability suggests the presence of infiltrative tumor outside the cMRI-based tumor annotation.
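For reference, the two reported metrics can be computed as follows. This is a minimal sketch with toy binary labels and masks, not the study's evaluation code.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef

def dice(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

# detection scored per slice (tumor present / absent), toy labels
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0])
print("MCC:", matthews_corrcoef(y_true, y_pred))
print("DSC:", dice(y_pred, y_true))
```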
{"title":"Deep learning-based detection and identification of brain tumor biomarkers in quantitative MR-images","authors":"I. Tampu, N. Haj-Hosseini, I. Blystad, A. Eklund","doi":"10.1088/2632-2153/acf095","DOIUrl":"https://doi.org/10.1088/2632-2153/acf095","url":null,"abstract":"The infiltrative nature of malignant gliomas results in active tumor spreading into the peritumoral edema, which is not visible in conventional magnetic resonance imaging (cMRI) even after contrast injection. MR relaxometry (qMRI) measures relaxation rates dependent on tissue properties and can offer additional contrast mechanisms to highlight the non-enhancing infiltrative tumor. To investigate if qMRI data provides additional information compared to cMRI sequences when considering deep learning-based brain tumor detection and segmentation, preoperative conventional (T1w per- and post-contrast, T2w and FLAIR) and quantitative (pre- and post-contrast R1, R2 and proton density) MR data was obtained from 23 patients with typical radiological findings suggestive of a high-grade glioma. 2D deep learning models were trained on transversal slices (n = 528) for tumor detection and segmentation using either cMRI or qMRI. Moreover, trends in quantitative R1 and R2 rates of regions identified as relevant for tumor detection by model explainability methods were qualitatively analyzed. Tumor detection and segmentation performance for models trained with a combination of qMRI pre- and post-contrast was the highest (detection Matthews correlation coefficient (MCC) = 0.72, segmentation dice similarity coefficient (DSC) = 0.90), however, the difference compared to cMRI was not statistically significant. Overall analysis of the relevant regions identified using model explainability showed no differences between models trained on cMRI or qMRI. When looking at the individual cases, relaxation rates of brain regions outside the annotation and identified as relevant for tumor detection exhibited changes after contrast injection similar to region inside the annotation in the majority of cases. In conclusion, models trained on qMRI data obtained similar detection and segmentation performance to those trained on cMRI data, with the advantage of quantitatively measuring brain tissue properties within a similar scan time. When considering individual patients, the analysis of relaxation rates of regions identified by model explainability suggests the presence of infiltrative tumor outside the cMRI-based tumor annotation.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43076358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving resilience of sensors in planetary exploration using data-driven models
Pub Date: 2023-08-11 | DOI: 10.1088/2632-2153/acefaa
Dileep Kumar, M. Dominguez-Pumar, Elisa Sayrol-Clols, J. Torres, M. Marín, J. Gómez-Elvira, L. Mora, S. Navarro, J. Rodriguez-Manfredi
Improving the resilience of sensor systems in space exploration is a key objective, since the environmental conditions to which they are exposed are very harsh. For example, it is known that the presence of flying debris and dust devils on the Martian surface can partially damage sensors on rovers and landers. The objective of this work is to show how data-driven methods can improve sensor resilience, particularly in the case of complex sensors with multiple intermediate variables feeding an inverse algorithm (IA) based on calibration data. The method considers three phases: an initial phase, in which the sensor is calibrated in the laboratory and an IA is designed; a second phase, in which the sensor is placed at its intended location and sensor data are used to train a data-driven model; and a third phase, once the model has been trained and partial damage is detected, in which the data-driven algorithm reduces errors. The proposed method is tested with the intermediate data of the wind sensor of the TWINS instrument (NASA InSight mission), which consists of two booms placed on the deck of the lander, with three boards per boom. Wind speed and angle are recovered from the intermediate variables provided by the sensor and predicted by the proposed method. A comparative analysis of various data-driven methods, including machine learning and deep learning (DL) methods, is carried out. It is shown that even a simple method such as k-nearest neighbors is capable of successfully recovering the missing data of a board, compared to complex DL models. Depending on the selected missing board, errors are reduced by a factor between 2.43 and 4.78 for horizontal velocity, and by a factor between 1.74 and 4.71 for angle, compared with the situation of using only the two remaining boards.
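The recovery idea, predicting a damaged board's channels from the two healthy boards, maps naturally onto a k-nearest-neighbours regressor fitted on pre-damage telemetry. The sketch below uses synthetic stand-in data; the array shapes, linear mixing, and train/test split are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# synthetic stand-in telemetry: two healthy boards predict the third
rng = np.random.default_rng(1)
X_boards12 = rng.normal(size=(1000, 8))           # channels of boards 1 and 2
y_board3 = X_boards12 @ rng.normal(size=(8, 4))   # channels of the board that fails

# fit on pre-damage data, then reconstruct the missing board's channels
knn = KNeighborsRegressor(n_neighbors=5)
knn.fit(X_boards12[:800], y_board3[:800])
board3_recovered = knn.predict(X_boards12[800:])
```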
{"title":"Improving resilience of sensors in planetary exploration using data-driven models","authors":"Dileep Kumar, M. Dominguez-Pumar, Elisa Sayrol-Clols, J. Torres, M. Marín, J. Gómez-Elvira, L. Mora, S. Navarro, J. Rodriguez-Manfredi","doi":"10.1088/2632-2153/acefaa","DOIUrl":"https://doi.org/10.1088/2632-2153/acefaa","url":null,"abstract":"Improving the resilience of sensor systems in space exploration is a key objective since the environmental conditions to which they are exposed are very harsh. For example, it is known that the presence of flying debris and Dust Devils on the Martian surface can partially damage sensors present in rovers/landers. The objective of this work is to show how data-driven methods can improve sensor resilience, particularly in the case of complex sensors, with multiple intermediate variables, feeding an inverse algorithm (IA) based on calibration data. The method considers three phases: an initial phase in which the sensor is calibrated in the laboratory and an IA is designed; a second phase, in which the sensor is placed at its intended location and sensor data is used to train data-driven model; and a third phase, once the model has been trained and partial damage is detected, in which the data-driven algorithm is reducing errors. The proposed method is tested with the intermediate data of the wind sensor of the TWINS instrument (NASA InSight mission), consisting of two booms placed on the deck of the lander, and three boards per boom. Wind speed and angle are recovered from the intermediate variables provided by the sensor and predicted by the proposed method. A comparative analysis of various data-driven methods including machine learning and deep learning (DL) methods is carried out for the proposed research. It is shown that even a simple method such as k-nearest neighbor is capable of successfully recovering missing data of a board compared to complex DL models. Depending on the selected missing board, errors are reduced by a factor between 2.43 and 4.78, for horizontal velocity; and by a factor between 1.74 and 4.71, for angle, compared with the situation of using only the two remaining boards.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.8,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47742489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}