Label-free timing analysis of SiPM-based modularized detectors with physics-constrained deep learning
Pub Date: 2023-10-30, DOI: 10.1088/2632-2153/acfd09
Pengcheng Ai, Le Xiao, Zhi Deng, Yi Wang, Xiangming Sun, Guangming Huang, Dong Wang, Yulei Li, Xinchi Ran
Abstract Pulse timing is an important topic in nuclear instrumentation, with far-reaching applications from high energy physics to radiation imaging. While high-speed analog-to-digital converters are becoming increasingly capable and accessible, their potential uses and merits in nuclear detector signal processing remain uncertain, partly because the associated timing algorithms are not fully understood and utilized. In this paper, we propose a novel deep-learning-based method for timing analysis of modularized detectors that does not require explicit labeling of event data. By taking advantage of the intrinsic time correlations, a label-free loss function with a specially designed regularizer is formed to supervise the training of neural networks (NNs) towards a meaningful and accurate mapping function. We mathematically demonstrate the existence of the optimal function desired by the method and give a systematic algorithm for training and calibration of the model. The proposed method is validated on two experimental datasets based on silicon photomultipliers as the main transducers. In the toy experiment, the NN model achieves a single-channel time resolution of 8.8 ps and exhibits robustness against concept drift in the dataset. In the electromagnetic calorimeter experiment, several NN models (fully connected, convolutional neural network, and long short-term memory) are tested to show their conformance to the underlying physical constraint and to judge their performance against traditional methods. Overall, the proposed method works well under both ideal and noisy experimental conditions and recovers the time information from waveform samples successfully and precisely.
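The core of the method is a loss that can be evaluated without per-event time labels. As a rough illustration of the idea only (the network architecture, the pairing of channels, and the variance regularizer below are assumptions made for this sketch, not the paper's exact formulation), one could train a small PyTorch regressor so that two channels observing the same event predict mutually consistent timestamps:

```python
import torch
import torch.nn as nn

class TimingNet(nn.Module):
    """Small fully-connected regressor mapping waveform samples to a timestamp."""
    def __init__(self, n_samples: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_samples, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

def label_free_timing_loss(t_a, t_b, reg_weight=0.1):
    """Hypothetical label-free loss: two channels see the same event, so their predicted
    times should agree up to a constant offset; the variance term is an illustrative
    regularizer that discourages the trivial constant solution."""
    diff = t_a - t_b
    consistency = ((diff - diff.mean()) ** 2).mean()   # agreement up to an offset
    regularizer = (1.0 - t_a.var()).clamp(min=0.0)     # keep outputs from collapsing
    return consistency + reg_weight * regularizer

# usage: wf_a, wf_b are batches of waveform samples from two coupled channels
model = TimingNet(n_samples=32)
wf_a, wf_b = torch.randn(128, 32), torch.randn(128, 32)
loss = label_free_timing_loss(model(wf_a), model(wf_b))
loss.backward()
```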
{"title":"Label-free timing analysis of SiPM-based modularized detectors with physics-constrained deep learning","authors":"Pengcheng Ai, Le Xiao, Zhi Deng, Yi Wang, Xiangming Sun, Guangming Huang, Dong Wang, Yulei Li, Xinchi Ran","doi":"10.1088/2632-2153/acfd09","DOIUrl":"https://doi.org/10.1088/2632-2153/acfd09","url":null,"abstract":"Abstract Pulse timing is an important topic in nuclear instrumentation, with far-reaching applications from high energy physics to radiation imaging. While high-speed analog-to-digital converters become more and more developed and accessible, their potential uses and merits in nuclear detector signal processing are still uncertain, partially due to associated timing algorithms which are not fully understood and utilized. In this paper, we propose a novel method based on deep learning for timing analysis of modularized detectors without explicit needs of labeling event data. By taking advantage of the intrinsic time correlations, a label-free loss function with a specially designed regularizer is formed to supervise the training of neural networks (NNs) towards a meaningful and accurate mapping function. We mathematically demonstrate the existence of the optimal function desired by the method, and give a systematic algorithm for training and calibration of the model. The proposed method is validated on two experimental datasets based on silicon photomultipliers as main transducers. In the toy experiment, the NN model achieves the single-channel time resolution of 8.8 ps and exhibits robustness against concept drift in the dataset. In the electromagnetic calorimeter experiment, several NN models (fully-connected, convolutional neural network and long short-term memory) are tested to show their conformance to the underlying physical constraint and to judge their performance against traditional methods. In total, the proposed method works well in either ideal or noisy experimental condition and recovers the time information from waveform samples successfully and precisely.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"23 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136018674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interpretable surrogate models to approximate the predictions of convolutional neural networks in glaucoma diagnosis
Pub Date: 2023-10-27, DOI: 10.1088/2632-2153/ad0798
Jose Sigut, Francisco José Fumero Batista, Rafael Arnay, José Estévez, Tinguaro Díaz-Alemán
Abstract
Background and objective
Deep learning systems, especially in critical fields like medicine, suffer from a significant drawback: their black-box nature, which lacks mechanisms for explaining or interpreting their decisions. In this regard, our research aims to evaluate the use of surrogate models for interpreting convolutional neural network decisions in glaucoma diagnosis. Our approach is novel in that we approximate the original model with an interpretable one and also change the input features, replacing pixels with tabular geometric features of the optic disc, cup, and neuroretinal rim.

Method
We trained convolutional neural networks with two types of images: original images of the optic nerve head and simplified images showing only the disc and cup contours on a uniform background. Decision trees were used as surrogate models due to their simplicity and visualization properties, while saliency maps were calculated for some images for comparison.

Results
The experiments, carried out with 1271 images of healthy subjects and 721 images of glaucomatous eyes, demonstrate that decision trees can closely approximate the predictions of neural networks trained on simplified contour images, with R-squared values near 0.9 for the VGG19, ResNet50, InceptionV3, and Xception architectures. Saliency maps proved difficult to interpret and showed inconsistent results across architectures, in contrast to the decision trees. Additionally, some decision trees trained as surrogate models outperformed a decision tree trained on the actual outcomes without surrogation.

Conclusions
Decision trees may be a more interpretable alternative to saliency methods. Moreover, matching the performance of a decision tree trained without surrogation with that of decision trees distilled from the neural networks is a great advantage, since decision trees are inherently interpretable. Therefore, based on our findings, we consider this approach the most recommendable choice for specialists as a diagnostic tool.
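As a minimal sketch of the surrogate idea (the feature matrix and CNN scores below are synthetic stand-ins, and the tree depth is an arbitrary choice, not the paper's configuration), a decision tree can be distilled from a trained CNN by regressing the network's outputs on the tabular geometric features and then reading the fidelity off the R-squared:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# hypothetical tabular geometric features of disc/cup/rim for 1271 + 721 = 1992 eyes
rng = np.random.default_rng(0)
X = rng.normal(size=(1992, 8))
cnn_scores = 1 / (1 + np.exp(-X @ rng.normal(size=8)))   # stand-in for CNN glaucoma probabilities

X_tr, X_te, y_tr, y_te = train_test_split(X, cnn_scores, random_state=0)
surrogate = DecisionTreeRegressor(max_depth=4, random_state=0)  # shallow tree stays readable
surrogate.fit(X_tr, y_tr)                 # distill the CNN's predictions, not the labels
print("fidelity R^2:", r2_score(y_te, surrogate.predict(X_te)))
```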
{"title":"Interpretable surrogate models to approximate the predictions of convolutional neural networks in glaucoma diagnosis","authors":"Jose Sigut, Francisco José Fumero Batista, Rafael Arnay, José Estévez, Tinguaro Díaz-Alemán","doi":"10.1088/2632-2153/ad0798","DOIUrl":"https://doi.org/10.1088/2632-2153/ad0798","url":null,"abstract":"Abstract Background and objective
Deep learning systems, especially in critical fields like medicine, suffer from a significant drawback - their black box nature, which lacks mechanisms for explaining or interpreting their decisions. In this regard, our research aims to evaluate the use of surrogate models for interpreting convolutional neural network decisions in glaucoma diagnosis. Our approach is novel in that we approximate the original model with an interpretable one and also change the input features, replacing pixels with tabular geometric features of the optic disc, cup, and neuroretinal rim.

Method
We trained convolutional neural networks with two types of images: original images of the optic nerve head and simplified images showing only the disc and cup contours on a uniform background. Decision trees were used as surrogate models due to their simplicity and visualization properties, while saliency maps were calculated for some images for comparison.

Results
The experiments carried out with 1271 images of healthy subjects and 721 images of glaucomatous eyes demonstrate that decision trees can closely approximate the predictions of neural networks trained on simplified contour images, with R-squared values near 0.9 for VGG19, Resnet50, InceptionV3 and Xception architectures. Saliency maps proved difficult to interpret and showed inconsistent results across architectures, in contrast to the decision trees. Additionally, some decision trees trained as surrogate models outperformed a decision tree trained on the actual outcomes without surrogation.

Conclusions
Decision trees may be a more interpretable alternative to saliency methods. Moreover, the fact that we matched the performance of a decision tree without surrogation to that obtained by decision trees using knowledge distillation from neural networks is a great advantage since decision trees are inherently interpretable. Therefore, based on our findings, we think this approach would be the most recommendable choice for specialists as a diagnostic tool.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"24 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136317945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Finding simplicity: unsupervised discovery of features, patterns, and order parameters via shift-invariant variational autoencoders
Pub Date: 2023-10-26, DOI: 10.1088/2632-2153/ad073b
Maxim A. Ziatdinov, Chun Yin Wong, Sergei V. Kalinin
Abstract Recent advances in scanning tunneling and transmission electron microscopies (STM and STEM) have allowed routine generation of large volumes of imaging data containing information on the structure and functionality of materials. The experimental data sets contain signatures of long-range phenomena such as physical order parameter fields, polarization, and strain gradients in STEM, or standing electronic waves and carrier-mediated exchange interactions in STM, all superimposed onto scanning system distortions and gradual changes of contrast due to drift and/or mis-tilt effects. Correspondingly, while the human eye can readily identify certain patterns in the images, such as lattice periodicities, repeating structural elements, or microstructures, their automatic extraction and classification are highly non-trivial, and universal pathways to accomplish such analyses are absent. We posit that the most distinctive elements of the patterns observed in STM and (S)TEM images are similarity and (almost-)periodicity, behaviors stemming directly from the parsimony of elementary atomic structures, superimposed on the gradual changes reflective of order parameter distributions. However, the discovery of these elements via global Fourier methods is non-trivial due to variability and the lack of ideal discrete translation symmetry. To address this problem, we explore shift-invariant variational autoencoders (shift-VAEs), which allow disentangling characteristic repeating features in the images, their variations, and the shifts that inevitably occur when randomly sampling the image space. Shift-VAEs balance the uncertainty in the position of the object of interest with the uncertainty in shape reconstruction. This approach is illustrated for model 1D data and further extended to synthetic and experimental STM and STEM 2D data. We further introduce an approach for training shift-VAEs that allows finding latent variables that comport with known physical behavior. In this specific case, the condition is that the latent variable maps should be smooth on the length scale of the atomic lattice (as expected for physical order parameters), but other conditions can be imposed. The opportunities and limitations of shift-VAE analysis for pattern discovery are elucidated.
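To make the disentanglement concrete, here is a toy 1D sketch of the general shift-VAE idea (the layer sizes, the Fourier-shift decoder, and the loss weighting are all assumptions made for this illustration, not the authors' implementation): one latent channel absorbs the translation, while the remaining latents describe the shape in a canonical frame.

```python
import torch
import torch.nn as nn

def fourier_shift(x, shift):
    """Differentiably translate a batch of 1D signals by `shift` samples (Fourier shift theorem)."""
    n = x.shape[-1]
    freqs = torch.fft.rfftfreq(n)                        # cycles per sample
    angle = -2.0 * torch.pi * freqs * shift.unsqueeze(-1)
    phase = torch.polar(torch.ones_like(angle), angle)   # exp(i * angle)
    return torch.fft.irfft(torch.fft.rfft(x) * phase, n=n)

class ToyShiftVAE(nn.Module):
    """1D toy of the shift-VAE idea: one latent carries the translation, the rest the shape."""
    def __init__(self, n=64, zdim=2):
        super().__init__()
        self.zdim = zdim
        self.enc = nn.Sequential(nn.Linear(n, 128), nn.ReLU(), nn.Linear(128, 2 * zdim + 1))
        self.dec = nn.Sequential(nn.Linear(zdim, 128), nn.ReLU(), nn.Linear(128, n))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar, shift = h[:, :self.zdim], h[:, self.zdim:-1], h[:, -1]
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        canonical = self.dec(z)                                  # shape in the centred frame
        recon = fourier_shift(canonical, shift * x.shape[-1])    # re-apply the inferred shift
        kld = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(-1).mean()
        return ((recon - x) ** 2).mean() + 1e-3 * kld

# usage: loss = ToyShiftVAE()(torch.randn(32, 64)); loss.backward()
```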
{"title":"Finding simplicity: unsupervised discovery of features, patterns, and order parameters via shift-invariant variational autoencoders","authors":"Maxim A. Ziatdinov, Chun Yin Wong, Sergei V. Kalinin","doi":"10.1088/2632-2153/ad073b","DOIUrl":"https://doi.org/10.1088/2632-2153/ad073b","url":null,"abstract":"Abstract Recent advances in scanning tunneling and transmission electron microscopies (STM and STEM) have allowed routine generation of large volumes of imaging data containing information on the structure and functionality of materials. The experimental data sets contain signatures of long-range phenomena such as physical order parameter fields, polarization, and strain gradients in STEM, or standing electronic waves and carrier-mediated exchange interactions in STM, all superimposed onto scanning system distortions and gradual changes of contrast due to drift and/or mis-tilt effects. Correspondingly, while the human eye can readily identify certain patterns in the images such as lattice periodicities, repeating structural elements, or microstructures, their automatic extraction and classification are highly non-trivial and universal pathways to accomplish such analyses are absent. We pose that the most distinctive elements of the patterns observed in STM and (S)TEM images are similarity and (almost-) periodicity, behaviors stemming directly from the parsimony of elementary atomic structures, superimposed on the gradual changes reflective of order parameter distributions. However, the discovery of these elements via global Fourier methods is non-trivial due to variability and lack of ideal discrete translation symmetry. To address this problem, we explore the shift-invariant variational autoencoders (shift-VAE) that allow disentangling characteristic repeating features in the images, their variations, and shifts that inevitably occur when randomly sampling the image space. Shift-VAEs balance the uncertainty in the position of the object of interest with the uncertainty in shape reconstruction. This approach is illustrated for model 1D data, and further extended to synthetic and experimental STM and STEM 2D data. We further introduce an approach for training shift-VAEs that allows finding the latent variables that comport to known physical behavior. In this specific case, the condition is that the latent variable maps should be smooth on the length scale of the atomic lattice (as expected for physical order parameters), but other conditions can be imposed. The opportunities and limitations of the shift VAE analysis for pattern discovery are elucidated.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"59 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136377189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stokesian Processes: Inferring Stokes Flows using Physics-Informed Gaussian Processes
Pub Date: 2023-10-20, DOI: 10.1088/2632-2153/ad0286
John J. Jairo Molina, Kenta Ogawa, Takashi Taniguchi
Abstract We develop a probabilistic Stokes flow framework, using physics-informed Gaussian processes, which can be used to solve both forward and inverse flow problems with missing and/or noisy data. The physics of the problem, specified by the Stokes and continuity equations, is exactly encoded into the inference framework. Crucially, this means that we do not need to explicitly solve the Poisson equation for the pressure field, as a physically meaningful (divergence-free) velocity field will automatically be selected. We test our method on a simple pressure-driven flow problem, i.e. flow through a sinusoidal channel, and compare against standard numerical methods (finite element and direct numerical simulations). We obtain excellent agreement, even when solving inverse problems given only sub-sampled velocity data on low-dimensional sub-spaces (i.e. one component of the velocity on 1D domains to reconstruct 2D flows). The proposed method will be a valuable tool for analyzing experimental data, where noisy/missing data is the norm.
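The key constraint, that only divergence-free velocity fields are admissible, is easy to illustrate outside of the GP machinery. The toy NumPy check below (a stand-in for the paper's framework, with a random sum of sinusoids playing the role of a GP sample) derives the 2D velocity from a scalar stream function, which makes the continuity equation hold by construction:

```python
import numpy as np

# Toy illustration (not the paper's framework): in 2D, deriving the velocity from a
# scalar stream function psi via u = dpsi/dy, v = -dpsi/dx gives an exactly
# divergence-free field, which is the kind of constraint a physics-informed GP prior encodes.
rng = np.random.default_rng(1)
x = y = np.linspace(0, 1, 64)
X, Y = np.meshgrid(x, y, indexing="ij")

# draw a smooth random stream function as a sum of sinusoids (stand-in for a GP sample)
psi = sum(a * np.sin(2 * np.pi * (kx * X + ky * Y))
          for a, kx, ky in rng.normal(size=(5, 3)))

dx = x[1] - x[0]
u = np.gradient(psi, dx, axis=1)    # u =  dpsi/dy
v = -np.gradient(psi, dx, axis=0)   # v = -dpsi/dx
div = np.gradient(u, dx, axis=0) + np.gradient(v, dx, axis=1)
print("max |divergence|:", np.abs(div).max())   # essentially zero, up to round-off
```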
{"title":"Stokesian Processes : Inferring Stokes Flows using Physics-Informed Gaussian Processes","authors":"John J. Jairo Molina, Kenta Ogawa, Takashi Taniguchi","doi":"10.1088/2632-2153/ad0286","DOIUrl":"https://doi.org/10.1088/2632-2153/ad0286","url":null,"abstract":"Abstract We develop a probabilistic Stokes flow framework, using physics informed Gaussian processes, which can be used to solve both forward/inverse flow problems with missing and/or noisy data. The physics of the problem, specified by the Stokes and continuity equations, is exactly encoded into the inference framework. Crucially, this means that we do not need to explicitly solve the Poisson equation for the pressure field, as a physically meaningful (divergence-free) velocity field will automatically be selected. We test our method on a simple pressure driven flow problem, i.e. flow through a sinusoidal channel, and compare against standard numerical methods (Finite Element and Direct Numerical Simulations). We obtain excellent agreement, even when solving inverse problems given only sub-sampled velocity data on low dimensional sub-spaces (i.e. 1 component of the velocity on 1 D domains to reconstruct 2 D flows). The proposed method will be a valuable tool for analyzing experimental data, where noisy/missing data is the norm.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"53 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135513559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LHC Hadronic Jet Generation Using Convolutional Variational Autoencoders with Normalizing Flows
Pub Date: 2023-10-19, DOI: 10.1088/2632-2153/ad04ea
Breno Orzari, Nadezda Chernyavskaya, Raphael Cobe, Javier Mauricio Duarte, Jefferson Fialho, Dimitrios Gunopulos, Raghav Kansal, Maurizio Pierini, Thiago Tomei, Mary Touranakou
Abstract In high energy physics, one of the most important processes for collider data analysis is the comparison of collected and simulated data. The current state of the art for data generation is Monte Carlo (MC) generators. However, because of the upcoming high-luminosity upgrade of the LHC, there will not be enough computational power or time to produce the required amount of simulated data using MC methods. An alternative approach under study is the use of machine-learning generative methods to fulfill that task. Since the most common final-state objects of high-energy proton collisions are hadronic jets, which are collections of particles collimated in a given region of space, this work aims to develop a convolutional variational autoencoder (ConVAE) for the generation of particle-based LHC hadronic jets. Given the ConVAE's limitations, a normalizing flow (NF) network is coupled to it in a two-step training process, which improves the results for the generated jets. The ConVAE+NF network is capable of generating a jet in 18.30 ± 0.04 μs, making it one of the fastest methods for this task to date.
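A rough sketch of the second step of such a two-step pipeline (the coupling-layer flow, latent dimensionality, and training loop below are generic illustrations, not the ConVAE+NF architecture from the paper) is to freeze a pre-trained VAE and fit a normalizing flow to its latent codes by maximum likelihood:

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Minimal RealNVP-style coupling layer used to reshape the VAE latent distribution."""
    def __init__(self, dim, flip):
        super().__init__()
        self.flip = flip
        self.net = nn.Sequential(nn.Linear(dim // 2, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, z):
        a, b = z.chunk(2, dim=-1)
        if self.flip:
            a, b = b, a
        s, t = self.net(a).chunk(2, dim=-1)
        s = torch.tanh(s)                        # keep scales bounded for stability
        b = b * torch.exp(s) + t
        out = torch.cat([b, a], -1) if self.flip else torch.cat([a, b], -1)
        return out, s.sum(-1)                    # transformed sample and log|det J|

def flow_nll(flows, z):
    """Negative log-likelihood of latents z under a standard-normal base distribution."""
    logdet = 0.0
    for f in flows:
        z, ld = f(z)
        logdet = logdet + ld
    log_prob = -0.5 * (z ** 2 + torch.log(torch.tensor(2 * torch.pi))).sum(-1)
    return -(log_prob + logdet).mean()

# step 2 of the hypothetical pipeline: fit the flow on latents produced by a frozen VAE encoder
latents = torch.randn(1024, 8)                   # stand-in for z = encoder(jet)
flows = nn.ModuleList([AffineCoupling(8, flip=i % 2 == 1) for i in range(4)])
opt = torch.optim.Adam(flows.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = flow_nll(flows, latents)
    loss.backward()
    opt.step()
# generation would sample from the base distribution, invert the flow, and decode with the VAE
```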
{"title":"LHC Hadronic Jet Generation Using Convolutional Variational Autoencoders with Normalizing Flows","authors":"Breno Orzari, Nadezda Chernyavskaya, Raphael Cobe, Javier Mauricio Duarte, Jefferson Fialho, Dimitrios Gunopulos, Raghav Kansal, Maurizio Pierini, Thiago Tomei, Mary Touranakou","doi":"10.1088/2632-2153/ad04ea","DOIUrl":"https://doi.org/10.1088/2632-2153/ad04ea","url":null,"abstract":"Abstract In high energy physics, one of the most important processes for collider data analysis is the comparison of collected and simulated data. Nowadays the state-of-the-art for data generation is in the form of Monte Carlo (MC) generators. However, because of the upcoming high-luminosity upgrade of the LHC, there will not be enough computational power or time to match the amount of needed simulated data using MC methods. An alternative approach under study is the usage of machine learning generative methods to fulfill that task. Since the most common final-state objects of high-energy proton collisions are hadronic jets, which are collections of particles collimated in a given region of space, this work aims to develop a convolutional variational autoencoder (ConVAE) for the generation of particle-based LHC hadronic jets. Given the ConVAE's limitations, a normalizing flow (NF) network is coupled to it in a two-step training process, which shows improvements on the results for the generated jets. The ConVAE+NF network is capable of generating a jet in 18.30 ± 0.04 μs, making it one of the fastest methods for this task up to now.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135729213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Feature space reduction method for ultrahigh-dimensional, multiclass data: Random forest-based multiround screening (RFMS)
Pub Date: 2023-10-19, DOI: 10.1088/2632-2153/ad020e
Gergely Hanczár, Marcell Stippinger, Dávid Hanák, Marcell Tamás Kurbucz, Olivér Máté Törteli, Ágnes Chripkó, Zoltán Somogyvári
Abstract In recent years, several screening methods have been published for ultrahigh-dimensional data that contain hundreds of thousands of features, many of which are irrelevant or redundant. However, most of these methods cannot handle data with thousands of classes. Prediction models built to authenticate users based on multichannel biometric data lead to exactly this type of problem. In this study, we present a novel method known as random forest-based multiround screening (RFMS) that can be effectively applied under such circumstances. The proposed algorithm divides the feature space into small subsets and executes a series of partial model builds. These partial models are used to implement tournament-based sorting and to select features based on their importance. The algorithm successfully filters out irrelevant features and also discovers binary and higher-order feature interactions. To benchmark RFMS, a synthetic biometric feature space generator known as BiometricBlender is employed. Based on the results, RFMS is on par with industry-standard feature screening methods while simultaneously possessing many advantages over them.
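A minimal scikit-learn sketch of this kind of tournament-style multiround screening (the block sizes, round count, and keep-per-block values are arbitrary illustrative choices, not the published RFMS settings):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def multiround_screening(X, y, block_size=1000, keep_per_block=20, n_final=100, rounds=3):
    """Sketch of tournament-based screening (not the authors' exact RFMS): fit a random
    forest on each block of features, keep the locally most important ones, pool the
    winners, and repeat until the survivor set is small enough."""
    survivors = np.arange(X.shape[1])
    for _ in range(rounds):
        winners = []
        for start in range(0, len(survivors), block_size):
            block = survivors[start:start + block_size]
            rf = RandomForestClassifier(n_estimators=50, n_jobs=-1, random_state=0)
            rf.fit(X[:, block], y)                              # partial model build
            top = np.argsort(rf.feature_importances_)[::-1][:keep_per_block]
            winners.append(block[top])                          # local tournament winners
        survivors = np.concatenate(winners)
        if len(survivors) <= n_final:
            break
    return survivors[:n_final]
```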
{"title":"Feature space reduction method for ultrahigh-dimensional, multiclass data: Random forest-based multiround screening (RFMS)","authors":"Gergely Hanczár, Marcell Stippinger, Dávid Hanák, Marcell Tamás Kurbucz, Olivér Máté Törteli, Ágnes Chripkó, Zoltán Somogyvári","doi":"10.1088/2632-2153/ad020e","DOIUrl":"https://doi.org/10.1088/2632-2153/ad020e","url":null,"abstract":"Abstract In recent years, several screening methods have been published for ultrahigh-dimensional data that contain hundreds of thousands of features, many of which are irrelevant or redundant. However, most of these methods cannot handle data with thousands of classes. Prediction models built to authenticate users based on multichannel biometric data result in this type of problem. In this study, we present a novel method known as random forest-based multiround screening (RFMS) that can be effectively applied under such circumstances. The proposed algorithm divides the feature space into small subsets and executes a series of partial model builds. These partial models are used to implement tournament-based sorting and the selection of features based on their importance. This algorithm successfully filters irrelevant features and also discovers binary and higher-order feature interactions. To benchmark RFMS, a synthetic biometric feature space generator known as BiometricBlender is employed. Based on the results, the RFMS is on par with industry-standard feature screening methods, while simultaneously possessing many advantages over them.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135667453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian Renormalization
Pub Date: 2023-10-18, DOI: 10.1088/2632-2153/ad0102
Marc Stuart Klinger, D S Berman, Alexander George Stapleton
Abstract In this note we present a fully information-theoretic approach to renormalization inspired by Bayesian statistical inference, which we refer to as Bayesian renormalization. The main insight of Bayesian renormalization is that the Fisher metric defines a correlation length that plays the role of an emergent renormalization group (RG) scale quantifying the distinguishability between nearby points in the space of probability distributions. This RG scale can be interpreted as a proxy for the maximum number of unique observations that can be made about a given system during a statistical inference experiment. The role of the Bayesian renormalization scheme is subsequently to prepare an effective model for a given system up to a precision which is bounded by the aforementioned scale. In applications of Bayesian renormalization to physical systems, the emergent information-theoretic scale is naturally identified with the maximum energy that can be probed by current experimental apparatus, and thus Bayesian renormalization coincides with ordinary renormalization. However, Bayesian renormalization is sufficiently general to apply even in circumstances in which an immediate physical scale is absent, and thus provides an ideal approach to renormalization in data science contexts. To this end, we provide insight into how the Bayesian renormalization scheme relates to existing methods for data compression and data generation, such as the information bottleneck and the diffusion learning paradigm. We conclude by designing an explicit form of Bayesian renormalization inspired by Wilson's momentum shell renormalization scheme in quantum field theory. We apply this Bayesian renormalization scheme to a simple neural network and verify the sense in which it organizes the parameters of the model according to a hierarchy of information-theoretic importance.
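To give a concrete flavor of ranking parameters by information-theoretic importance (this is only a generic diagonal empirical-Fisher computation on a toy regression network, not the paper's momentum-shell-inspired scheme), one can estimate the Fisher information carried by each weight and sort:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 1))
X, y = torch.randn(512, 4), torch.randn(512, 1)

# Diagonal empirical Fisher: average squared per-sample gradient of the negative
# log-likelihood (Gaussian likelihood here, so the gradient of 0.5*(y - f(x))^2).
fisher = [torch.zeros_like(p) for p in model.parameters()]
for xi, yi in zip(X, y):
    model.zero_grad()
    nll = 0.5 * (model(xi) - yi).pow(2).sum()
    nll.backward()
    for f, p in zip(fisher, model.parameters()):
        f += p.grad.detach() ** 2
fisher = [f / len(X) for f in fisher]

# Rank parameters by Fisher information: low-information directions are the ones a
# Bayesian-renormalization-style scheme would coarse-grain away first.
scores = torch.cat([f.flatten() for f in fisher])
order = torch.argsort(scores, descending=True)
print("most informative parameter indices:", order[:10].tolist())
```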
{"title":"Bayesian Renormalization","authors":"Marc Stuart Klinger, D S Berman, Alexander George Stapleton","doi":"10.1088/2632-2153/ad0102","DOIUrl":"https://doi.org/10.1088/2632-2153/ad0102","url":null,"abstract":"Abstract In this note we present a fully information theoretic approach to renormalization inspired by Bayesian statistical inference, which we refer to as Bayesian renormalization. The main insight of Bayesian renormalization is that the Fisher metric defines a correlation length that plays the role of an emergent renormalization group (RG) scale quantifying the distinguishability between nearby points in the space of probability distributions. This RG scale can be interpreted as a proxy for the maximum number of unique observations that can be made about a given system during a statistical inference experiment. The role of the Bayesian renormalization scheme is subsequently to prepare an effective model for a given system up to a precision which is bounded by the aforementioned scale. In applications of Bayesian renormalization to physical systems, the emergent information theoretic scale is naturally identified with the maximum energy that can be probed by current experimental apparatus, and thus Bayesian renormalization coincides with ordinary renormalization. However, Bayesian renormalization is sufficiently general to apply even in circumstances in which an immediate physical scale is absent, and thus provides an ideal approach to renormalization in data science contexts. To this end, we provide insight into how the Bayesian renormalization scheme relates to existing methods for data compression and data generation such as the information bottleneck and the diffusion learning paradigm. We conclude by designing an explicit form of Bayesian renormalization inspired by Wilson’s momentum shell renormalization scheme in quantum field theory. We apply this Bayesian renormalization scheme to a simple neural network and verify the sense in which it organizes the parameters of the model according to a hierarchy of information theoretic importance.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135823715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Physics-informed neural networks for solving forward and inverse Vlasov-Poisson equation via fully kinetic simulation
Pub Date: 2023-10-16, DOI: 10.1088/2632-2153/ad03d5
Baiyi Zhang, Guobiao Cai, Huiyan Weng, Weizong Wang, Lihui Liu, Bijiao He
Abstract The Vlasov-Poisson equation is one of the most fundamental models in plasma physics. It has been widely used in areas such as confined plasmas in thermonuclear research and space plasmas in planetary magnetospheres. In this study, we explore the feasibility of physics-informed neural networks for solving the forward and inverse Vlasov-Poisson equation (PINN-Vlasov). The PINN-Vlasov method employs a multilayer perceptron (MLP) to represent the solution of the Vlasov-Poisson equation. The training dataset comprises randomly sampled time, space, and velocity coordinates and the corresponding distribution function. We generate training data using a fully kinetic particle-in-cell (PIC) simulation rather than the analytical solution to the Vlasov-Poisson equation, to eliminate the correlation between data and equations. The Vlasov equation and the Poisson equation are incorporated into the PINN-Vlasov framework using automatic differentiation and the trapezoidal rule, respectively. By minimizing the residuals between the reconstructed distribution function and labeled data, together with the physically constrained residuals of the Vlasov-Poisson equation, the PINN-Vlasov method is capable of dealing with both forward and inverse problems. For forward problems, the PINN-Vlasov method can solve the Vlasov-Poisson equation with given initial and boundary conditions. For inverse problems, the completely unknown electric field and equation coefficients can be predicted with the PINN-Vlasov method using only a small amount of particle distribution data.
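As a hedged sketch of how a Vlasov residual can enter such a loss (the 1D1V geometry, network sizes, and normalized charge-to-mass ratio below are assumptions for illustration; the Poisson coupling via trapezoidal integration is only indicated in a comment), automatic differentiation supplies the PDE derivatives at collocation points:

```python
import torch
import torch.nn as nn

f_net = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
E_net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))

def grad(out, var):
    """First derivative of a network output with respect to one input tensor."""
    return torch.autograd.grad(out, var, grad_outputs=torch.ones_like(out), create_graph=True)[0]

def vlasov_residual(t, x, v):
    """Residual of the 1D1V Vlasov equation df/dt + v df/dx + (q/m) E df/dv = 0,
    with q/m set to -1 (electrons, normalized units) as an illustrative choice."""
    f = f_net(torch.cat([t, x, v], dim=-1))
    E = E_net(torch.cat([t, x], dim=-1))
    return grad(f, t) + v * grad(f, x) - E * grad(f, v)

# collocation points (requires_grad so autograd can form the PDE derivatives)
t = torch.rand(256, 1, requires_grad=True)
x = torch.rand(256, 1, requires_grad=True)
v = torch.randn(256, 1, requires_grad=True)
loss_pde = vlasov_residual(t, x, v).pow(2).mean()
# In the full method this term is combined with a data term on labelled f samples and a
# Poisson residual in which the charge density comes from integrating f over velocity
# with the trapezoidal rule (e.g. torch.trapezoid over a velocity grid).
```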
{"title":"Physics-informed neural networks for solving forward and inverse Vlasov-Poisson equation via fully kinetic simulation","authors":"Baiyi Zhang, Guobiao Cai, Huiyan Weng, Weizong Wang, Lihui Liu, Bijiao He","doi":"10.1088/2632-2153/ad03d5","DOIUrl":"https://doi.org/10.1088/2632-2153/ad03d5","url":null,"abstract":"Abstract The Vlasov-Poisson equation is one of the most fundamental models in plasma physics. It has been widely used in areas such as confined plasmas in thermonuclear research and space plasmas in planetary magnetospheres. In this study, we explore the feasibility of the physics-informed neural networks for solving forward and inverse Vlasov-Poisson equation (PINN-Vlasov). The PINN-Vlasov method employs a multilayer perceptron (MLP) to represent the solution of the Vlasov-Poisson equation. The training dataset comprises the randomly sampled time, space, and velocity coordinates and the corresponding distribution function. We generate training data using the fully kinetic PIC simulation rather than the analytical solution to the Vlasov-Poisson equation to eliminate the correlation between data and equations. The Vlasov equation and Poisson equation are concurrently integrated into the PINN-Vlasov framework using automatic differentiation and the trapezoidal rule, respectively. By minimizing the residuals between the reconstructed distribution function and labeled data, and the physically constrained residuals of the Vlasov-Poisson equation, the PINN-Vlasov method is capable of dealing with both forward and inverse problems. For forward problems, the PINN-Vlasov method can solve the Vlasov-Poisson equation with given initial and boundary conditions. For inverse problems, the completely unknown electric field and equation coefficients can be predicted with the PINN-Vlasov method using little particle distribution data.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136079961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vortex detection in atomic Bose-Einstein condensates using neural networks trained on synthetic images
Pub Date: 2023-10-16, DOI: 10.1088/2632-2153/ad03ad
Myeonghyeon Kim, Junhwan Kwon, Tenzin Rabga, Yong-il Shin
Abstract Quantum vortices in atomic Bose-Einstein condensates (BECs) are topological defects characterized by quantized circulation of particles around them. In experimental studies, vortices are commonly detected by time-of-flight imaging, where their density-depleted cores are enlarged. In this work, we describe a machine learning-based method for detecting vortices in experimental BEC images, particularly focusing on turbulent condensates containing irregularly distributed vortices. Our approach employs a convolutional neural network (CNN) trained solely on synthetic simulated images, eliminating the need for manual labeling of the vortex positions as ground truth. We find that the CNN achieves accurate vortex detection in real experimental images, thereby facilitating analysis of large experimental datasets without being constrained by specific experimental conditions. This novel approach represents a significant advancement in studying quantum vortex dynamics and streamlines the analysis process in the investigation of turbulent BECs.
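The labels come for free because the training images are simulated. A toy NumPy generator in that spirit (the Gaussian envelope, core profile, and noise level are illustrative choices, not the paper's simulation) produces an image together with the vortex coordinates that a CNN could then be trained to locate:

```python
import numpy as np

def synthetic_bec_image(size=128, n_vortices=5, core_radius=3.0, noise=0.05, rng=None):
    """Toy generator for a turbulent-BEC training image: a smooth condensate profile with
    density-depleted cores at random positions; returns the image and the core coordinates."""
    rng = np.random.default_rng() if rng is None else rng
    yy, xx = np.mgrid[:size, :size]
    r2 = (xx - size / 2) ** 2 + (yy - size / 2) ** 2
    density = np.exp(-r2 / (2 * (size / 4) ** 2))               # Gaussian condensate envelope
    centers = rng.uniform(size * 0.25, size * 0.75, size=(n_vortices, 2))
    for cx, cy in centers:
        d2 = (xx - cx) ** 2 + (yy - cy) ** 2
        density *= 1.0 - np.exp(-d2 / (2 * core_radius ** 2))   # carve out a depleted core
    density += rng.normal(scale=noise, size=density.shape)      # imaging noise
    return density, centers

image, vortex_positions = synthetic_bec_image(rng=np.random.default_rng(0))
```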
{"title":"Vortex detection in atomic Bose--Einstein condensates using neural networks trained on synthetic images","authors":"Myeonghyeon Kim, Junhwan Kwon, Tenzin Rabga, Yong-il Shin","doi":"10.1088/2632-2153/ad03ad","DOIUrl":"https://doi.org/10.1088/2632-2153/ad03ad","url":null,"abstract":"Abstract Quantum vortices in atomic Bose-Einstein condensates (BECs) are topological defects characterized by quantized circulation of particles around them. In experimental studies, vortices are commonly detected by time-of-flight imaging, where their density-depleted cores are enlarged. In this work, we describe a machine learning-based method for detecting vortices in experimental BEC images, particularly focusing on turbulent condensates containing irregularly distributed vortices. Our approach employs a convolutional neural network (CNN) trained solely on synthetic simulated images, eliminating the need for manual labeling of the vortex positions as ground truth. We find that the CNN achieves accurate vortex detection in real experimental images, thereby facilitating analysis of large experimental datasets without being constrained by specific experimental conditions. This novel approach represents a significant advancement in studying quantum vortex dynamics and streamlines the analysis process in the investigation of turbulent BECs.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136079964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine learning renormalization group for statistical physics
Pub Date: 2023-10-16, DOI: 10.1088/2632-2153/ad0101
Wanda Hou, Yi-Zhuang You
Abstract We develop a machine-learning renormalization group (MLRG) algorithm to explore and analyze many-body lattice models in statistical physics. Using the representation learning capability of generative modeling, MLRG automatically learns the optimal renormalization group (RG) transformations from self-generated spin configurations and formulates RG equations without human supervision. The algorithm does not focus on simulating any particular lattice model but broadly explores all possible models compatible with the internal and lattice symmetries given the on-site symmetry representation. It can uncover the RG monotone that governs the RG flow, assuming a strong form of the c-theorem. This enables several downstream tasks, including unsupervised classification of phases, automatic location of phase transitions or critical points, and controlled estimation of critical exponents and operator scaling dimensions. We demonstrate the MLRG method in two-dimensional lattice models with Ising symmetry and show that the algorithm correctly identifies and characterizes the Ising criticality.
{"title":"Machine learning renormalization group for statistical physics","authors":"Wanda Hou, Yi-Zhuang You","doi":"10.1088/2632-2153/ad0101","DOIUrl":"https://doi.org/10.1088/2632-2153/ad0101","url":null,"abstract":"Abstract We develop a machine-learning renormalization group (MLRG) algorithm to explore and analyze many-body lattice models in statistical physics. Using the representation learning capability of generative modeling, MLRG automatically learns the optimal renormalization group (RG) transformations from self-generated spin configurations and formulates RG equations without human supervision. The algorithm does not focus on simulating any particular lattice model but broadly explores all possible models compatible with the internal and lattice symmetries given the on-site symmetry representation. It can uncover the RG monotone that governs the RG flow, assuming a strong form of the c -theorem. This enables several downstream tasks, including unsupervised classification of phases, automatic location of phase transitions or critical points, controlled estimation of critical exponents, and operator scaling dimensions. We demonstrate the MLRG method in two-dimensional lattice models with Ising symmetry and show that the algorithm correctly identifies and characterizes the Ising criticality.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136077746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}