Gaussian Framework and Optimal Projection of Weather Fields for Prediction of Extreme Events
Valeria Mascolo, Alessandro Lovo, Corentin Herbert, Freddy Bouchet
arXiv:2405.20903 (2024-05-31)
Extreme events are the major weather-related hazard for humanity, so it is crucial to understand their statistics and to be able to forecast them. However, the scarcity of data makes their study particularly challenging. In this work we provide a simple framework for studying extreme events that tackles the lack of data by using the whole available dataset, rather than focusing only on the extremes it contains. To do so, we assume that the set of predictors and the observable used to define the extreme event follow a jointly Gaussian distribution. This naturally yields a notion of an optimal projection of the predictors for forecasting the event. We take extreme heatwaves over France as a case study and test our method on an 8000-year-long time series from an intermediate-complexity climate model and on the ERA5 reanalysis dataset. For a posteriori statistics, we observe and motivate the fact that composite maps of very extreme events look similar to those of less extreme ones. For prediction, we show that our method is competitive with off-the-shelf neural networks on the long dataset and outperforms them on the reanalysis. The optimal projection pattern, which makes our forecast intrinsically interpretable, highlights the importance of soil moisture deficit and quasi-stationary Rossby waves as precursors of extreme heatwaves.
Surface roughness-informed fatigue life prediction of L-PBF Hastelloy X at elevated temperature
Ritam Pal, Brandon Kemerling, Daniel Ryan, Sudhakar Bollapragada, Amrita Basak
arXiv:2406.00186 (2024-05-31)
Additive manufacturing, especially laser powder bed fusion (L-PBF), is widely used for fabricating metal parts with intricate geometries. However, parts produced via L-PBF suffer from variable surface roughness, which affects their dynamic and fatigue properties. Accurate prediction of fatigue properties as a function of surface roughness is a critical requirement for qualifying L-PBF parts. In this work, an analytical methodology is put forth to predict the fatigue life of L-PBF components with heterogeneous surface roughness. Thirty-six Hastelloy X specimens are printed using L-PBF, followed by industry-standard heat treatment procedures. Half of these specimens are built with as-printed gauge sections and the other half are printed as cylinders from which fatigue specimens are extracted via machining. Specimens are printed in a vertical orientation and at 30 degrees from the vertical axis. The surface roughness of the specimens is measured using computed tomography, and parameters such as the maximum valley depth are used to build an extreme value distribution. Fatigue testing is conducted at an isothermal condition of 500 °F. It is observed that the rough specimens fail much earlier than the machined specimens due to the deep valleys present on the surfaces of the former. The valleys act as notches leading to high strain localization. Following this observation, a functional relationship is formulated analytically that treats surface valleys as notches and correlates the strain localization around those notches with fatigue life, using the Coffin-Manson-Basquin and Ramberg-Osgood equations. In conclusion, the proposed analytical model successfully predicts the fatigue life of L-PBF specimens at an elevated temperature under different strain loadings.
{"title":"Surface roughness-informed fatigue life prediction of L-PBF Hastelloy X at elevated temperature","authors":"Ritam Pal, Brandon Kemerling, Daniel Ryan, Sudhakar Bollapragada, Amrita Basak","doi":"arxiv-2406.00186","DOIUrl":"https://doi.org/arxiv-2406.00186","url":null,"abstract":"Additive manufacturing, especially laser powder bed fusion (L-PBF), is widely\u0000used for fabricating metal parts with intricate geometries. However, parts\u0000produced via L-PBF suffer from varied surface roughness which affects the\u0000dynamic or fatigue properties. Accurate prediction of fatigue properties as a\u0000function of surface roughness is a critical requirement for qualifying L-PBF\u0000parts. In this work, an analytical methodology is put forth to predict the\u0000fatigue life of L-PBF components having heterogeneous surface roughness.\u0000Thirty-six Hastelloy X specimens are printed using L-PBF followed by\u0000industry-standard heat treatment procedures. Half of these specimens are built\u0000with as-printed gauge sections and the other half is printed as cylinders from\u0000which fatigue specimens are extracted via machining. Specimens are printed in a\u0000vertical orientation and an orientation 30 degree from the vertical axis. The\u0000surface roughness of the specimens is measured using computed tomography and\u0000parameters such as the maximum valley depth are used to build an extreme value\u0000distribution. Fatigue testing is conducted at an isothermal condition of\u0000500-degree F. It is observed that the rough specimens fail much earlier\u0000compared to the machined specimens due to the deep valleys present on the\u0000surfaces of the former ones. The valleys act as notches leading to high strain\u0000localization. Following this observation, a functional relationship is\u0000formulated analytically that considers surface valleys as notches and\u0000correlates the strain localization around those notches with fatigue life,\u0000using the Coffin-Manson-Basquin and Ramberg-Osgood equation. In conclusion, the\u0000proposed analytical model successfully predicts the fatigue life of L-PBF\u0000specimens at an elevated temperature undergoing different strain loadings.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141253351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting ptychography probe positions using single-shot phase retrieval neural network
Ming Du, Tao Zhou, Junjing Deng, Daniel J. Ching, Steven Henke, Mathew J. Cherukara
arXiv:2405.20910 (2024-05-31)
Ptychography is a powerful imaging technique used in a variety of fields, including materials science, biology, and nanotechnology. However, the accuracy of the reconstructed ptychography image depends strongly on the accuracy of the recorded probe positions, which often contain errors. These errors are typically corrected jointly with phase retrieval through numerical optimization approaches. When the error accumulates along the scan path or when the error magnitude is large, these approaches may not converge to a satisfactory result. We propose a fundamentally new approach to ptychography probe position prediction for data with large position errors, in which a neural network performs single-shot phase retrieval on individual diffraction patterns, yielding the object image at each scan point. The pairwise offsets among these images are then found using a robust image registration method, and the results are combined to yield the complete scan path by constructing and solving a linear system. We show that our method achieves good position prediction accuracy for data with large and accumulating errors on the order of $10^2$ pixels, a magnitude that often makes optimization-based algorithms fail to converge. For ptychography instruments without sophisticated position-control equipment such as interferometers, our method offers significant practical potential.
{"title":"Predicting ptychography probe positions using single-shot phase retrieval neural network","authors":"Ming Du, Tao Zhou, Junjing Deng, Daniel J. Ching, Steven Henke, Mathew J. Cherukara","doi":"arxiv-2405.20910","DOIUrl":"https://doi.org/arxiv-2405.20910","url":null,"abstract":"Ptychography is a powerful imaging technique that is used in a variety of\u0000fields, including materials science, biology, and nanotechnology. However, the\u0000accuracy of the reconstructed ptychography image is highly dependent on the\u0000accuracy of the recorded probe positions which often contain errors. These\u0000errors are typically corrected jointly with phase retrieval through numerical\u0000optimization approaches. When the error accumulates along the scan path or when\u0000the error magnitude is large, these approaches may not converge with\u0000satisfactory result. We propose a fundamentally new approach for ptychography\u0000probe position prediction for data with large position errors, where a neural\u0000network is used to make single-shot phase retrieval on individual diffraction\u0000patterns, yielding the object image at each scan point. The pairwise offsets\u0000among these images are then found using a robust image registration method, and\u0000the results are combined to yield the complete scan path by constructing and\u0000solving a linear equation. We show that our method can achieve good position\u0000prediction accuracy for data with large and accumulating errors on the order of\u0000$10^2$ pixels, a magnitude that often makes optimization-based algorithms fail\u0000to converge. For ptychography instruments without sophisticated position\u0000control equipment such as interferometers, our method is of significant\u0000practical potential.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"60 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141253845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Parnassus: An Automated Approach to Accurate, Precise, and Fast Detector Simulation and Reconstruction
Etienne Dreyer, Eilam Gross, Dmitrii Kobylianskii, Vinicius Mikuni, Benjamin Nachman, Nathalie Soybelman
arXiv:2406.01620 (2024-05-31)
Detector simulation and reconstruction are a significant computational bottleneck in particle physics. We develop Particle-flow Neural Assisted Simulations (Parnassus) to address this challenge. Our deep learning model takes as input a point cloud (particles impinging on a detector) and produces a point cloud (reconstructed particles). By combining detector simulation and reconstruction into one step, we aim to minimize resource utilization and enable fast surrogate models suitable for application both inside and outside large collaborations. We demonstrate this approach using a publicly available dataset of jets passed through the full simulation and reconstruction pipeline of the CMS experiment. We show that Parnassus accurately mimics the CMS particle-flow algorithm on the (statistically) same events it was trained on and can generalize to jet momenta and types outside the training distribution.
Deep Bayesian Filter for Bayes-faithful Data Assimilation
Yuta Tarumi, Keisuke Fukuda, Shin-ichi Maeda
arXiv:2405.18674 (2024-05-29)
State estimation for nonlinear state space models is a challenging task. Existing assimilation methodologies predominantly assume Gaussian posteriors in physical space, where the true posterior inevitably becomes non-Gaussian. We propose Deep Bayesian Filtering (DBF) for data assimilation on nonlinear state space models (SSMs). DBF constructs new latent variables $h_t$ on a latent ("fancy") space and assimilates observations $o_t$. By (i) constraining the state transition on the fancy space to be linear and (ii) learning a Gaussian inverse observation operator $q(h_t|o_t)$, the posteriors always remain Gaussian for DBF. Distinctively, this structured design of the posteriors provides an analytic formula for their recursive computation, without accumulating Monte Carlo sampling errors over time steps. DBF learns the Gaussian inverse observation operators $q(h_t|o_t)$ and the other latent SSM parameters (e.g., the dynamics matrix) by maximizing the evidence lower bound. Experiments show that DBF outperforms model-based approaches and latent assimilation methods in various tasks and conditions.
Untangling Climate's Complexity: Methodological Insights
Alka Yadav, Sourish Das, Anirban Chakraborti
arXiv:2405.18391 (2024-05-28)
In this article, we review the interdisciplinary techniques (borrowed from physics, mathematics, statistics, machine learning, etc.) and the methodological framework that we have used to understand climate systems, which serve as examples of "complex systems". We believe this offers valuable insights for comprehending the complexity of climate variability and paves the way for drafting policies for action against climate change. Our basic aim is to analyse time-series data structures across diverse climate parameters and to extract Fourier-transformed features to recognize and model the trends and seasonalities in the climate variables, using standard methods such as detrended residual series analyses, correlation structures among climate parameters, Granger causal models, and other statistical and machine-learning techniques. We cite and briefly explain two case studies: (i) the relationship between the Standardised Precipitation Index (SPI) and specific climate variables, including Sea Surface Temperature (SST), the El Niño Southern Oscillation (ENSO), and the Indian Ocean Dipole (IOD), uncovering temporal shifts in the correlations between SPI and these variables and revealing complex patterns that drive drought and wet climate conditions in South-West Australia; (ii) the complex interactions of the North Atlantic Oscillation (NAO) index with SST and sea ice extent (SIE), potentially arising from positive feedback loops.
Automatic Forward Model Parameterization with Bayesian Inference of Conformational Populations
Robert M. Raddi, Tim Marshall, Vincent A. Voelz
arXiv:2405.18532 (2024-05-28)
To quantify how well theoretical predictions of structural ensembles agree with experimental measurements, we depend on the accuracy of forward models. These models are computational frameworks that generate observable quantities from molecular configurations, based on empirical relationships linking specific molecular properties to experimental measurements. Bayesian Inference of Conformational Populations (BICePs) is a reweighting algorithm that reconciles simulated ensembles with ensemble-averaged experimental observations, even when such observations are sparse and/or noisy. This is achieved by sampling the posterior distribution of conformational populations under experimental restraints, as well as the posterior distribution of uncertainties due to random and systematic error. In this study, we enhance the algorithm to refine empirical forward-model (FM) parameters. We introduce and evaluate two novel methods for optimizing FM parameters. The first method treats FM parameters as nuisance parameters and integrates over them in the full posterior distribution. The second method employs variational minimization of a quantity called the BICePs score, which reports the free energy of "turning on" the experimental restraints. This technique, coupled with improved likelihood functions for handling experimental outliers, facilitates force field validation and optimization, as illustrated in recent studies (Raddi et al. 2023, 2024). Using this approach, we refine the parameters that modulate the Karplus relation, which is crucial for accurate predictions of J-coupling constants from the dihedral angles between interacting nuclei. We validate this approach first on a toy model system and then on human ubiquitin, predicting six sets of Karplus parameters. This approach, which does not rely on predetermined parameters, enhances predictive accuracy and can be used for many applications.
Tracking Dynamical Transitions using Link Density of Recurrence Networks
Rinku Jacob, R. Misra, K P Harikrishnan, G Ambika
arXiv:2405.19357 (2024-05-24)
We present the Link Density (LD) computed from the Recurrence Network (RN) of time series data as an effective measure for detecting dynamical transitions in a system. We illustrate its use with time series from the standard Rössler system across the period-doubling transitions and the transition to chaos. Moreover, we find that the standard deviation of LD can be even more effective at highlighting the transition points. We also consider variations in the data when a parameter of the system varies due to internal or intrinsic perturbations, but on a time scale much slower than that of the dynamics. In this case as well, LD and its standard deviation correctly detect transition points in the underlying dynamics of the system. The computation of LD requires minimal computing resources and time, and works well with short data sets. Hence, we propose this measure as a tool to track transitions in dynamics from data, facilitating quicker and more effective analysis of large numbers of data sets.
Effectiveness of denoising diffusion probabilistic models for fast and high-fidelity whole-event simulation in high-energy heavy-ion experiments
Yeonju Go, Dmitrii Torbunov, Timothy Rinn, Yi Huang, Haiwang Yu, Brett Viren, Meifeng Lin, Yihui Ren, Jin Huang
arXiv:2406.01602 (2024-05-23)
Artificial intelligence (AI) generative models, such as generative adversarial networks (GANs), variational auto-encoders, and normalizing flows, have been widely used and studied as efficient alternatives to traditional scientific simulations. However, they have several drawbacks, including training instability and an inability to cover the entire data distribution, especially in regions where data are rare. This is particularly challenging for whole-event, full-detector simulations in high-energy heavy-ion experiments, such as sPHENIX at the Relativistic Heavy Ion Collider and the Large Hadron Collider experiments, where thousands of particles are produced per event and interact with the detector. This work investigates the effectiveness of Denoising Diffusion Probabilistic Models (DDPMs) as an AI-based generative surrogate model for the sPHENIX experiment, covering heavy-ion event generation and the response of the entire calorimeter stack. DDPM performance on sPHENIX simulation data is compared with that of a popular rival, GANs. Results show that both DDPMs and GANs can reproduce the data distribution where examples are abundant (low-to-medium calorimeter energies). Nonetheless, DDPMs significantly outperform GANs, especially in high-energy regions where data are rare. Additionally, DDPMs exhibit superior stability compared to GANs. The results are consistent for both central and peripheral heavy-ion collision events. Moreover, DDPMs offer a substantial speedup of approximately a factor of 100 compared to the traditional Geant4 simulation method.
{"title":"Effectiveness of denoising diffusion probabilistic models for fast and high-fidelity whole-event simulation in high-energy heavy-ion experiments","authors":"Yeonju Go, Dmitrii Torbunov, Timothy Rinn, Yi Huang, Haiwang Yu, Brett Viren, Meifeng Lin, Yihui Ren, Jin Huang","doi":"arxiv-2406.01602","DOIUrl":"https://doi.org/arxiv-2406.01602","url":null,"abstract":"Artificial intelligence (AI) generative models, such as generative\u0000adversarial networks (GANs), variational auto-encoders, and normalizing flows,\u0000have been widely used and studied as efficient alternatives for traditional\u0000scientific simulations. However, they have several drawbacks, including\u0000training instability and inability to cover the entire data distribution,\u0000especially for regions where data are rare. This is particularly challenging\u0000for whole-event, full-detector simulations in high-energy heavy-ion\u0000experiments, such as sPHENIX at the Relativistic Heavy Ion Collider and Large\u0000Hadron Collider experiments, where thousands of particles are produced per\u0000event and interact with the detector. This work investigates the effectiveness\u0000of Denoising Diffusion Probabilistic Models (DDPMs) as an AI-based generative\u0000surrogate model for the sPHENIX experiment that includes the heavy-ion event\u0000generation and response of the entire calorimeter stack. DDPM performance in\u0000sPHENIX simulation data is compared with a popular rival, GANs. Results show\u0000that both DDPMs and GANs can reproduce the data distribution where the examples\u0000are abundant (low-to-medium calorimeter energies). Nonetheless, DDPMs\u0000significantly outperform GANs, especially in high-energy regions where data are\u0000rare. Additionally, DDPMs exhibit superior stability compared to GANs. The\u0000results are consistent between both central and peripheral centrality heavy-ion\u0000collision events. Moreover, DDPMs offer a substantial speedup of approximately\u0000a factor of 100 compared to the traditional Geant4 simulation method.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"127 19-20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141253745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Window and inpainting: dealing with data gaps for TianQin
Lu Wang, Hong-Yu Chen, Xiangyu Lyu, En-Kun Li, Yi-Ming Hu
arXiv:2405.14274 (2024-05-23)
Space-borne gravitational wave detectors like TianQin might encounter data gaps due to factors like micro-meteoroid collisions or hardware failures. Such glitches cause discontinuities in the data and have been observed in LISA Pathfinder. The existence of such data gaps presents challenges for the data analysis of TianQin, especially for massive black hole binary mergers: since the signal-to-noise ratio (SNR) accumulates in a non-linear way, a gap near the merger could lead to a significant loss of SNR. It could also introduce bias into the estimate of the noise properties and, consequently, into the results of parameter estimation. In this work, using simulated TianQin data with an injected massive black hole binary merger, we study the window function method and, for the first time, the inpainting method to cope with the data gap, and we design an iterative estimation scheme to properly estimate the noise spectrum. We find that both methods can properly estimate the noise and signal parameters. The easy-to-implement window function method already performs well, except that it sacrifices some SNR due to the adoption of the window. The inpainting method is slower, but it minimizes the impact of the data gap.
{"title":"Window and inpainting: dealing with data gaps for TianQin","authors":"Lu Wang, Hong-Yu Chen, Xiangyu Lyu, En-Kun Li, Yi-Ming Hu","doi":"arxiv-2405.14274","DOIUrl":"https://doi.org/arxiv-2405.14274","url":null,"abstract":"Space-borne gravitational wave detectors like TianQin might encounter data\u0000gaps due to factors like micro-meteoroid collisions or hardware failures. Such\u0000glitches will cause discontinuity in the data and have been observed in the\u0000LISA Pathfinder. The existence of such data gaps presents challenges to the\u0000data analysis for TianQin, especially for massive black hole binary mergers,\u0000since its signal-to-noise ratio (SNR) accumulates in a non-linear way, a gap\u0000near the merger could lead to significant loss of SNR. It could introduce bias\u0000in the estimate of noise properties, and furthermore the results of the\u0000parameter estimation. In this work, using simulated TianQin data with injected\u0000a massive black hole binary merger, we study the window function method, and\u0000for the first time, the inpainting method to cope with the data gap, and an\u0000iterative estimate scheme is designed to properly estimate the noise spectrum.\u0000We find that both methods can properly estimate noise and signal parameters.\u0000The easy-to-implement window function method can already perform well, except\u0000that it will sacrifice some SNR due to the adoption of the window. The\u0000inpainting method is slower, but it can minimize the impact of the data gap.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141152901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}