Iacopo Tirelli, Miguel Alfonso Mendez, Andrea Ianiro, Stefano Discetti
Complex phenomena can be better understood when broken down into a limited number of simpler "components". Linear statistical methods such as principal component analysis and its variants are widely used across various fields of applied science to identify and rank these components based on the variance they represent in the data. These methods can be seen as factorizations of the matrix collecting all the data, which are assumed to be a collection of time series sampled at fixed points in space. However, when data sampling locations vary over time, as with mobile monitoring stations in meteorology and oceanography or with particle tracking velocimetry in experimental fluid dynamics, advanced interpolation techniques are required to project the data onto a fixed grid before carrying out the factorization. This interpolation is often expensive and inaccurate. This work proposes a method to decompose scattered data without interpolating. The approach is based on physics-constrained radial basis function regression to compute inner products in space and time. The method provides an analytical and mesh-independent decomposition in space and time, demonstrating higher accuracy than the traditional approach. Our results show that it is possible to distill the most relevant "components" even for measurements whose natural output is a distribution of data scattered in space and time, while maintaining high accuracy and mesh independence.
{"title":"A meshless method to compute the proper orthogonal decomposition and its variants from scattered data","authors":"Iacopo Tirelli, Miguel Alfonso Mendez, Andrea Ianiro, Stefano Discetti","doi":"arxiv-2407.03173","DOIUrl":"https://doi.org/arxiv-2407.03173","url":null,"abstract":"Complex phenomena can be better understood when broken down into a limited\u0000number of simpler \"components\". Linear statistical methods such as the\u0000principal component analysis and its variants are widely used across various\u0000fields of applied science to identify and rank these components based on the\u0000variance they represent in the data. These methods can be seen as\u0000factorizations of the matrix collecting all the data, which are assumed to be a\u0000collection of time series sampled from fixed points in space. However, when\u0000data sampling locations vary over time, as with mobile monitoring stations in\u0000meteorology and oceanography or with particle tracking velocimetry in\u0000experimental fluid dynamics, advanced interpolation techniques are required to\u0000project the data onto a fixed grid before carrying out the factorization. This\u0000interpolation is often expensive and inaccurate. This work proposes a method to\u0000decompose scattered data without interpolating. The approach is based on\u0000physics-constrained radial basis function regression to compute inner products\u0000in space and time. The method provides an analytical and mesh-independent\u0000decomposition in space and time, demonstrating higher accuracy than the\u0000traditional approach. Our results show that it is possible to distill the most\u0000relevant \"components\" even for measurements whose natural output is a\u0000distribution of data scattered in space and time, maintaining high accuracy and\u0000mesh independence.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"77 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141548088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Valérian Jacques-Dumas, Henk A. Dijkstra, Christian Kuehn
We address the issue of resilience of the Atlantic Meridional Overturning Circulation (AMOC), given the many indications that this dynamical system is in a multi-stable regime. A novel approach to resilience based on rare-event techniques is presented, which leads to a measure capturing 'resistance to change' and 'ability to return' aspects in a probabilistic way. The application of this measure to a conceptual model demonstrates its suitability for assessing AMOC resilience, but also shows its potential use in many other non-autonomous dynamical systems. This framework is then extended to compute the probability that the AMOC undergoes a transition conditioned on an external forcing. Such a conditional probability can be estimated by exploiting the information available when computing the resilience of this system. This allows us to provide a probabilistic view on safe operating spaces by defining a conditional safe operating space as a subset of the parameter space of the (possibly transient) imposed forcing.
{"title":"Resilience of the Atlantic Meridional Overturning Circulation","authors":"Valérian Jacques-Dumas, Henk A. Dijkstra, Christian Kuehn","doi":"arxiv-2407.04740","DOIUrl":"https://doi.org/arxiv-2407.04740","url":null,"abstract":"We address the issue of resilience of the Atlantic Meridional Overturning\u0000Circulation (AMOC) given the many indications that this dynamical system is in\u0000a multi-stable regime. A novel approach to resilience based on rare event\u0000techniques is presented which leads to a measure capturing `resistance to\u0000change` and `ability to return' aspects in a probabilistic way. The application\u0000of this measure to a conceptual model demonstrates its suitability for\u0000assessing AMOC resilience but also shows its potential use in many other\u0000non-autonomous dynamical systems. This framework is then extended to compute\u0000the probability that the AMOC undergoes a transition conditioned on an external\u0000forcing. Such conditional probability can be estimated by exploiting the\u0000information available when computing the resilience of this system. This allows\u0000us to provide a probabilistic view on safe operating spaces by defining a\u0000conditional safe operating space as a subset of the parameter space of the\u0000(possibly transient) imposed forcing.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"77 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141575444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Signal-agnostic data exploration based on machine learning could unveil very subtle statistical deviations of collider data from the expected Standard Model of particle physics. The beneficial impact of a large training sample on machine learning solutions motivates the exploration of increasingly large and inclusive samples of acquired data with resource-efficient computational methods. In this work we consider the New Physics Learning Machine (NPLM), a multivariate goodness-of-fit test built on the Neyman-Pearson maximum-likelihood-ratio construction, and we address the problem of testing large samples under computational and storage resource constraints. We propose to perform parallel NPLM routines over batches of the data, and to combine them by locally aggregating over the data-to-reference density ratios learnt by each batch. The resulting data hypothesis defining the likelihood-ratio test is thus shared over the batches, and complies with the assumption that the expected rate of new physical processes is time invariant. We show that this method outperforms the simple sum of the independent tests run over the batches, and can recover, or even surpass, the sensitivity of a single test run over the full data. Besides the significant advantage for the offline application of NPLM to large samples, the proposed approach offers new prospects toward the use of NPLM to construct anomaly-aware summary statistics in quasi-online data streaming scenarios.
{"title":"Anomaly-aware summary statistic from data batches","authors":"Gaia Grosso","doi":"arxiv-2407.01249","DOIUrl":"https://doi.org/arxiv-2407.01249","url":null,"abstract":"Signal-agnostic data exploration based on machine learning could unveil very\u0000subtle statistical deviations of collider data from the expected Standard Model\u0000of particle physics. The beneficial impact of a large training sample on\u0000machine learning solutions motivates the exploration of increasingly large and\u0000inclusive samples of acquired data with resource efficient computational\u0000methods. In this work we consider the New Physics Learning Machine (NPLM), a\u0000multivariate goodness-of-fit test built on the Neyman-Pearson\u0000maximum-likelihood-ratio construction, and we address the problem of testing\u0000large size samples under computational and storage resource constraints. We\u0000propose to perform parallel NPLM routines over batches of the data, and to\u0000combine them by locally aggregating over the data-to-reference density ratios\u0000learnt by each batch. The resulting data hypothesis defining the\u0000likelihood-ratio test is thus shared over the batches, and complies with the\u0000assumption that the expected rate of new physical processes is time invariant.\u0000We show that this method outperforms the simple sum of the independent tests\u0000run over the batches, and can recover, or even surpass, the sensitivity of the\u0000single test run over the full data. Beside the significant advantage for the\u0000offline application of NPLM to large size samples, the proposed approach offers\u0000new prospects toward the use of NPLM to construct anomaly-aware summary\u0000statistics in quasi-online data streaming scenarios.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"51 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141500939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mads Carlsen, Florencia Malamud, Peter Modregger, Anna Wildeis, Markus Hartmann, Robert Brandt, Andreas Menzel, Marianne Liebi
We demonstrate a novel approach to the reconstruction of scanning-probe X-ray diffraction tomography data from anisotropic polycrystalline samples. The method involves reconstructing a voxel map containing an orientation distribution function in each voxel of an extended 3D sample. This method differs from existing approaches by not relying on a peak-finding step and is therefore applicable to sample systems consisting of small and highly mosaic crystalline domains that are not handled well by existing methods. Samples of interest include biominerals and a range of small-grained microstructures common in engineering metals. By choosing a particular kind of basis function, we can effectively exploit non-negativity in orientation space for samples with sparse texture. This enables us to achieve stable solutions at high angular resolutions where the problem would otherwise be underdetermined. We demonstrate the new approach using data from a shot-peened martensite sample, where we are able to map the twinning microstructure in the interior of a bulk sample without resolving the individual lattice domains. We also demonstrate the approach on a piece of gastropod shell with a mosaic microstructure.
{"title":"Texture tomography with high angular resolution utilizing sparsity","authors":"Mads Carlsen, Florencia Malamud, Peter Modregger, Anna Wildeis, Markus Hartmann, Robert Brandt, Andreas Menzel, Marianne Liebi","doi":"arxiv-2407.11022","DOIUrl":"https://doi.org/arxiv-2407.11022","url":null,"abstract":"We demonstrate a novel approach to the reconstruction of scanning probe x-ray\u0000diffraction tomography data with anisotropic poly crystalline samples. The\u0000method involves reconstructing a voxel map containing an orientation\u0000distribution function in each voxel of an extended 3D sample. This method\u0000differs from existing approaches by not relying on a peak-finding and is\u0000therefore applicable to sample systems consisting of small and highly mosaic\u0000crystalline domains that are not handled well by existing methods. Samples of\u0000interest include bio-minerals and a range of small-graines microstructures\u0000common in engineering metals. By choosing a particular kind of basis functions,\u0000we can effectively utilize non-negativity in orientation-space for samples with\u0000sparse texture. This enables us to achieve stable solutions at high angular\u0000resolutions where the problem would otherwise be under determined. We\u0000demonstrate the new approach using data from a shot peened martensite sample\u0000where we are able to map the twinning micro structure in the interior of a bulk\u0000sample without resolving the individual lattice domains. We also demonstrate\u0000the approach on a piece of gastropods shell with a mosaic micro structure.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141718674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study provides a summary of the theory which enables the analysis of extreme values, i.e., of measurements acquired from the observation of extraordinary/rare physical phenomena. The formalism is developed in a transparent way, tailored to the application to real-world problems. Three examples of the application of the theory are detailed: first, an old problem, relating to the return period of floods caused by the Rhône river, is revisited; second, the frequency of occurrence and the severity of meteorite impacts on the Earth are examined; the data, which are analysed in the third example, relate to the longevity of supercentenarians.
{"title":"Extreme-value Statistics: Rudiments and applications","authors":"Evangelos Matsinos","doi":"arxiv-2407.00725","DOIUrl":"https://doi.org/arxiv-2407.00725","url":null,"abstract":"This study provides a summary of the theory which enables the analysis of\u0000extreme values, i.e., of measurements acquired from the observation of\u0000extraordinary/rare physical phenomena. The formalism is developed in a\u0000transparent way, tailored to the application to real-world problems. Three\u0000examples of the application of the theory are detailed: first, an old problem,\u0000relating to the return period of floods caused by the Rh{^o}ne river, is\u0000revisited; second, the frequency of occurrence and the severity of meteorite\u0000impacts on the Earth are examined; the data, which are analysed in the third\u0000example, relate to the longevity of supercentenarians.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"11 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141500937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Florian Kraushofer, Alexander M. Imre, Giada Franceschi, Tilman Kißlinger, Erik Rheinfrank, Michael Schmid, Ulrike Diebold, Lutz Hammer, Michele Riva
Low-energy electron diffraction (LEED) is a widely used technique in surface science. Yet, it is rarely used to its full potential. The quantitative information about the surface structure, contained in the modulation of the intensities of the diffracted beams as a function of incident electron energy, LEED I(V), is underutilized. To acquire these data, minor adjustments would be required in most experimental setups, but existing analysis software is cumbersome to use. ViPErLEED (Vienna package for Erlangen LEED) lowers these barriers, introducing a combined solution for data acquisition, extraction, and computational analysis. These parts are discussed in three separate publications. Here, the focus is on the computational part of ViPErLEED, which performs automated LEED-I(V) calculations and structural optimization. Minimal user input is required, and the functionality is significantly enhanced compared to existing solutions. Computation is performed by embedding the Erlangen tensor-LEED package (TensErLEED). ViPErLEED manages parallelization, monitors convergence, and processes input and output. This makes LEED I(V) more accessible to new users while minimizing the potential for errors and the manual labor involved. Added functionality includes structure-dependent defaults, automatic detection of bulk and surface symmetries and their relationship, automated symmetry-preserving search procedures, adjustments to the TensErLEED code to handle larger systems, as well as parallelization and optimization. Modern file formats are used as input and output, and there is a direct interface to the Atomic Simulation Environment (ASE) package. The software is implemented primarily in Python (version >=3.7) and provided as an open-source package (GNU GPLv3 or later). A structure determination of the $\alpha$-Fe2O3(1-102)-(1x1) surface is presented as an example of the application of the software.
{"title":"ViPErLEED package I: Calculation of $I(V)$ curves and structural optimization","authors":"Florian Kraushofer, Alexander M. Imre, Giada Franceschi, Tilman Kißlinger, Erik Rheinfrank, Michael Schmid, Ulrike Diebold, Lutz Hammer, Michele Riva","doi":"arxiv-2406.18821","DOIUrl":"https://doi.org/arxiv-2406.18821","url":null,"abstract":"Low-energy electron diffraction (LEED) is a widely used technique in\u0000surface-science. Yet, it is rarely used to its full potential. The quantitative\u0000information about the surface structure, contained in the modulation of the\u0000intensities of the diffracted beams as a function of incident electron energy,\u0000LEED I(V), is underutilized. To acquire these data, minor adjustments would be\u0000required in most experimental setups, but existing analysis software is\u0000cumbersome to use. ViPErLEED (Vienna package for Erlangen LEED) lowers these\u0000barriers, introducing a combined solution for data acquisition, extraction, and\u0000computational analysis. These parts are discussed in three separate\u0000publications. Here, the focus is on the computational part of ViPErLEED, which\u0000performs automated LEED-I(V) calculations and structural optimization. Minimal\u0000user input is required, and the functionality is significantly enhanced\u0000compared to existing solutions. Computation is performed by embedding the\u0000Erlangen tensor-LEED package (TensErLEED). ViPErLEED manages parallelization,\u0000monitors convergence, and processes input and output. This makes LEED I(V) more\u0000accessible to new users while minimizing the potential for errors and the\u0000manual labor. Added functionality includes structure-dependent defaults,\u0000automatic detection of bulk and surface symmetries and their relationship,\u0000automated symmetry-preserving search procedures, adjustments to the TensErLEED\u0000code to handle larger systems, as well as parallelization and optimization.\u0000Modern file formats are used as input and output, and there is a direct\u0000interface to the Atomic Simulation Environment (ASE) package. The software is\u0000implemented primarily in Python (version >=3.7) and provided as an open-source\u0000package (GNU GPLv3 or later). A structure determination of the\u0000$alpha$-Fe2O3(1-102)-(1x1) surface is presented as an example for the\u0000application of the software.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"34 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141500970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present techniques for estimating the effects of systematic uncertainties in unbinned data analyses at the LHC. Our primary focus is constraining the Wilson coefficients in the standard model effective field theory (SMEFT), but the methodology applies to broader parametric models of phenomena beyond the standard model (BSM). We elevate the well-established procedures for binned Poisson counting experiments to the unbinned case by utilizing machine-learned surrogates of the likelihood ratio. This approach can be applied to various theoretical, modeling, and experimental uncertainties. By establishing a common statistical framework for BSM and systematic effects, we lay the groundwork for future unbinned analyses at the LHC. Additionally, we introduce a novel tree-boosting algorithm capable of learning highly accurate parameterizations of systematic effects. This algorithm extends the existing toolkit with a versatile and robust alternative. We demonstrate our approach using the example of an SMEFT interpretation of highly energetic top quark pair production in proton-proton collisions.
{"title":"Refinable modeling for unbinned SMEFT analyses","authors":"Robert Schöfbeck","doi":"arxiv-2406.19076","DOIUrl":"https://doi.org/arxiv-2406.19076","url":null,"abstract":"We present techniques for estimating the effects of systematic uncertainties\u0000in unbinned data analyses at the LHC. Our primary focus is constraining the\u0000Wilson coefficients in the standard model effective field theory (SMEFT), but\u0000the methodology applies to broader parametric models of phenomena beyond the\u0000standard model (BSM). We elevate the well-established procedures for binned\u0000Poisson counting experiments to the unbinned case by utilizing machine-learned\u0000surrogates of the likelihood ratio. This approach can be applied to various\u0000theoretical, modeling, and experimental uncertainties. By establishing a common\u0000statistical framework for BSM and systematic effects, we lay the groundwork for\u0000future unbinned analyses at the LHC. Additionally, we introduce a novel\u0000tree-boosting algorithm capable of learning highly accurate parameterizations\u0000of systematic effects. This algorithm extends the existing toolkit with a\u0000versatile and robust alternative. We demonstrate our approach using the example\u0000of an SMEFT interpretation of highly energetic top quark pair production in\u0000proton-proton collisions.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141500968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exponential increases in scientific experimental data are outstripping the rate of progress in silicon technology. As a result, heterogeneous combinations of architectures and process or device technologies are increasingly important to meet the computing demands of future scientific experiments. However, the complexity of heterogeneous computing systems requires systematic modeling to understand performance. We present a model which addresses this need by framing key aspects of data collection pipelines and their constraints, and combining them with the important technology vectors that shape alternatives and with computing metrics that allow complex alternatives to be compared. For instance, a data collection pipeline may be characterized by parameters such as sensor sampling rates, the amount of data collected, and the overall relevancy of retrieved samples. Alternatives to this pipeline are enabled by hardware development vectors including advancing CMOS, GPUs, neuromorphic computing, and edge computing. By calculating metrics for each alternative, such as overall F1 score, power, hardware cost, and energy expended per relevant sample, this model allows alternative data collection systems to be rigorously compared. To demonstrate this model's capability, we apply it to the CMS experiment (and its planned HL-LHC upgrade) to evaluate and compare the application of novel technologies in the data acquisition system (DAQ). We demonstrate that improvements to early stages in the DAQ are highly beneficial, greatly reducing the resources required at later stages of processing (such as a 60% power reduction) and increasing the amount of relevant data retrieved from the experiment per unit power (improving from 0.065 to 0.31 samples/kJ). However, we predict that further advances will be required in order to meet overall power and cost constraints for the DAQ.
{"title":"Modeling Performance of Data Collection Systems for High-Energy Physics","authors":"Wilkie Olin-Ammentorp, Xingfu Wu, Andrew A. Chien","doi":"arxiv-2407.00123","DOIUrl":"https://doi.org/arxiv-2407.00123","url":null,"abstract":"Exponential increases in scientific experimental data are outstripping the\u0000rate of progress in silicon technology. As a result, heterogeneous combinations\u0000of architectures and process or device technologies are increasingly important\u0000to meet the computing demands of future scientific experiments. However, the\u0000complexity of heterogeneous computing systems requires systematic modeling to\u0000understand performance. We present a model which addresses this need by framing key aspects of data\u0000collection pipelines and constraints, and combines them with the important\u0000vectors of technology that shape alternatives, computing metrics that allow\u0000complex alternatives to be compared. For instance, a data collection pipeline\u0000may be characterized by parameters such as sensor sampling rates, amount of\u0000data collected, and the overall relevancy of retrieved samples. Alternatives to\u0000this pipeline are enabled by hardware development vectors including advancing\u0000CMOS, GPUs, neuromorphic computing, and edge computing. By calculating metrics\u0000for each alternative such as overall F1 score, power, hardware cost, and energy\u0000expended per relevant sample, this model allows alternate data collection\u0000systems to be rigorously compared. To demonstrate this model's capability, we apply it to the CMS experiment\u0000(and planned HL-LHC upgrade) to evaluate and compare the application of novel\u0000technologies in the data acquisition system (DAQ). We demonstrate that\u0000improvements to early stages in the DAQ are highly beneficial, greatly reducing\u0000the resources required at later stages of processing (such as a 60% power\u0000reduction) and increasing the amount of relevant data retrieved from the\u0000experiment per unit power (improving from 0.065 to 0.31 samples/kJ) However, we\u0000predict further advances will be required in order to meet overall power and\u0000cost constraints for the DAQ.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141500969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michael Schmid, Florian Kraushofer, Alexander M. Imre, Tilman Kißlinger, Lutz Hammer, Ulrike Diebold, Michele Riva
As part of the ViPErLEED project (Vienna package for Erlangen LEED, low-energy electron diffraction), computer programs have been developed for facile and user-friendly data extraction from movies of LEED images. The programs make use of some concepts from astronomical image processing and analysis. As a first step, flat-field and dark-frame corrections reduce the effects of inhomogeneities of the camera and screen. In a second step, for identifying all diffraction maxima ("spots"), it is sufficient to manually mark and label a single spot or very few spots. Then the program can automatically identify all other spots and determine the distortions of the image. This forms the basis for automatic spot tracking (following the "beams" as they move across the LEED screen) and intensity measurement. Even for complex structures with hundreds to a few thousand diffraction beams, this step takes less than a minute. The package also includes a program for further processing of these I(V) curves (averaging of equivalent beams, manual and/or automatic selection, smoothing) as well as several utilities. The software is implemented as a set of plugins for the public-domain image processing program ImageJ and provided as an open-source package.
{"title":"ViPErLEED package II: Spot tracking, extraction and processing of I(V) curves","authors":"Michael Schmid, Florian Kraushofer, Alexander M. Imre, Tilman Kißlinger, Lutz Hammer, Ulrike Diebold, Michele Riva","doi":"arxiv-2406.18413","DOIUrl":"https://doi.org/arxiv-2406.18413","url":null,"abstract":"As part of the ViPErLEED project (Vienna package for Erlangen LEED,\u0000low-energy electron diffraction), computer programs have been developed for\u0000facile and user-friendly data extraction from movies of LEED images. The\u0000programs make use of some concepts from astronomical image processing and\u0000analysis. As a first step, flat-field and dark-frame corrections reduce the\u0000effects of inhomogeneities of the camera and screen. In a second step, for\u0000identifying all diffraction maxima (\"spots\"), it is sufficient to manually mark\u0000and label a single spot or very few spots. Then the program can automatically\u0000identify all other spots and determine the distortions of the image. This forms\u0000the basis for automatic spot tracking (following the \"beams\" as they move\u0000across the LEED screen) and intensity measurement. Even for complex structures\u0000with hundreds to a few thousand diffraction beams, this step takes less than a\u0000minute. The package also includes a program for further processing of these\u0000I(V) curves (averaging of equivalent beams, manual and/or automatic selection,\u0000smoothing) as well as several utilities. The software is implemented as a set\u0000of plugins for the public-domain image processing program ImageJ and provided\u0000as an open-source package.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"29 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141500972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
E. E. Los (Imperial College London), C. Arran (University of York), E. Gerstmayr (Imperial College London, Queen's University Belfast, SLAC National Accelerator Laboratory), M. J. V. Streeter (Queen's University Belfast), Z. Najmudin (Imperial College London), C. P. Ridgers (University of York), G. Sarri (Queen's University Belfast), S. P. D. Mangles (Imperial College London)
Recent experiments aiming to measure phenomena predicted by strong-field quantum electrodynamics have done so by colliding relativistic electron beams and high-power lasers. In such experiments, measurements of the collision parameters are not always feasible; however, precise knowledge of these parameters is required for accurate tests of strong-field quantum electrodynamics. Here, we present a novel Bayesian inference procedure which infers collision parameters that could not be measured on-shot. This procedure is applicable to all-optical non-linear Compton scattering experiments investigating radiation reaction. The framework allows multiple diagnostics to be combined self-consistently and facilitates the inclusion of prior or known information pertaining to the collision parameters. Using this Bayesian analysis, the relative validity of the classical, quantum-continuous and quantum-stochastic models of radiation reaction was compared for a series of test cases, which demonstrate the accuracy and model selection capability of the framework and highlight its robustness in the event that the experimental values of fixed parameters differ from their values in the models.
{"title":"A Bayesian Framework to Investigate Radiation Reaction in Strong Fields","authors":"E. E. LosImperial College London, C. ArranUniversity of York, E. GerstmayrImperial College LondonQueens University BelfastSLAC National Accelerator Laboratory, M. J. V. StreeterQueens University Belfast, Z. NajmudinImperial College London, C. P. RidgersUniversity of York, G. SarriQueens University Belfast, S. P. D ManglesImperial College London","doi":"arxiv-2406.19420","DOIUrl":"https://doi.org/arxiv-2406.19420","url":null,"abstract":"Recent experiments aiming to measure phenomena predicted by strong field\u0000quantum electrodynamics have done so by colliding relativistic electron beams\u0000and high-power lasers. In such experiments, measurements of the collision\u0000parameters are not always feasible, however, precise knowledge of these\u0000parameters is required for accurate tests of strong-field quantum\u0000electrodynamics. Here, we present a novel Bayesian inference procedure which\u0000infers collision parameters that could not be measured on-shot. This procedure\u0000is applicable to all-optical non-linear Compton scattering experiments\u0000investigating radiation reaction. The framework allows multiple diagnostics to\u0000be combined self-consistently and facilitates the inclusion of prior or known\u0000information pertaining to the collision parameters. Using this Bayesian\u0000analysis, the relative validity of the classical, quantum-continuous and\u0000quantum-stochastic models of radiation reaction were compared for a series of\u0000test cases, which demonstrate the accuracy and model selection capability of\u0000the framework and and highlight its robustness in the event that the\u0000experimental values of fixed parameters differ from their values in the models.","PeriodicalId":501065,"journal":{"name":"arXiv - PHYS - Data Analysis, Statistics and Probability","volume":"65 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141500940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}