
Latest publications: arXiv - PHYS - Data Analysis, Statistics and Probability

A meshless method to compute the proper orthogonal decomposition and its variants from scattered data
Pub Date: 2024-07-03 · arXiv: 2407.03173
Iacopo Tirelli, Miguel Alfonso Mendez, Andrea Ianiro, Stefano Discetti
Complex phenomena can be better understood when broken down into a limited number of simpler "components". Linear statistical methods such as the principal component analysis and its variants are widely used across various fields of applied science to identify and rank these components based on the variance they represent in the data. These methods can be seen as factorizations of the matrix collecting all the data, which are assumed to be a collection of time series sampled from fixed points in space. However, when data sampling locations vary over time, as with mobile monitoring stations in meteorology and oceanography or with particle tracking velocimetry in experimental fluid dynamics, advanced interpolation techniques are required to project the data onto a fixed grid before carrying out the factorization. This interpolation is often expensive and inaccurate. This work proposes a method to decompose scattered data without interpolating. The approach is based on physics-constrained radial basis function regression to compute inner products in space and time. The method provides an analytical and mesh-independent decomposition in space and time, demonstrating higher accuracy than the traditional approach. Our results show that it is possible to distill the most relevant "components" even for measurements whose natural output is a distribution of data scattered in space and time, maintaining high accuracy and mesh independence.
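The central trick is to replace grid-based inner products with analytic inner products between radial-basis-function regressions of the scattered snapshots. The following minimal sketch illustrates this for the snapshot POD of a 1D field; the Gaussian basis, fixed centers, and plain least-squares fit are simplifying assumptions for illustration, not the paper's physics-constrained formulation.

```python
import numpy as np

# Toy scattered data: each snapshot samples u(x, t_j) at *different* locations.
rng = np.random.default_rng(0)
n_snap, n_pts = 40, 120
times = np.linspace(0.0, 2.0 * np.pi, n_snap)
snapshots = []
for t in times:
    x = rng.uniform(0.0, 1.0, n_pts)                 # sampling points move in time
    u = np.sin(2*np.pi*x)*np.cos(t) + 0.5*np.sin(4*np.pi*x)*np.sin(2*t)
    snapshots.append((x, u))

# Regress every snapshot onto a shared Gaussian RBF basis (no common grid needed).
centers = np.linspace(0.0, 1.0, 25)
sigma = 0.06
def design(x):
    return np.exp(-(x[:, None] - centers[None, :])**2 / (2.0 * sigma**2))
W = np.stack([np.linalg.lstsq(design(x), u, rcond=None)[0] for x, u in snapshots])

# Analytic Gram matrix: integrals over x of products of two Gaussian RBFs.
d2 = (centers[:, None] - centers[None, :])**2
G = sigma * np.sqrt(np.pi) * np.exp(-d2 / (4.0 * sigma**2))

# Temporal correlation matrix from analytic inner products -> snapshot POD.
K = W @ G @ W.T
vals, vecs = np.linalg.eigh(K)
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

def spatial_mode(k, x_eval):
    """POD mode k, evaluable at any x (mesh-independent by construction)."""
    coef = (W.T @ vecs[:, k]) / np.sqrt(max(vals[k], 1e-30))
    return design(x_eval) @ coef

print("leading POD 'energies':", np.round(vals[:4], 3))
```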
Citations: 0
Resilience of the Atlantic Meridional Overturning Circulation
Pub Date: 2024-07-02 · arXiv: 2407.04740
Valérian Jacques-Dumas, Henk A. Dijkstra, Christian Kuehn
We address the issue of resilience of the Atlantic Meridional Overturning Circulation (AMOC) given the many indications that this dynamical system is in a multi-stable regime. A novel approach to resilience based on rare event techniques is presented which leads to a measure capturing 'resistance to change' and 'ability to return' aspects in a probabilistic way. The application of this measure to a conceptual model demonstrates its suitability for assessing AMOC resilience but also shows its potential use in many other non-autonomous dynamical systems. This framework is then extended to compute the probability that the AMOC undergoes a transition conditioned on an external forcing. Such conditional probability can be estimated by exploiting the information available when computing the resilience of this system. This allows us to provide a probabilistic view on safe operating spaces by defining a conditional safe operating space as a subset of the parameter space of the (possibly transient) imposed forcing.
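The conditional transition probability at the heart of this framework can be made concrete on a toy bistable system. The sketch below uses a double-well drift, a transient Gaussian forcing pulse, and a brute-force Monte Carlo estimator; all of these are stand-ins for the paper's conceptual AMOC model and its rare-event techniques, which exist precisely because direct sampling fails when transitions are rare.

```python
import numpy as np

# Toy multi-stable system: dx = (x - x^3 + f(t)) dt + sigma dW, wells at x = +/-1.
rng = np.random.default_rng(1)

def transition_prob(A, n_traj=2000, T=50.0, dt=0.01, sigma=0.25):
    """P(crossing to the other well before T | forcing pulse of amplitude A)."""
    x = -np.ones(n_traj)                              # all trajectories start at x = -1
    crossed = np.zeros(n_traj, dtype=bool)
    for k in range(int(T / dt)):
        t = k * dt
        f = A * np.exp(-((t - 10.0) / 5.0) ** 2)      # transient imposed forcing
        x += (x - x**3 + f) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_traj)
        crossed |= x > 0.0                            # record first barrier crossing
    return crossed.mean()

for A in (0.0, 0.2, 0.4):
    print(f"forcing amplitude {A:.1f}: P(transition) ~ {transition_prob(A):.3f}")
```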
Citations: 0
Anomaly-aware summary statistic from data batches
Pub Date: 2024-07-01 · arXiv: 2407.01249
Gaia Grosso
Signal-agnostic data exploration based on machine learning could unveil very subtle statistical deviations of collider data from the expected Standard Model of particle physics. The beneficial impact of a large training sample on machine learning solutions motivates the exploration of increasingly large and inclusive samples of acquired data with resource-efficient computational methods. In this work we consider the New Physics Learning Machine (NPLM), a multivariate goodness-of-fit test built on the Neyman-Pearson maximum-likelihood-ratio construction, and we address the problem of testing large-size samples under computational and storage resource constraints. We propose to perform parallel NPLM routines over batches of the data, and to combine them by locally aggregating over the data-to-reference density ratios learnt by each batch. The resulting data hypothesis defining the likelihood-ratio test is thus shared over the batches, and complies with the assumption that the expected rate of new physical processes is time invariant. We show that this method outperforms the simple sum of the independent tests run over the batches, and can recover, or even surpass, the sensitivity of the single test run over the full data. Besides the significant advantage for the offline application of NPLM to large-size samples, the proposed approach offers new prospects toward the use of NPLM to construct anomaly-aware summary statistics in quasi-online data streaming scenarios.
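A simplified stand-in for the batching idea can be built from the classifier-based likelihood-ratio trick: learn the data-to-reference density ratio independently on each batch, then aggregate the learned log-ratios into one shared test statistic. The sketch below uses a small scikit-learn classifier and Gaussian toy data; the real NPLM fits a Neyman-Pearson extended likelihood, so treat this only as an illustration of the batch-and-aggregate logic.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)

def batch_log_ratio(data, reference):
    """Learn log[n(x)/n_ref(x)] on one batch via the likelihood-ratio trick."""
    X = np.concatenate([data, reference])
    y = np.concatenate([np.ones(len(data)), np.zeros(len(reference))])
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(X, y)
    def log_ratio(x):
        s = np.clip(clf.predict_proba(x)[:, 1], 1e-6, 1 - 1e-6)
        return np.log(s / (1.0 - s)) + np.log(len(reference) / len(data))
    return log_ratio

n_batches, n_data, n_ref = 5, 500, 5000
ratios, all_data = [], []
for _ in range(n_batches):
    ref = rng.normal(0.0, 1.0, size=(n_ref, 1))       # reference (SM) sample
    dat = rng.normal(0.15, 1.0, size=(n_data, 1))     # data with a small distortion
    ratios.append(batch_log_ratio(dat, ref))
    all_data.append(dat)

# Share the alternative hypothesis across batches (time-invariant signal rate)
# by averaging the batch-learned log-ratios over the pooled data.
pooled = np.concatenate(all_data)
t_stat = sum(r(pooled).sum() for r in ratios) / n_batches
print("aggregated test statistic:", round(float(t_stat), 2))
```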
Citations: 0
Texture tomography with high angular resolution utilizing sparsity
Pub Date: 2024-07-01 · arXiv: 2407.11022
Mads Carlsen, Florencia Malamud, Peter Modregger, Anna Wildeis, Markus Hartmann, Robert Brandt, Andreas Menzel, Marianne Liebi
We demonstrate a novel approach to the reconstruction of scanning-probe x-ray diffraction tomography data with anisotropic polycrystalline samples. The method involves reconstructing a voxel map containing an orientation distribution function in each voxel of an extended 3D sample. This method differs from existing approaches by not relying on a peak-finding step and is therefore applicable to sample systems consisting of small and highly mosaic crystalline domains that are not handled well by existing methods. Samples of interest include bio-minerals and a range of small-grained microstructures common in engineering metals. By choosing a particular kind of basis functions, we can effectively utilize non-negativity in orientation space for samples with sparse texture. This enables us to achieve stable solutions at high angular resolutions where the problem would otherwise be underdetermined. We demonstrate the new approach using data from a shot-peened martensite sample, where we are able to map the twinning microstructure in the interior of a bulk sample without resolving the individual lattice domains. We also demonstrate the approach on a piece of gastropod shell with a mosaic microstructure.
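The role of the non-negativity constraint is easy to demonstrate in isolation: with a sparse texture, requiring coefficients to be non-negative can pick out the correct orientation components even when the linear system is underdetermined. The sketch below uses a random forward matrix as a stand-in for the real scattering geometry and basis functions.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(3)

n_meas, n_coef = 80, 120                 # fewer measurements than unknowns
A = rng.uniform(0.0, 1.0, size=(n_meas, n_coef))   # stand-in forward model
c_true = np.zeros(n_coef)                          # sparse 'texture': few components
c_true[rng.choice(n_coef, size=6, replace=False)] = rng.uniform(1.0, 2.0, 6)
y = A @ c_true + 0.01 * rng.standard_normal(n_meas)

c_ls, *_ = np.linalg.lstsq(A, y, rcond=None)       # unconstrained: non-unique, signed
c_nn, _ = nnls(A, y)                               # non-negative least squares

print("negative entries in plain LS :", int((c_ls < -1e-6).sum()))
print("support recovered by NNLS    :", np.sort(np.nonzero(c_nn > 0.1)[0]))
print("true support                 :", np.sort(np.nonzero(c_true)[0]))
```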
Citations: 0
Extreme-value Statistics: Rudiments and applications
Pub Date: 2024-06-30 · arXiv: 2407.00725
Evangelos Matsinos
This study provides a summary of the theory which enables the analysis of extreme values, i.e., of measurements acquired from the observation of extraordinary/rare physical phenomena. The formalism is developed in a transparent way, tailored to the application to real-world problems. Three examples of the application of the theory are detailed: first, an old problem, relating to the return period of floods caused by the Rhône river, is revisited; second, the frequency of occurrence and the severity of meteorite impacts on the Earth are examined; the data, which are analysed in the third example, relate to the longevity of supercentenarians.
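For the flood-return-period type of problem, the standard workflow is: form block (e.g. annual) maxima, fit a generalized extreme value (GEV) distribution, and read the N-year return level off the (1 - 1/N) quantile. A self-contained sketch on synthetic data (the real analyses in the paper use, of course, the historical records):

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(4)

# 60 synthetic "years" of daily discharge-like values; keep only annual maxima.
daily = rng.gumbel(loc=100.0, scale=20.0, size=(60, 365))
annual_max = daily.max(axis=1)

# Fit the GEV (scipy's 'c' is the shape parameter) and compute return levels.
shape, loc, scale = genextreme.fit(annual_max)
for period in (10, 50, 100):
    level = genextreme.ppf(1.0 - 1.0 / period, shape, loc=loc, scale=scale)
    print(f"{period:>3}-year return level: {level:.1f}")
```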
Citations: 0
ViPErLEED package I: Calculation of $I(V)$ curves and structural optimization
Pub Date: 2024-06-27 · arXiv: 2406.18821
Florian Kraushofer, Alexander M. Imre, Giada Franceschi, Tilman Kißlinger, Erik Rheinfrank, Michael Schmid, Ulrike Diebold, Lutz Hammer, Michele Riva
Low-energy electron diffraction (LEED) is a widely used technique in surface science. Yet, it is rarely used to its full potential. The quantitative information about the surface structure, contained in the modulation of the intensities of the diffracted beams as a function of incident electron energy, LEED I(V), is underutilized. To acquire these data, minor adjustments would be required in most experimental setups, but existing analysis software is cumbersome to use. ViPErLEED (Vienna package for Erlangen LEED) lowers these barriers, introducing a combined solution for data acquisition, extraction, and computational analysis. These parts are discussed in three separate publications. Here, the focus is on the computational part of ViPErLEED, which performs automated LEED-I(V) calculations and structural optimization. Minimal user input is required, and the functionality is significantly enhanced compared to existing solutions. Computation is performed by embedding the Erlangen tensor-LEED package (TensErLEED). ViPErLEED manages parallelization, monitors convergence, and processes input and output. This makes LEED I(V) more accessible to new users while minimizing the potential for errors and the manual labor. Added functionality includes structure-dependent defaults, automatic detection of bulk and surface symmetries and their relationship, automated symmetry-preserving search procedures, adjustments to the TensErLEED code to handle larger systems, as well as parallelization and optimization. Modern file formats are used as input and output, and there is a direct interface to the Atomic Simulation Environment (ASE) package. The software is implemented primarily in Python (version >=3.7) and provided as an open-source package (GNU GPLv3 or later). A structure determination of the $\alpha$-Fe2O3(1-102)-(1x1) surface is presented as an example for the application of the software.
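Structural optimization in LEED I(V) amounts to minimizing a reliability factor between measured and calculated curves over the structural parameters. As a flavor of the key ingredient, here is a bare-bones Pendry R-factor for a single beam on a common energy grid; a full implementation such as ViPErLEED's adds smoothing, beam averaging, energy-range weighting, and the search machinery on top.

```python
import numpy as np

def pendry_r(energy, i_exp, i_calc, v0i=5.0):
    """Pendry R-factor between two I(V) curves; v0i is the imaginary inner potential (eV)."""
    def y_function(i):
        di = np.gradient(i, energy)
        L = di / np.clip(i, 1e-12, None)          # logarithmic derivative I'/I
        return L / (1.0 + (v0i * L) ** 2)
    y_e, y_c = y_function(i_exp), y_function(i_calc)
    return np.sum((y_e - y_c) ** 2) / np.sum(y_e ** 2 + y_c ** 2)

# Two synthetic curves whose peaks are shifted by 4 eV against each other:
E = np.linspace(50.0, 300.0, 500)
i1 = 1.0 + np.sin(E / 12.0) ** 2 * np.exp(-E / 200.0)
i2 = 1.0 + np.sin((E - 4.0) / 12.0) ** 2 * np.exp(-E / 200.0)
print("R_P =", round(pendry_r(E, i1, i2), 3))
```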
Citations: 0
Refinable modeling for unbinned SMEFT analyses
Pub Date: 2024-06-27 · arXiv: 2406.19076
Robert Schöfbeck
We present techniques for estimating the effects of systematic uncertainties in unbinned data analyses at the LHC. Our primary focus is constraining the Wilson coefficients in the standard model effective field theory (SMEFT), but the methodology applies to broader parametric models of phenomena beyond the standard model (BSM). We elevate the well-established procedures for binned Poisson counting experiments to the unbinned case by utilizing machine-learned surrogates of the likelihood ratio. This approach can be applied to various theoretical, modeling, and experimental uncertainties. By establishing a common statistical framework for BSM and systematic effects, we lay the groundwork for future unbinned analyses at the LHC. Additionally, we introduce a novel tree-boosting algorithm capable of learning highly accurate parameterizations of systematic effects. This algorithm extends the existing toolkit with a versatile and robust alternative. We demonstrate our approach using the example of an SMEFT interpretation of highly energetic top quark pair production in proton-proton collisions.
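One concrete ingredient behind unbinned SMEFT fits is that, at fixed order, per-event weights are exactly quadratic in a Wilson coefficient, w(c) = w0 + c·w1 + c²·w2, so three reference evaluations fix the full dependence. The sketch below solves for these morphing weights and minimizes a shape-only unbinned likelihood; all numbers are synthetic, and the paper's machine-learned surrogates generalize this to many coefficients plus systematic uncertainties.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)
n_events = 5000
lin = rng.normal(0.10, 0.05, n_events)            # per-event linear SMEFT response
quad = rng.uniform(0.0, 0.05, n_events)           # per-event quadratic response

c_ref = np.array([-1.0, 0.0, 1.0])                # reference Wilson-coefficient points
W_ref = np.stack([1.0 + c * lin + c * c * quad for c in c_ref])   # generator weights

V = np.stack([np.ones(3), c_ref, c_ref**2], axis=1)   # Vandermonde matrix in c
w0, w1, w2 = np.linalg.solve(V, W_ref)                # morphing weights, per event

def nll(c):
    w = np.clip(w0 + c * w1 + c * c * w2, 1e-12, None)
    return -np.log(w).sum() + n_events * np.log(w.mean())   # shape-only unbinned NLL

res = minimize_scalar(nll, bounds=(-2.0, 2.0), method="bounded")
print("fitted Wilson coefficient:", round(res.x, 3))   # ~0, the generation point
```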
Citations: 0
Modeling Performance of Data Collection Systems for High-Energy Physics
Pub Date: 2024-06-27 · arXiv: 2407.00123
Wilkie Olin-Ammentorp, Xingfu Wu, Andrew A. Chien
Exponential increases in scientific experimental data are outstripping the rate of progress in silicon technology. As a result, heterogeneous combinations of architectures and process or device technologies are increasingly important to meet the computing demands of future scientific experiments. However, the complexity of heterogeneous computing systems requires systematic modeling to understand performance. We present a model which addresses this need by framing key aspects of data collection pipelines and constraints, combining them with the important vectors of technology that shape alternatives, and computing metrics that allow complex alternatives to be compared. For instance, a data collection pipeline may be characterized by parameters such as sensor sampling rates, amount of data collected, and the overall relevancy of retrieved samples. Alternatives to this pipeline are enabled by hardware development vectors including advancing CMOS, GPUs, neuromorphic computing, and edge computing. By calculating metrics for each alternative such as overall F1 score, power, hardware cost, and energy expended per relevant sample, this model allows alternate data collection systems to be rigorously compared. To demonstrate this model's capability, we apply it to the CMS experiment (and planned HL-LHC upgrade) to evaluate and compare the application of novel technologies in the data acquisition system (DAQ). We demonstrate that improvements to early stages in the DAQ are highly beneficial, greatly reducing the resources required at later stages of processing (such as a 60% power reduction) and increasing the amount of relevant data retrieved from the experiment per unit power (improving from 0.065 to 0.31 samples/kJ). However, we predict further advances will be required in order to meet overall power and cost constraints for the DAQ.
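The comparison logic reduces to folding per-stage acceptance, relevant-event recall, and power into a common figure of merit such as relevant samples per kilojoule. A minimal sketch with invented stage numbers (not the CMS/HL-LHC values from the paper):

```python
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    accept: float            # fraction of all incoming events kept
    relevant_recall: float   # fraction of *relevant* incoming events kept
    power_kw: float          # power drawn by this stage (kW = kJ/s)

def pipeline_metrics(stages, in_rate_hz, relevant_fraction):
    rate, rel_rate, power = in_rate_hz, in_rate_hz * relevant_fraction, 0.0
    for s in stages:
        rate *= s.accept
        rel_rate *= s.relevant_recall
        power += s.power_kw
    return rate, rel_rate, power, rel_rate / power   # last: relevant samples per kJ

baseline = [Stage("L1 trigger", 2.5e-3, 0.60, 300.0), Stage("HLT", 1e-2, 0.80, 450.0)]
upgraded = [Stage("smart L1", 2.5e-3, 0.85, 340.0), Stage("HLT", 1e-2, 0.90, 280.0)]

for label, stages in (("baseline", baseline), ("upgraded", upgraded)):
    out = pipeline_metrics(stages, in_rate_hz=40e6, relevant_fraction=1e-6)
    print("%s: out %.0f Hz, relevant %.1f Hz, %.0f kW, %.4f samples/kJ" % ((label,) + out))
```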
Citations: 0
ViPErLEED package II: Spot tracking, extraction and processing of I(V) curves
Pub Date: 2024-06-26 · arXiv: 2406.18413
Michael Schmid, Florian Kraushofer, Alexander M. Imre, Tilman Kißlinger, Lutz Hammer, Ulrike Diebold, Michele Riva
As part of the ViPErLEED project (Vienna package for Erlangen LEED, low-energy electron diffraction), computer programs have been developed for facile and user-friendly data extraction from movies of LEED images. The programs make use of some concepts from astronomical image processing and analysis. As a first step, flat-field and dark-frame corrections reduce the effects of inhomogeneities of the camera and screen. In a second step, for identifying all diffraction maxima ("spots"), it is sufficient to manually mark and label a single spot or very few spots. Then the program can automatically identify all other spots and determine the distortions of the image. This forms the basis for automatic spot tracking (following the "beams" as they move across the LEED screen) and intensity measurement. Even for complex structures with hundreds to a few thousand diffraction beams, this step takes less than a minute. The package also includes a program for further processing of these I(V) curves (averaging of equivalent beams, manual and/or automatic selection, smoothing) as well as several utilities. The software is implemented as a set of plugins for the public-domain image processing program ImageJ and provided as an open-source package.
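The flat-field and dark-frame step works exactly as in astronomical imaging: subtract the dark frame, then divide by the dark-corrected (normalized) flat field. A small synthetic demonstration; array shapes and numbers are placeholders for real LEED movie frames.

```python
import numpy as np

rng = np.random.default_rng(6)
shape = (256, 256)
dark = 5.0 + rng.normal(0.0, 0.2, shape)              # dark frame: readout/offset only
gain = 1.0 + 0.3 * np.linspace(0.0, 1.0, shape[1])    # horizontal sensitivity gradient
flat_raw = gain * 1000.0 + dark                       # frame of a uniform scene
frames = gain * rng.poisson(200.0, (10, *shape)) + dark   # raw LEED "movie"

flat = flat_raw - dark
flat /= flat.mean()                                   # unit-mean gain map
corrected = (frames - dark) / flat                    # inhomogeneity removed

left, right = frames.mean(axis=(0, 1))[0], frames.mean(axis=(0, 1))[-1]
print("raw right/left intensity ratio      : %.2f" % (right / left))
left, right = corrected.mean(axis=(0, 1))[0], corrected.mean(axis=(0, 1))[-1]
print("corrected right/left intensity ratio: %.2f" % (right / left))
```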
Citations: 0
A Bayesian Framework to Investigate Radiation Reaction in Strong Fields
Pub Date: 2024-06-26 · arXiv: 2406.19420
E. E. Los (Imperial College London), C. Arran (University of York), E. Gerstmayr (Imperial College London, Queen's University Belfast, SLAC National Accelerator Laboratory), M. J. V. Streeter (Queen's University Belfast), Z. Najmudin (Imperial College London), C. P. Ridgers (University of York), G. Sarri (Queen's University Belfast), S. P. D. Mangles (Imperial College London)
Recent experiments aiming to measure phenomena predicted by strong-field quantum electrodynamics have done so by colliding relativistic electron beams and high-power lasers. In such experiments, measurements of the collision parameters are not always feasible; however, precise knowledge of these parameters is required for accurate tests of strong-field quantum electrodynamics. Here, we present a novel Bayesian inference procedure which infers collision parameters that could not be measured on-shot. This procedure is applicable to all-optical non-linear Compton scattering experiments investigating radiation reaction. The framework allows multiple diagnostics to be combined self-consistently and facilitates the inclusion of prior or known information pertaining to the collision parameters. Using this Bayesian analysis, the relative validity of the classical, quantum-continuous and quantum-stochastic models of radiation reaction was compared for a series of test cases, which demonstrate the accuracy and model selection capability of the framework and highlight its robustness in the event that the experimental values of fixed parameters differ from their values in the models.
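In skeleton form, the procedure is standard Bayesian inference on a grid: each diagnostic contributes a likelihood through its forward model, the likelihoods are multiplied together with a prior, and the unmeasured collision parameter is read off the posterior. Both forward models and all numbers below are invented placeholders for the paper's detailed physics.

```python
import numpy as np

theta = np.linspace(2.0, 20.0, 400)        # grid over the unmeasured parameter
dth = theta[1] - theta[0]

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Two diagnostics with hypothetical forward models, combined self-consistently:
f1 = 1200.0 - 18.0 * theta                 # e.g. mean electron energy after collision (MeV)
f2 = 0.4 * theta ** 1.5                    # e.g. photon critical energy (MeV)
likelihood = gauss(1020.0, f1, 40.0) * gauss(12.6, f2, 3.0)   # measured: 1020, 12.6

prior = gauss(theta, 12.0, 4.0)            # prior information (e.g. laser metrology)
post = likelihood * prior
post /= post.sum() * dth                   # normalize on the grid

mean = (theta * post).sum() * dth
std = np.sqrt(((theta - mean) ** 2 * post).sum() * dth)
print(f"inferred parameter: {mean:.2f} +/- {std:.2f}")   # both diagnostics point to ~10
```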
Citations: 0